Removing tab character(^I) and null character Unicode: 0x0 from configuration XML file.

Himanhsu Yadav
Ranch Hand

Joined: Sep 26, 2007
Posts: 33
I am struggling to remove these junk characters from my XML.
What is the best way to do it? I have already tried Java programs and some Unicode editors.

Please help.

Thanks
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19761

If they really are tab and the NULL character, in Java these are '\t' and '\0'.


SCJP 1.4 - SCJP 6 - SCWCD 5 - OCEEJBD 6
Himanhsu Yadav
Ranch Hand

Joined: Sep 26, 2007
Posts: 33
I am getting a SAXParser error for Unicode: 0x0. I think this is the null character, right?
Campbell Ritchie
Sheriff

Joined: Oct 13, 2005
Posts: 39834
Too difficult for a "beginning" question. Moving thread.
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12823
This sounds like a job for some sort of custom extension of java.io.FilterInputStream that would sit between your source and your XML parser while deleting illegal characters.

What is the source of this XML? Do you have any idea why it mixes these illegal characters in?

Bill
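A minimal sketch of the FilterInputStream idea, assuming the file is in a single-byte-compatible encoding such as UTF-8 or ISO-8859-1 (in UTF-16, zero bytes are a normal part of the data, so byte-level filtering would corrupt the file). The class name NulStrippingInputStream is made up for illustration, and only the read methods are adjusted; skip() and available() still see the unfiltered stream.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical helper: drops every 0x00 byte before the parser sees it.
class NulStrippingInputStream extends FilterInputStream {

    NulStrippingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b;
        do {
            b = super.read();      // -1 at end of stream
        } while (b == 0);          // skip NUL bytes
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (true) {
            int n = super.read(buf, off, len);
            if (n <= 0) {
                return n;          // end of stream
            }
            // Compact the chunk in place, keeping only non-NUL bytes.
            int kept = off;
            for (int i = off; i < off + n; i++) {
                if (buf[i] != 0) {
                    buf[kept++] = buf[i];
                }
            }
            if (kept > off) {
                return kept - off;
            }
            // The whole chunk was NUL bytes: read again rather than return 0.
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] dirty = {'<', 'a', 0, '>', 'x', 0, '<', '/', 'a', '>'};
        try (InputStream in = new NulStrippingInputStream(new ByteArrayInputStream(dirty))) {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().parse(in);
            System.out.println(doc.getDocumentElement().getTextContent()); // prints "x"
        }
    }
}
```

Tabs are left alone here because a tab is legal whitespace in XML 1.0; only the NUL character is rejected by the parser.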
Himanhsu Yadav
Ranch Hand

Joined: Sep 26, 2007
Posts: 33
I am not sure how these characters are getting into my configuration XML. One common pattern I noticed is that the problem appears when I edit the XML in WebLogic Workshop. So what is the solution? Read the XML in a Java class, remove all of these characters, and write it back to the file?
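The read, clean, and write-back approach asked about here could look roughly like the sketch below. It assumes the file fits in memory and that its encoding is known (UTF-8 is only a guess); XmlFileCleaner and its method names are invented for this example. It keeps exactly the characters that the XML 1.0 specification allows, so NUL is dropped while tab, line feed, and carriage return stay.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical one-off cleanup utility.
class XmlFileCleaner {

    // XML 1.0 "Char" production: #x9 | #xA | #xD | [#x20-#xD7FF]
    // | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isLegalXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns the text with every illegal code point removed.
    static String clean(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        text.codePoints()
                .filter(XmlFileCleaner::isLegalXmlChar)
                .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    // Reads source with the given charset and writes the cleaned text to target.
    static void cleanFile(Path source, Path target, Charset charset) throws IOException {
        String text = new String(Files.readAllBytes(source), charset);
        Files.write(target, clean(text).getBytes(charset));
    }

    public static void main(String[] args) throws IOException {
        // args[0] = dirty file, args[1] = cleaned copy; UTF-8 is an assumption.
        cleanFile(Paths.get(args[0]), Paths.get(args[1]), StandardCharsets.UTF_8);
    }
}
```

Writing to a separate target file rather than overwriting the original leaves a copy to compare against if the cleanup removes more than expected.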
Jimmy Clark
Ranch Hand

Joined: Apr 16, 2008
Posts: 2187
Sounds like you should stop using Workshop to edit the files.
Himanhsu Yadav
Ranch Hand

Joined: Sep 26, 2007
Posts: 33
I have stopped using it, but what about the existing files?
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 14347

Are you reading the XML files using a different character encoding setting than what the actual character encoding of the files is?

Make sure that if a file is, for example, encoded in UTF-8, you're reading it as a UTF-8 file. If you use the wrong character encoding to read the file, you could get strange errors like the ones you describe.
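One way to rule out an encoding mismatch is to decode the file yourself with an explicit charset and hand the parser a Reader. This is only a sketch for a standalone test, not how WebLogic loads the file at deployment; the class name ExplicitEncodingParse is made up. When a SAX InputSource carries a character stream, the parser reads it directly and disregards the encoding named in the XML declaration.

```java
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper for checking a file against a suspected encoding.
class ExplicitEncodingParse {

    static Document parse(String file, Charset charset) throws Exception {
        // Files.newBufferedReader throws on malformed input for the charset,
        // which is itself a useful hint that the encoding guess is wrong.
        try (Reader reader = Files.newBufferedReader(Paths.get(file), charset)) {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(reader));
        }
    }

    public static void main(String[] args) throws Exception {
        // args[0] = XML file, args[1] = charset name, for example UTF-8 or UTF-16
        Document doc = parse(args[0], Charset.forName(args[1]));
        System.out.println("Root element: " + doc.getDocumentElement().getTagName());
    }
}
```

If the file parses cleanly as UTF-16 but fails as UTF-8 with a 0x0 error, the "null characters" are most likely just the zero bytes of a UTF-16 file.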


Himanhsu Yadav
Ranch Hand

Joined: Sep 26, 2007
Posts: 33
The XML is read by the JAXB SAX parser while deploying the EAR in WebLogic Server. I am using the vi editor to remove ^I, but I am not able to identify the null characters.
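Since the null characters are hard to spot in an editor, a small diagnostic that lists the byte offset of every 0x00 in the file may help. This is a rough sketch; NulFinder is a made-up name and the file path comes from the command line.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic: reports where the NUL bytes are.
class NulFinder {

    // Returns the zero-based byte offset of each 0x00 byte in the stream.
    static List<Long> nulOffsets(InputStream in) throws IOException {
        List<Long> offsets = new ArrayList<>();
        long pos = 0;
        int b;
        while ((b = in.read()) != -1) {
            if (b == 0) {
                offsets.add(pos);
            }
            pos++;
        }
        return offsets;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = new BufferedInputStream(
                Files.newInputStream(Paths.get(args[0])))) {
            System.out.println("NUL bytes at offsets: " + nulOffsets(in));
        }
    }
}
```

A NUL at every second offset suggests the file is really UTF-16; a few scattered offsets point to stray characters that can simply be deleted.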
Himanhsu Yadav
Ranch Hand

Joined: Sep 26, 2007
Posts: 33
I am still waiting for a resolution. Please help.
W Fay
Greenhorn

Joined: Feb 17, 2010
Posts: 3
Maybe you should run your existing files through some kind of filter program like the one suggested here so the bad characters are removed...
 