Special Characters

Rajkumar Katudia
Ranch Hand

Joined: Jul 28, 2009
Posts: 51
I am trying to extract data from an HTML file and write it to an XML file.

Everything works, except that special characters are causing a problem.

< and > get converted to &lt; and &gt;

However, I expect the output to contain < and > rather than &lt; and &gt;

Please help me resolve this.

THE CODE:
Rajkumar Katudia
Ranch Hand

Joined: Jul 28, 2009
Posts: 51
The issue is with the tag where I try to create CDATA.

The < and > get converted to &lt; and &gt;

Please suggest how I can convert them back to < and >.
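
For reference, a minimal sketch of writing markup into a CDATA section so that it is not escaped. The original code is not shown in this thread, so this assumes the standard JDK DOM and Transformer APIs rather than whatever library the poster used; the class name CdataSketch and the helper toXml are hypothetical.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class CdataSketch {

    // Hypothetical helper: puts the HTML inside <body> as a CDATA section
    // (assumes the JDK's built-in DOM implementation, not the poster's library).
    public static String toXml(String html) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element body = doc.createElement("body");
        doc.appendChild(body);

        // A CDATA section node is serialized verbatim, so < and > are not escaped.
        body.appendChild(doc.createCDATASection(html));

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml("<br>Some text<p>more text"));
    }
}
```

Note that a CDATA section cannot contain the sequence ]]> itself, so content that may include it needs separate handling.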
Rob Spoor
Sheriff

Joined: Oct 27, 2005
Posts: 19723
    

String.replace?


SCJP 1.4 - SCJP 6 - SCWCD 5 - OCEEJBD 6
Rajkumar Katudia
Ranch Hand

Joined: Jul 28, 2009
Posts: 51
Rob Prime wrote: String.replace?


You mean String.replace("&gt;", ">")?
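
A minimal sketch of what the String.replace suggestion could look like for the five predefined XML entities. The class name UnescapeSketch and the method unescape are hypothetical, and this only covers named entities, not numeric references such as &#60;.

```java
public class UnescapeSketch {

    // Hypothetical helper: turns the predefined XML entities back into characters.
    public static String unescape(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&apos;", "'")
                // &amp; goes last, so "&amp;lt;" becomes "&lt;" and is not unescaped twice.
                .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        System.out.println(unescape("&lt;br&gt;")); // prints <br>
    }
}
```

String.replace returns a new string and leaves the original untouched, so the result has to be assigned or passed on.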
Rajkumar Katudia
Ranch Hand

Joined: Jul 28, 2009
Posts: 51
I guess my question was not quite clear. I'll try to rephrase it.

Here's what I am trying to do:
I am trying to write an XML document with the code from my first post.
I am creating Elements and then setting the text for each element.
The text for one of the elements contains tags; basically, it is HTML content.

So the tag with the content should look like this:
<body>
<br>
<br>
Some text sample text dummy text. Trial Text. ahkjsj Text
<p>
dsfsdf jkkjs kifsdfko kmlsmdkfmiusdfyugsd y deser cybvubu ij njnknmlkm
<table>
<tr>
<td>
Column 1
</td>
<td>
Column 2
</td>
<td>
Column 3
</td>
</tr>
</table>
</body>


However, the code outputs the content as:

<body>
&lt;br&gt;
&lt;br&gt;
Some text sample text dummy text. Trial Text. ahkjsj Text
&lt;p&gt;
dsfsdf jkkjs kifsdfko kmlsmdkfmiusdfyugsd y deser cybvubu ij njnknmlkm
&lt;table&gt;
&lt;tr&gt;
&lt;td&gt;
Column 1
&lt;/td&gt;
&lt;td&gt;
Column 2
&lt;/td&gt;
&lt;td&gt;
Column 3
&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;
</body>

After I create the XML, I parse it again to extract the data and put it into an HTML file. Since the < and > characters have been replaced by &lt; and &gt;, the parser fails.

So I need a way to instruct the method at Line 117 (message 1) not to replace < and > with &lt; and &gt;.

How do I do this?

Please help.

Thanks in advance.

Raju katudia.
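
For reference, a minimal sketch of the write-then-parse round trip described above, assuming the standard JDK DOM API (the poster's actual library and code are not shown). It illustrates that text escaped on output is normally unescaped again by an XML parser when read back, so the escaped form in the file need not be a problem. The class name RoundTripSketch and the helpers write and read are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class RoundTripSketch {

    // Hypothetical helper: stores the HTML as plain text inside <body>.
    // The serializer escapes < and & so the XML stays well-formed.
    public static String write(String html) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element body = doc.createElement("body");
        body.setTextContent(html);
        doc.appendChild(body);

        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // Hypothetical helper: parses the XML and returns the text of the root element.
    // The parser converts &lt; and &gt; back to < and > while reading.
    public static String read(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String html = "<br>Some text<p>more text";
        String xml = write(html);
        System.out.println(xml);
        System.out.println(read(xml).equals(html));
    }
}
```

One thing to keep in mind: the sample HTML above has unclosed <br> and <p> tags, so it could not be embedded as real child elements without first being made well-formed XML.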
Rajkumar Katudia
Ranch Hand

Joined: Jul 28, 2009
Posts: 51
Could someone please help me with this?
 