Avoiding XML escaped values (e.g. >)

 
Greenhorn
Posts: 5
Hi there,

I'm using the following snippet to write a DOM document to a file. The DOM document contains some text elements, which in turn contain some formulas (e.g. a => b). The underlying XML parser very kindly escapes some values (e.g. turns '>' into '&gt;'). Is there a way (other than using CDATA) to avoid this? The StreamResult class provides PI_DISABLE_OUTPUT_ESCAPING, but I haven't figured it out yet.

Your advice/wisdom will be greatly appreciated.

Cheers,

h.coder


 
author
Posts: 30
You haven't said if you expect your output to be in XML or text format. Also, you don't show a stylesheet for the transformer to use, so it's hard to be sure what you have in mind.

If you want to get an XML output, then the special characters MUST be escaped, as it appears they are.

If you want to get text-only output, you want to use

<xsl:output method='text'/>

in your XSLT stylesheet (which isn't specified in your example). With the text output method, those characters will not be escaped.
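The output method can also be set without a stylesheet, through the output properties of an identity Transformer. The following is a minimal sketch, not the original poster's code: the class name OutputMethodDemo, the element name "formula" and the helper methods are made up for illustration. Note that the text method drops all markup and keeps only the text nodes, so it only helps when the result is not meant to be XML.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class OutputMethodDemo {

    // Builds a tiny DOM with one element holding a formula.
    // The element name "formula" is hypothetical.
    static Document buildDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element formula = doc.createElement("formula");
        formula.setTextContent("a => b");
        doc.appendChild(formula);
        return doc;
    }

    // Serializes the document with the given output method ("xml" or "text")
    // using an identity transformer (no stylesheet).
    static String serialize(Document doc, String method) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.METHOD, method);
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Document doc = buildDocument();
        // With the xml method the ">" is normally written as an entity.
        System.out.println(serialize(doc, "xml"));
        // With the text method only the text nodes are written, unescaped.
        System.out.println(serialize(doc, "text"));
    }
}
```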
 
Harold Coder
Greenhorn
Posts: 5
Hi there,

Thank you for your replies. However, I think I should explain the scenario better.

1- My XML file contains logic formulas encoding a domain specification for a planning application. In other words, you could look at the content as some sort of program. E.g.



2- There is no available DTD or schema or XSL for the XML file. However, I must comply with its ad hoc structure, as it is part of the "syntax" of the planning application.

3- I need to dynamically modify the XML file to customize the input to the planning application. That is, I need to read the file, modify the content of some elements (add/remove formulas) and write it back out.

So far so good. The problem is that at WRITING TIME the parser escapes all the ">" characters, breaking all the logical implication formulas (i.e. the ones that contain "=>").

This is the code snippet I am currently using to write my DOM document to file. Thoughts anyone? Thank you in advance,

hc

 
Tom Passin
author
Posts: 30
I'm still not clear on a few key points -

1) Is your output going to be pure text, or is it going to be xml with the logic formulae contained in the xml elements?

2) What application or code will consume your result in the next stage of processing?

The answer to 1) is crucial, for this reason - you MUST escape the "<" characters in character content (though you can surround a formula in a CDATA section instead of escaping the characters individually). On the other hand, if you are creating pure text, no escapes are needed.
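For the CDATA route, the standard output property OutputKeys.CDATA_SECTION_ELEMENTS asks the transformer to wrap the text of the named elements in CDATA sections. A minimal sketch, assuming the formulas live in an element called "formula" (a made-up name, as are the class and helper methods):

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class CdataDemo {

    // Builds a one-element document; "formula" is a hypothetical element name.
    static Document buildDocument() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .newDocument();
        Element formula = doc.createElement("formula");
        formula.setTextContent("a => b");
        doc.appendChild(formula);
        return doc;
    }

    // Serializes as XML, asking for the text of <formula> elements
    // to be emitted inside CDATA sections instead of being escaped.
    static String serialize(Document doc) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        t.setOutputProperty(OutputKeys.CDATA_SECTION_ELEMENTS, "formula");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize(buildDocument()));
    }
}
```

The property takes a whitespace-separated list of element names, so several formula-bearing elements can be listed at once.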

If you are creating xml output, the processor in the next stage should unescape all characters as needed, but if you are extracting the content using, say, regular expressions, then of course that wouldn't happen.
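To illustrate the unescaping on the reading side, here is a small sketch that parses an escaped formula and reads it back through the DOM. The class name, method name and the "formula" element are assumptions for the example only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class UnescapeOnReadDemo {

    // Parses an XML string and returns the text content of the root element.
    static String readRootText(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // The file on disk contains the entity, not the raw character.
        String escaped = "<formula>a =&gt; b</formula>";
        // The parser resolves the entity, so this prints: a => b
        System.out.println(readRootText(escaped));
    }
}
```

Any consumer that reads the file with a real XML parser sees "a => b" again; only code that treats the file as raw text sees the entity.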

If you want to create plain text output, just set the method attribute of xsl:output to "text" in your xslt stylesheet, and the results won't be escaped.

As for the "PI_DISABLE_OUTPUT_ESCAPING" constant, I have never used it but from the javadocs, it looks like the processor just inserts a processing instruction, which might be understood by some downstream code but not by most. That's probably a red herring in your case.

You said "the problem is that at WRITING TIME the parser escapes all the ">" characters". Actually, the parser does not create the output. The transformer is doing the output. The parser is the code that reads the xml document and analyzes it to find its structure.

[BTW, Harold, you sent me a private message but your profile does not allow you to receive private messages so I could not reply except here]
 