How to remove carriage return and linefeeds from XML files

 
Ranch Hand
Posts: 127
Hello,

What would be an elegant way to remove carriage return / linefeeds from an XML file?
I have a byte array (or String) containing an XML file that, when printed to output, spans multiple lines because every node is followed by CR/LF characters.

I would rather not use String.replaceAll(..) because the data inside the XML might deliberately contain CR/LF characters.
So I am looking for a way to 'intelligently' remove only the CR/LF chars between the nodes.

I thought of using SAX parsing to read the elements from top to bottom and 'rebuild' the XML content that way.
But there must be a simpler way to do this?

Cheers!
Kjeld
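
For anyone reading along, here is a minimal sketch of the "read the events and rebuild the XML" idea. It uses StAX (javax.xml.stream) rather than SAX, only because copying events from a reader to a writer is shorter that way. The class name StaxWhitespaceFilter, the method name and the sample XML are made up for illustration; nothing here comes from the original poster.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper: copies XML events and skips whitespace-only text between nodes.
final class StaxWhitespaceFilter {

    static String stripWhitespaceBetweenNodes(String xml) throws XMLStreamException {
        XMLInputFactory inputFactory = XMLInputFactory.newInstance();
        // Coalesce adjacent text so a run of real data is not split into
        // several events, one of which might happen to be whitespace only.
        inputFactory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);

        XMLEventReader reader = inputFactory.createXMLEventReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            boolean blankText = event.isCharacters() && event.asCharacters().isWhiteSpace();
            boolean ignorableSpace = event.getEventType() == XMLStreamConstants.SPACE;
            if (blankText || ignorableSpace) {
                continue; // drop the line breaks and indentation between nodes
            }
            writer.add(event); // everything else is copied unchanged
        }
        writer.close();
        reader.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<r>\r\n  <a>line1\nline2</a>\r\n  <b>x</b>\r\n</r>";
        System.out.println(stripWhitespaceBetweenNodes(xml));
    }
}
```

Line breaks inside a value (like the one in the a element) survive, because that text is not whitespace only. A value that consists of nothing but whitespace would be dropped as well, so this sketch assumes such values do not matter.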
 
Marshal
Posts: 28177
Personally I would use XSLT for this: an identity transformation, extended with a rule that ignores text nodes consisting entirely of whitespace. Perhaps just an <xsl:strip-space> element would do it.
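
A minimal sketch of that suggestion, assuming the stylesheet is kept in a string and run through the JDK's built-in javax.xml.transform API. The class name XsltStripSpace, the method toSingleLine and the sample XML are hypothetical names chosen for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper: identity transform plus xsl:strip-space.
final class XsltStripSpace {

    // Illustrative stylesheet: copy every node and attribute as-is,
    // but strip whitespace-only text nodes from all elements.
    static final String STYLESHEET =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:strip-space elements=\"*\"/>"
        + "<xsl:output method=\"xml\" indent=\"no\" omit-xml-declaration=\"yes\"/>"
        + "<xsl:template match=\"@*|node()\">"
        + "<xsl:copy><xsl:apply-templates select=\"@*|node()\"/></xsl:copy>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    static String toSingleLine(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String xml = "<props>\r\n  <entry key=\"a\">line1\nline2</entry>\r\n"
                   + "  <entry key=\"b\">x</entry>\r\n</props>\r\n";
        System.out.println(toSingleLine(xml));
    }
}
```

If this runs more than once, it is probably worth compiling the stylesheet a single time with TransformerFactory.newTemplates(..) and creating a Transformer from the resulting Templates object per call, since compiling the stylesheet tends to be the expensive step.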
 
Kjeld Sigtermans
Ranch Hand
Posts: 127
OK, I had not thought of that, and I think I know how to do it, but I need it to really perform (in a non-time-consuming manner).
Isn't an XSLT transformation in Java known to be relatively 'slow'?
 
Bartender
Posts: 10336
It can be, but usually only when the XML gets quite big.

Any reason you need to do this? Such white space is meaningless in XML after all.
 
Kjeld Sigtermans
Ranch Hand
Posts: 127
Well, I am trying to put a java.util.Properties object into a SQL Server XML column. Properties.storeToXML(..) generates a nice XML representation of the properties object and I got that working.
There's no need to do the conversion for that task, but in some cases I do want to output that XML representation to a log line. I thought it would be nice to just put it on one line.
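
To make the starting point concrete, here is a small sketch of how such a multi-line XML string is typically produced with Properties.storeToXML(..). The class name PropertiesXmlDemo, the helper toXml and the property keys are invented for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Properties;

// Hypothetical demo: serialize a Properties object to its XML form.
final class PropertiesXmlDemo {

    static String toXml(Properties props, String comment) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // storeToXML writes one <entry key="..."> element per property,
        // each on its own line, preceded by an XML declaration and a DOCTYPE.
        props.storeToXML(buffer, comment, "UTF-8");
        return buffer.toString("UTF-8");
    }

    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.setProperty("db.host", "localhost");
        props.setProperty("db.port", "1433");
        System.out.println(toXml(props, "connection settings"));
    }
}
```

Note that the generated document carries a DOCTYPE pointing at properties.dtd, so a parser that reads it back may try to resolve that DTD unless it is configured not to.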
 
Paul Clapham
Marshal
Posts: 28177

Kjeld Sigtermans wrote: Isn't an XSLT transformation in Java known to be relatively 'slow'?



Let me just reference my brother Knuth's comment about "premature optimization" for the N-th time. If you really haven't heard it before then a Google search will find it for you.

Anyway you asked for "elegant" as your primary requirement.
 
Kjeld Sigtermans
Ranch Hand
Posts: 127
Alright.

This is my code, without any 'premature optimization' (had to look it up, and I don't agree, at least not in this context).
But I went with the XSLT solution.

The XSL file xmlFormatter.xsl:

Obviously, at this point the code is eligible for optimization.
I think making the choice of whether or not to go for XSLT is an architectural decision and not a premature form of optimization. Indeed, once we have chosen XSLT we should probably first get it to work and then we can optimize all we want.
I assumed high performance to be almost always an obvious requirement. Maybe I should have been clearer and said: elegant as well as fast. But then again, I still don't think the XSLT solution above is elegant... and requirements change all the time.

Thanks,
Kjeld
 
Paul Clapham
Marshal
Posts: 28177

Kjeld Sigtermans wrote: I assumed high performance to be almost always an obvious requirement. Maybe I should have been clearer and said: elegant as well as fast. But then again, I still don't think the XSLT solution above is elegant... and requirements change all the time.



High performance is not always a requirement. Sometimes you need a quick and dirty program to do something once. If it takes 10 minutes instead of 10 seconds you don't really care. But you dismissed XSLT as "not fast" just based on some rumours or vague opinion. That statement qualifies as "premature optimization". I'm willing to bet (or at least consider the possibility) that the XSLT solution is going to be similar in performance to whatever you put together in a DOM.

Personally I think that writing an extension of an identity template is far more elegant than writing some DOM code to implement the rules. But I'm a mathematician so I use the mathematician's definition of "elegant". There is almost no DOM code which I would consider "elegant".
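
For comparison, a rough sketch of what the DOM route could look like: parse, walk the tree, remove whitespace-only text nodes, then serialize. The class name DomWhitespaceStripper and its methods are hypothetical, and this is just one way to write it, not code from the thread.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: strips whitespace-only text nodes from a DOM tree.
final class DomWhitespaceStripper {

    // Recursively remove text nodes that contain nothing but whitespace.
    static void removeBlankTextNodes(Node parent) {
        Node child = parent.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before a possible removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getTextContent().trim().isEmpty()) {
                parent.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                removeBlankTextNodes(child);
            }
            child = next;
        }
    }

    static String toSingleLine(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        removeBlankTextNodes(doc.getDocumentElement());

        // A default Transformer (no stylesheet) is used only to serialize the DOM.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        serializer.setOutputProperty(OutputKeys.INDENT, "no");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toSingleLine("<r>\r\n  <a>line1\nline2</a>\r\n  <b>x</b>\r\n</r>"));
    }
}
```

The whitespace rule that the stylesheet states in a single xsl:strip-space declaration becomes an explicit tree walk here, which is roughly the trade-off being described.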

And yes, you do have to choose whether you want to include XSLT in the set of languages which you want to have in your environment. If you want to reject it because it's yet another language to learn, then you could certainly do that (and call it an architectural decision). I find it to be a useful tool myself.
 