JavaRanch » Java Forums » Engineering » XML and Related Technologies
Utility that uses Terse XML tags

bob connolly
Ranch Hand

Joined: Mar 10, 2004
Posts: 204
Hello,

Has anyone run across a utility that converts XML tags to compact, cryptic codes that take up less space?

I'm hoping to find a utility that will read a BILLION-record XML file and convert the tags to special byte-sized characters to reduce the file size!

This utility would do something like convert <intex_traunche_cusip_id> to something like #@, or some kind of unmodifiable, cryptic, small-sized value.

Thanks for any references or suggestions!

bc
William Brogden
Author and all-around good cowpoke
Rancher

Joined: Mar 22, 2000
Posts: 12759
    
    5
Actually that would be pretty trivial to program using SAX. The startElement and endElement methods would have to build and use a replacement table. However, before you embark on that you should look into how much compression the plain ZIP compression utility can provide.
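
As a rough illustration of the SAX approach described above, here is a minimal sketch. The class name TerseTags, the convert helper, and the t0, t1, ... naming scheme are all hypothetical choices made for this example, not something from an existing utility. It assumes a non-namespace-aware parser and buffers output in memory; a billion-record file would need to stream to a Writer instead, and the replacement table would have to be saved somewhere so the original names can be restored.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class TerseTags extends DefaultHandler {
    // Replacement table: original element name -> short generated name.
    private final Map<String, String> table = new LinkedHashMap<>();
    // In-memory output buffer (assumption: input is small enough for a demo).
    private final StringBuilder out = new StringBuilder();

    // Hypothetical naming scheme: t0, t1, t2, ... in order of first appearance.
    private String shortName(String qName) {
        String s = table.get(qName);
        if (s == null) {
            s = "t" + table.size();
            table.put(qName, s);
        }
        return s;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        out.append('<').append(shortName(qName));
        // Attribute names are copied unchanged; only element names are shortened.
        for (int i = 0; i < atts.getLength(); i++) {
            out.append(' ').append(atts.getQName(i)).append("=\"")
               .append(escape(atts.getValue(i))).append('"');
        }
        out.append('>');
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        out.append("</").append(shortName(qName)).append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        out.append(escape(new String(ch, start, length)));
    }

    // Re-escape characters that the parser has already unescaped.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    static String convert(String xml) throws Exception {
        TerseTags handler = new TerseTags();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<deal><intex_traunche_cusip_id>123</intex_traunche_cusip_id></deal>";
        System.out.println(convert(xml)); // prints <t0><t1>123</t1></t0>
    }
}
```

The sketch drops comments, processing instructions, and the XML declaration, so it is a starting point rather than a complete converter.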

I looked into both ZIP encoding and "fast infoset" for this article. ZIP encoding compressed my test file by more than a factor of 10 with only a minor effect on parsing time.
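
To try the compression route on your own data, something like the following sketch may help. It uses GZIP from java.util.zip, which is an assumption on my part (the test mentioned above may have used a different ZIP format), and the class name GzipXml and its helper methods are made up for this example. It compresses an XML document and then parses it straight from the compressed stream with SAX, so the file never has to be unpacked on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class GzipXml {
    // Compress a byte array with GZIP.
    static byte[] gzip(byte[] raw) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(raw);
        }
        return buf.toByteArray();
    }

    // Parse directly from the compressed stream and count the elements seen.
    static int countElements(byte[] gzipped) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                count[0]++;
            }
        };
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        }
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // Synthetic test data with long, repetitive tag names.
        StringBuilder sb = new StringBuilder("<deals>");
        for (int i = 0; i < 10000; i++) {
            sb.append("<intex_traunche_cusip_id>").append(i)
              .append("</intex_traunche_cusip_id>");
        }
        sb.append("</deals>");

        byte[] raw = sb.toString().getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.println("raw bytes:     " + raw.length);
        System.out.println("gzipped bytes: " + packed.length);
        System.out.println("elements:      " + countElements(packed));
    }
}
```

The ratio you get depends heavily on how repetitive the tags and values are, so it is worth measuring on a sample of the real file before writing a custom tag converter.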

Let us know what you come up with. I think a lot of people are worried about large XML files.

Bill
bob connolly
Ranch Hand

Joined: Mar 10, 2004
Posts: 204
Thanks William, very good article!

I'm going to take a closer look into the specifications for that Fast Infoset technique!

And it's good to know that the zipping is about the best anyone can do for right now!

Have a good one William!

bc
 