Win a copy of Re-engineering Legacy Software this week in the Refactoring forum
or Docker in Action in the Cloud/Virtualization forum!
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic

Multiple XML files from a Flat File To Individual xml files

 
Pradeep Seth
Greenhorn
Posts: 1
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
Hi,

I am getting a flat file, ex: doc,pdf,txt, that file contains Multiple xml files.
Now I want to split those xmls in to individual xml files.

INPUT - My Flat test.doc, file having two xml.

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>
</bookstore>

OUTPUT - Now the Java program will split into two xml files ex : xmlfile1.xml & xmlfile2.xml

Please help me.
 
Marco Ehrentreich
best scout
Bartender
Posts: 1294
IntelliJ IDE Java Scala
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
You could simply read the original file line by line and and start a new XML file when you reach an XML preamble like <?xml ...>. This shouldn't be too hard.

Marco
 
Ulf Dittmer
Rancher
Pie
Posts: 42967
73
  • Mark post as helpful
  • send pies
  • Quote
  • Report post to moderator
DOC and PDF are not "flat files", they're structured binary formats that can not be read as easily as text files. How would they contain XML files?
 
  • Post Reply
  • Bookmark Topic Watch Topic
  • New Topic