
Couldn't convert HTML to XML file due to Java I/O issue

 
Ranch Hand
Posts: 235
Hi All,

I am having difficulty saving a complete ABC.html file (fetched from a website, e.g. www.abc.com) in time for it to be converted to ABC.xml (using a combination of Saxon and the TagSoup parser). The issue may be caused by ABC.html being closed prematurely, before the XML conversion tools (light_html2xml, or Saxon with TagSoup) have finished reading it.



Note that ABC.html is created successfully. The question is: at what stage has it been completely written, and at what point does the conversion tool start reading it?

There is no problem with either of the conversion tools; I have used both extensively, including in an earlier program. More importantly, I have posted this issue to several other forums (http://forums.sun.com/thread.jspa?threadID=5343084, https://coderanch.com/t/129985/XML/Cannot-close-XML-file-used, http://www.stylusstudio.com/xmldev/200810/post40120.html, http://www.stylusstudio.com/xmldev/200810/post50120.html), and the symptoms indicate an I/O issue rather than anything else.

This issue has plagued me for months, during which I thought the problem was in the XML conversion tool (light_html2xml, or Saxon with TagSoup).

I have exhausted every effort but still could not find a solution.

The programs are running on JDK 1.6.0_06, NetBeans 6.1, JDOM 1.1, Saxon 6.5.5 and TagSoup 1.2 on Windows XP.

This question has also been posted on http://forums.sun.com/thread.jspa?threadID=5346755
Any assistance would be much appreciated.

Many thanks,

Jack
 
Rancher
Posts: 43081
You should flush and close any stream that's related to writing a file, before you start reading from that file.
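
A minimal sketch of that ordering, in case it helps. The class and method names (WriteThenRead, saveHtml, readHtml) and the UTF-8 encoding are assumptions made for illustration, not taken from the original program. It uses try/finally rather than try-with-resources so that it also compiles on JDK 1.6:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;

public class WriteThenRead {

    // Hypothetical helper: writes the downloaded markup to disk and does not
    // return until the stream has been flushed and closed.
    static void saveHtml(File target, String html) throws IOException {
        Writer out = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(target), "UTF-8"));
        try {
            out.write(html);
            out.flush();   // push buffered characters down to the file
        } finally {
            out.close();   // release the file handle even if write() failed
        }
    }

    // Hypothetical stand-in for the conversion step: reads the whole file back.
    static String readHtml(File source) throws IOException {
        Reader in = new BufferedReader(
                new InputStreamReader(new FileInputStream(source), "UTF-8"));
        try {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } finally {
            in.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("ABC", ".html");
        file.deleteOnExit();
        String html = "<html><body><p>hello</p></body></html>";

        saveHtml(file, html);              // writer is closed when this returns
        String readBack = readHtml(file);  // only now start reading

        System.out.println("Complete file read back: " + readBack.equals(html));
    }
}
```

The point is that the reading step (here readHtml, in your case the Saxon/TagSoup conversion) is only started after saveHtml has returned, i.e. after close() has run on the writer. With a BufferedWriter, the last part of the document can still be sitting in the buffer until flush() or close() is called, which would explain a file that exists but is incomplete when it is read.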
 
Jack Bush
Ranch Hand
Posts: 235
Hi Ulf,

You were spot on about this issue!

Reading ABC.html now works after I added the flush and close calls upstream, before the conversion step.

Thank you very much,

Jack
 