My requirement is as follows:
We save template values as HTML markup, such as <table><tr><td>Name</td></tr></table>,
in an MS SQL database.
Now I need to write Java code that reads each such record and produces
a separate Word document for each record.
Another challenge: the code should be able to read more than six lakh (600,000) records
from the database table at a stretch.
So you need to extract data from HTML and write it to a Word document; is that correct? If so, you could use a library like NekoHTML or TagSoup to access the data in the HTML file, and then use the POI library to create a Word file. Here's an example of how to do the latter: http://faq.javaranch.com/java/CreateWordDocument
If the HTML data isn't a full document, but really just a snippet like the one you posted above, then a regular expression, or a series of string operations, should be sufficient to retrieve the data.
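To illustrate the regular-expression route for snippets like the one posted, here is a minimal sketch. The class and method names (TableCellExtractor, extractCells) are made up for this example, and it assumes the stored markup is a simple, non-nested table; it does not decode entities such as &amp;amp;. For anything more irregular, a real parser like NekoHTML or TagSoup is the safer choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls the text of each <td> cell out of a small HTML snippet.
class TableCellExtractor {

    // Matches <td ...>content</td>, ignoring case; DOTALL lets content span several lines.
    // The reluctant (.*?) stops at the first closing </td>, so nested tables are NOT handled.
    private static final Pattern CELL =
            Pattern.compile("<td\\b[^>]*>(.*?)</td>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static List<String> extractCells(String html) {
        List<String> cells = new ArrayList<>();
        Matcher m = CELL.matcher(html);
        while (m.find()) {
            // Drop any inline tags inside the cell (e.g. <b>) and trim surrounding whitespace.
            cells.add(m.group(1).replaceAll("<[^>]+>", "").trim());
        }
        return cells;
    }

    public static void main(String[] args) {
        // Uses the snippet from the original question.
        System.out.println(extractCells("<table><tr><td>Name</td></tr></table>"));
    }
}
```

The returned list can then be handed to whatever writes the Word file, one list per database record.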
Did you read the FAQ that Ulf posted earlier? Did you try the code out to see how it works? What do you have so far?
Thanks for your reply.
My requirement is slightly different. Let me clarify.
I need to extract the data from the HTML into a Word document, and the
look and feel of the Word document must be the same as that of the HTML content
rendered in the browser.
Another challenge: the code must be compatible enough to read more than
six lakh (600,000) records from the database at a stretch.
If it's full documents that you're converting, then the JODConverter library may be for you. It requires OpenOffice to be installed, but it sounds as if this is to be done on a server, so that shouldn't be a problem.
I don't understand how "compatibility" (of what?) should have an impact on performance; maybe you can elaborate.