HTML Parsing inside a Servlet

Anoop Krishnan
Ranch Hand

Joined: May 03, 2001
Posts: 163
I am developing a web tool in which I have to keep the HTML template separate from the logic. So I have developed a model as shown below.
Here is sample content from an HTML template that I use for my tool:
Thank you for purchasing the item
Total Price: <!PURCHASE_PRICE>
In the sample above, <!USERNAME> and <!PURCHASE_PRICE> are variable identifiers that are replaced with a value each time, and the result is sent to the browser by my JSP or servlet. In addition, I have to convert the number format according to the region or language the user chooses, so the output will be
Hai George
Thank you for purchasing the item
Total Price: 50.00 $
</html> in ENGLISH
Hai George
Thank you for purchasing the item
Total Price: 100,00 DM
</html> in GERMAN languages
Currently I treat the entire HTML page as a String object, find these tags with the indexOf() method, replace them using the substring() method, and use the Java internationalization classes to format the locale-specific information.
But this kind of parsing of the HTML file seems to be very slow.
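The locale-specific number formatting described above could look roughly like the sketch below. It is only an illustration: the class name PriceFormatter and the method formatAmount are made up for this example, and the currency labels ("$", "DM") are simply appended as in the sample output rather than taken from the locale.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical helper, not part of the original tool.
class PriceFormatter {
    // Formats an amount with exactly two decimals, using the
    // decimal separator of the given locale.
    static String formatAmount(double amount, Locale locale) {
        NumberFormat nf = NumberFormat.getNumberInstance(locale);
        nf.setMinimumFractionDigits(2);
        nf.setMaximumFractionDigits(2);
        return nf.format(amount);
    }

    public static void main(String[] args) {
        // Currency labels are appended by hand to mirror the sample output.
        System.out.println(formatAmount(50, Locale.US) + " $");       // 50.00 $
        System.out.println(formatAmount(100, Locale.GERMANY) + " DM"); // 100,00 DM
    }
}
```

NumberFormat.getCurrencyInstance(locale) would also pick the currency symbol, but its exact output depends on the locale data shipped with the JDK.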

I just want to know is there any body call my bean's Getter and Setter methods with "Please" in front - My favorite quip from Bugzilla
William Brogden
Author and all-around good cowpoke

Joined: Mar 22, 2000
Posts: 13035
This sounds like a job for (ta-da!) JavaServer Pages.
However, if you want to do it in a servlet, here is the way I have done something similar. I create an object that hangs on to the template like this:
1. Grab the entire HTML template.
2. Cut it up into smaller Strings between your specialty tag points. You end up with an array of Strings representing the plain text, and an array of tags that matches.
3. Your output method just marches through these arrays alternating:
a. output plain HTML
b. output specialty tag info
This way you only parse the template once.
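The steps above can be sketched as follows. This is one possible reading of the approach, not the poster's actual code: the class name CompiledTemplate is invented, and it assumes every "<!" in the template starts a specialty tag that ends at the next ">" (so a real "<!DOCTYPE" or "<!--" comment would need extra handling).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical class: parses the template once, renders it many times.
class CompiledTemplate {
    private final String[] text; // plain HTML chunks; always tags.length + 1 entries
    private final String[] tags; // tag names such as "USERNAME"

    CompiledTemplate(String template) {
        List<String> textParts = new ArrayList<>();
        List<String> tagParts = new ArrayList<>();
        int pos = 0;
        while (true) {
            int start = template.indexOf("<!", pos);
            int end = (start < 0) ? -1 : template.indexOf('>', start);
            if (start < 0 || end < 0) {
                // No further complete tag: the rest is plain text.
                textParts.add(template.substring(pos));
                break;
            }
            textParts.add(template.substring(pos, start));
            tagParts.add(template.substring(start + 2, end));
            pos = end + 1;
        }
        text = textParts.toArray(new String[0]);
        tags = tagParts.toArray(new String[0]);
    }

    // Marches through the arrays, alternating plain HTML and tag values.
    // Unknown tags are rendered as an empty string.
    String render(Map<String, String> values) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < tags.length; i++) {
            out.append(text[i]);
            out.append(values.getOrDefault(tags[i], ""));
        }
        out.append(text[tags.length]);
        return out.toString();
    }

    public static void main(String[] args) {
        CompiledTemplate t = new CompiledTemplate(
                "Hai <!USERNAME>\nTotal Price: <!PURCHASE_PRICE>\n");
        System.out.print(t.render(
                Map.of("USERNAME", "George", "PURCHASE_PRICE", "50.00 $")));
    }
}
```

In a servlet, the compiled template would typically be built once (for example in init()) and render() called per request, writing to the response writer instead of building a String.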

Peter den Haan
Ranch Hand

Joined: Apr 20, 2000
Posts: 3252
Alternatively... Your biggest problem is that, after generation of the HTML, you have to scan the entire generated document for your tags at run time. Instead of this, you could use tag libraries to create custom JSP tags. Your tag would essentially be replaced by calls to the taglib code at compile time. Much more efficient.
- Peter

[This message has been edited by Peter den Haan (edited May 14, 2001).]
David O'Meara

Joined: Mar 06, 2001
Posts: 13459

I can hardly say I'm an expert at it yet, but I highly recommend Cocoon or any other XSL transformation processor for dynamic pages. In 20 words or less, it allows you to define your data as an XML document, then define an XSL document which maps from the XML to an HTML doc. (25 words, oops)
From experience there is always a problem once you include even a single line of code in HTML since it can no longer be altered/updated by creatives. It must now be maintained by programmers.
XSLT has a few problems (it outputs well-formed HTML, which is not the same as what people are used to, and it is an extra level of technology to learn) but I love the idea of it and it promises better separation of the presentation and business layers.
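As a rough illustration of the XML-to-HTML mapping described above, here is a minimal sketch using the JDK's built-in javax.xml.transform API rather than Cocoon. The class name XslPageDemo, the XML element names (purchase, user, price), and the stylesheet are all invented for this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class: data lives in XML, presentation in XSL.
class XslPageDemo {
    static final String XML =
            "<purchase><user>George</user><price>50.00 $</price></purchase>";

    static final String XSL =
            "<xsl:stylesheet version=\"1.0\""
          + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"html\"/>"
          + "<xsl:template match=\"/purchase\">"
          + "<html><body>"
          + "<p>Hai <xsl:value-of select=\"user\"/></p>"
          + "<p>Total Price: <xsl:value-of select=\"price\"/></p>"
          + "</body></html>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

    // Applies the stylesheet to the XML and returns the generated HTML.
    static String transform(String xml, String xsl) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(xsl)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)),
                        new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSL transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(XML, XSL));
    }
}
```

For repeated use, TransformerFactory.newTemplates() can compile the stylesheet once so that each request only creates a Transformer from the cached Templates object.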
Frank Carver

Joined: Jan 07, 1999
Posts: 6920
Don't forget the many tried-and-tested template frameworks which are already available. I personally prefer WebMacro but you should also check out Velocity (from the Apache Project) and FreeMarker.
In many cases using a templating system is much simpler and more flexible than using heavyweight solutions like JSP or XSLT. You can add WebMacro support to any Java program just by including a single jar file in your classpath - you don't need a specific type of server or all the complexities of installing XML parsers and XSLT processors. It's fast too!

Ganesh Anekar
Ranch Hand

Joined: May 13, 2001
Posts: 36
Write a class for the HTML template and use the StringTokenizer class.
Create all the HTML as objects, put them in one package, and use that package in your servlet code.
For code, you can refer to the book Developing Java Servlets (Sams Publishing).
You will get it... try.

[This message has been edited by Ganesh Anekar (edited May 15, 2001).]