Editing html tags in a *.htm file

Arindham Samanta

Joined: Jan 25, 2004
Posts: 1
I read an HTML file using java.io and stored its whole content in a StringBuffer. Now I want to replace the value of href="" in the anchor tags, i.e. <a href="">.
Mag Hoehme
Ranch Hand

Joined: Apr 07, 2002
Posts: 194
Hi Arindham,

I doubt that a StringBuffer on its own is a suitable object type to accomplish your task. As I understand it, you want to replace the URL in an href attribute with another one.

I would adopt the following strategy:

1. Get the original String.
2. Create an empty StringBuffer.
3. Scan the original String for href attributes.
4. Copy the portions that are not to be modified to the StringBuffer as they are, and write the new content in place of the portions to be modified.

Use two int variables, start and end, and String methods such as indexOf and substring.
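
The strategy above could be sketched roughly as follows. This is only an illustration under some assumptions: the class and method names (HrefReplacer, replaceHrefs) are made up for the example, every href is written in lower case with a double-quoted value, and every href in the file gets the same new URL.

```java
public class HrefReplacer {

    // Replaces the value of every double-quoted href="..." with newUrl.
    // Assumption: lower-case "href" directly followed by =" (no spaces).
    static String replaceHrefs(String html, String newUrl) {
        final String marker = "href=\"";
        StringBuffer out = new StringBuffer(html.length());
        int start = 0; // beginning of the text that has not been copied yet

        while (true) {
            int attr = html.indexOf(marker, start);
            if (attr < 0) {
                break; // no more href attributes
            }
            int valueStart = attr + marker.length();
            int end = html.indexOf('"', valueStart); // closing quote of the value
            if (end < 0) {
                break; // unterminated value: leave the rest untouched
            }
            out.append(html.substring(start, valueStart)); // unmodified part, including href="
            out.append(newUrl);                            // new content instead of the old value
            start = end;                                   // continue at the closing quote
        }
        out.append(html.substring(start)); // copy whatever is left
        return out.toString();
    }

    public static void main(String[] args) {
        String html = "<a href=\"old.html\">x</a> <a href=\"\">y</a>";
        System.out.println(replaceHrefs(html, "new.html"));
    }
}
```

Note that this replaces href values in any tag, not only in <a>, so a real file may need an extra check on the tag name.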

However, this approach can be optimized further for performance. Most String methods, such as substring(), create new String objects for their return values. To avoid these extra objects, you can work on a char array instead, copying the chars one by one to a StringBuffer (or to a second char array). For more information on using char arrays to boost performance, search Google for "String" and "performance".
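
A char array variant might look something like this. Again the names (HrefReplacerChars, matchesAt) are invented for the example, and it assumes lower-case, double-quoted href values as before. Whether it is really faster for your files is something to measure rather than take for granted.

```java
public class HrefReplacerChars {

    // Copies the input char by char, swapping in newUrl for each href value.
    static String replaceHrefs(String html, String newUrl) {
        char[] src = html.toCharArray();
        char[] marker = "href=\"".toCharArray();
        StringBuffer out = new StringBuffer(src.length);
        int i = 0;

        while (i < src.length) {
            if (matchesAt(src, i, marker)) {
                out.append(marker);  // keep href="
                out.append(newUrl);  // write the new value
                i += marker.length;
                // skip the old value; the closing quote is copied by the next iteration
                while (i < src.length && src[i] != '"') {
                    i++;
                }
            } else {
                out.append(src[i]);  // unmodified char, copied as it is
                i++;
            }
        }
        return out.toString();
    }

    // True if marker occurs in src starting at position pos.
    static boolean matchesAt(char[] src, int pos, char[] marker) {
        if (pos + marker.length > src.length) {
            return false;
        }
        for (int k = 0; k < marker.length; k++) {
            if (src[pos + k] != marker[k]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(replaceHrefs("<a href=\"old.html\">x</a>", "new.html"));
    }
}
```

Unlike the indexOf version, no intermediate String objects are created for the copied portions.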

Hope this helps.
[ June 25, 2004: Message edited by: Mag Hoehme ]

Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
Look into regular expressions; the Pattern class is well documented. You can match and replace complicated patterns with a line or three of code, plus only a few days of head-banging to create a good pattern. (Mild exaggeration.)
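
For the href case, a first attempt with java.util.regex might look like the sketch below. The pattern is only a starting point, not a complete HTML matcher: it assumes double-quoted values, and the class and method names are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HrefRegex {

    // Group 1: href, optional whitespace around '=', and the opening quote.
    // Group 2: the old value. Group 3: the closing quote.
    private static final Pattern HREF = Pattern.compile(
            "(\\bhref\\s*=\\s*\")([^\"]*)(\")", Pattern.CASE_INSENSITIVE);

    static String replaceHrefs(String html, String newUrl) {
        Matcher m = HREF.matcher(html);
        // quoteReplacement keeps '$' and '\' in the new URL from being
        // interpreted as group references.
        return m.replaceAll("$1" + Matcher.quoteReplacement(newUrl) + "$3");
    }

    public static void main(String[] args) {
        System.out.println(replaceHrefs("<A HREF = \"old.html\">x</A>", "new.html"));
    }
}
```

Single-quoted or unquoted attribute values would need a longer pattern, which is where the head-banging starts.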

You might also look into HTML parsers. I use This One quite happily and there are many others around. It uses the visitor pattern to help you scan a parsed document for particular tags and attributes like href. You can modify the DOM and rewrite it to HTML, though it may not be quite character-for-character exactly like it was before.
