I want to filter raw text from text files or database entries before publishing it on a web site. The problems include filtering out existing HTML tags, replacing email addresses with HTML email links, and creating HTML links from web addresses (starting with http). The idea is to do it in a general enough way that it is easy to add and replace filters later on.
To start with, I'd like input on how to make the replacements. Has anyone done this in a good or not-so-good way?
Thanx. Had a quick look at htmlparser, but it seems to be a bit overkill for my purposes.
Here's what I did (quick and dirty):
... and so on. Applying these methods one after another (as is the idea behind the stuff) seems a bit inefficient, with all the String stuff going on. Any ideas on how to improve this? Not too sure about the regexps either ...
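One possible answer to the efficiency worry, as a minimal sketch rather than the code from the post above: compile each regular expression once into a `Pattern` and keep the filters in a list, so adding or replacing a filter is a one-line change. All names here (`TextFilters`, `STRIP_TAGS`, `LINK_URLS`, `LINK_EMAILS`, `DEFAULT_CHAIN`) are made up for illustration, the patterns are deliberately simple assumptions, and the code assumes Java 8 or later.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;
import java.util.regex.Pattern;

// Hypothetical helper class; names and patterns are illustrative assumptions.
final class TextFilters {
    // Compiled once and reused; String.replaceAll would recompile on every call.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    private static final Pattern URL = Pattern.compile("\\bhttps?://[^\\s<>\"]+");
    private static final Pattern EMAIL =
            Pattern.compile("\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}\\b");

    // Each filter is just a String -> String function.
    static final UnaryOperator<String> STRIP_TAGS =
            s -> TAG.matcher(s).replaceAll("");
    static final UnaryOperator<String> LINK_URLS =
            s -> URL.matcher(s).replaceAll("<a href=\"$0\">$0</a>");
    static final UnaryOperator<String> LINK_EMAILS =
            s -> EMAIL.matcher(s).replaceAll("<a href=\"mailto:$0\">$0</a>");

    // Order matters: strip tags before generating new ones.
    static final List<UnaryOperator<String>> DEFAULT_CHAIN =
            Arrays.asList(STRIP_TAGS, LINK_URLS, LINK_EMAILS);

    // Runs the text through every filter in list order.
    static String apply(List<UnaryOperator<String>> filters, String text) {
        String result = text;
        for (UnaryOperator<String> filter : filters) {
            result = filter.apply(result);
        }
        return result;
    }
}
```

Calling `TextFilters.apply(TextFilters.DEFAULT_CHAIN, rawText)` runs the three filters in order. Known limits of this sketch: the tag pattern also eats text such as `a < b and c > d`, the URL pattern swallows trailing punctuation, a URL containing `@` can be hit again by the email filter, and nothing escapes `&` or stray `<` characters. Each pass still creates a new String, but for page-sized text that is usually minor compared with recompiling the patterns on every call.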
Why reinvent the wheel? You may think using an existing toolkit is overkill now, but before you know it you've rewritten over half its functionality...
As to email addresses: detecting them is highly unreliable (unless your data is tabulated to tell you exactly where they are). There are too many things that can go wrong. Is email@example.com an email address or not? And what about firstname.lastname@example.org?
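To make the point concrete, here is a small sketch with a deliberately simple, made-up pattern (the class name `EmailGuess` and the regex are assumptions, not taken from any toolkit). It only checks for the shape `local@domain.tld`, so it can both flag text that is not an address and miss text that is one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class; the pattern is an illustrative assumption.
final class EmailGuess {
    // Shape check only: something@something.letters
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Returns every substring that merely looks like an address.
    static List<String> candidates(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }
}
```

For the input `build 2.0@beta.release is out, ask joe@localhost.` this pattern reports `2.0@beta.release`, which is probably not an address, and skips `joe@localhost`, which may well be deliverable on an intranet. A stricter or looser pattern only moves the errors around; the text alone does not say which strings are real mailboxes.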