Substitute HTML tags?

 
Ranch Hand
Posts: 186
Hey guys,

I want to substitute or strip out the HTML tags in text I receive from a previously submitted form, so that users cannot submit HTML tags that change how that text is presented later.

What is the smartest way of doing this?
There's got to be some class in the Java class library that helps with identifying HTML tags, but I haven't found one so far.

Thanks in advance.

Regards
 
Sheriff
Posts: 67747
Are you talking about situations where a customer enters something like "</html>" for their name or something?

The best way to handle that is to HTML-escape the string whenever you display text on a page that came from user entry. If you are using the JSTL on your JSP pages (and if not, why not?), the <c:out> tag will automatically convert angle bracket characters to their HTML entity equivalents.
[ May 02, 2005: Message edited by: Bear Bibeault ]
 
Dominic Stengård
Ranch Hand
Posts: 186
Thanks guys for your quick reply.

I'm actually not using JSP to display the user input (I'm aware that would be way more convenient, but I'm building on top of an older system); the page that shows the user input is generated by a plain old servlet.

How can I HTML-escape the string from a servlet? I haven't used that technique before.

Thanks again!

Regards
 
Sheriff
Posts: 13411
Look at the replaceAll method of the java.lang.String class if you're using 1.4 or higher (otherwise, look at replace).

Change any instance of '<' to "&lt;" and any instance of '>' to "&gt;".
[ May 02, 2005: Message edited by: Ben Souther ]
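
A minimal sketch of the replaceAll suggestion above, assuming Java 1.4 or later. The class and method names are made up for illustration, and returning an empty string for null input is just one possible choice:

```java
// Hypothetical helper; the class and method names are illustrative only.
final class AngleBracketEscaper {

    // Replaces each angle bracket with its HTML entity.
    // replaceAll treats its first argument as a regular expression;
    // "<" and ">" have no special meaning there, so they match literally.
    static String escapeAngleBrackets(String input) {
        if (input == null) {
            return ""; // assumption: treat missing input as empty text
        }
        return input.replaceAll("<", "&lt;")
                    .replaceAll(">", "&gt;");
    }

    public static void main(String[] args) {
        System.out.println(escapeAngleBrackets("</html>")); // prints &lt;/html&gt;
    }
}
```

In a servlet, the idea would be to pass every user-supplied string through such a helper just before writing it to the response.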
 
Bear Bibeault
Sheriff
Posts: 67747
Most HTML-escapers will also substitute &amp; for the & character.
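
One detail worth showing when the ampersand is added to the mix: with chained replacements, the order matters. A rough sketch follows (hypothetical class and method names); the second method is deliberately wrong and is there only to show the double-escaping that happens if & is handled after <:

```java
// Hypothetical helper; names are illustrative only.
final class AmpersandFirstEscaper {

    // Escapes &, < and >. The ampersand is handled first: otherwise the
    // '&' in the freshly inserted "&lt;" and "&gt;" would be escaped again.
    static String escape(String input) {
        return input.replaceAll("&", "&amp;")
                    .replaceAll("<", "&lt;")
                    .replaceAll(">", "&gt;");
    }

    // Deliberately wrong order, kept only to show the double-escaping problem.
    static String escapeWrongOrder(String input) {
        return input.replaceAll("<", "&lt;")
                    .replaceAll("&", "&amp;");
    }

    public static void main(String[] args) {
        System.out.println(escape("a < b & c"));       // prints a &lt; b &amp; c
        System.out.println(escapeWrongOrder("a < b")); // prints a &amp;lt; b
    }
}
```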
 
Ranch Hand
Posts: 225
There are a few other HTML tags that are escaped as well, right?

So, in effect, make a list of all HTML tags and a corresponding list of their replacements, and do a String.replace or String.replaceAll for each element of the list?

Or use a HashMap with the HTML tags as keys and the escape sequences as values, and do the String.replace/replaceAll.

I guess that will work, but is that the most efficient way of doing it?
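
On the efficiency question, one common alternative to repeated replace calls is a single pass over the characters, since only a handful of characters need escaping rather than a list of tags. The following is a sketch under assumptions, not a definitive implementation: the class name is hypothetical, StringBuilder assumes Java 5 or later, and escaping the two quote characters is an extra that matters mainly when the text ends up inside an attribute value:

```java
// Hypothetical helper; names are illustrative only.
final class SinglePassEscaper {

    // Walks the input once and appends either the character itself
    // or its HTML entity to the output buffer.
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("<b>Tom & Jerry</b>"));
        // prints &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
    }
}
```

Because each character is looked at exactly once, the order-of-replacement problem with & does not arise here.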
 
Ben Souther
Sheriff
Posts: 13411

Originally posted by Neeraj Dheer:
There are a few other HTML tags that are escaped as well, right?

So, in effect, make a list of all HTML tags and a corresponding list of their replacements, and do a String.replace or String.replaceAll for each element of the list?

Or use a HashMap with the HTML tags as keys and the escape sequences as values, and do the String.replace/replaceAll.

I guess that will work, but is that the most efficient way of doing it?



It's not a tag if it doesn't start with <.
 
Neeraj Dheer
Ranch Hand
Posts: 225


It's not a tag if it doesn't start with <.



Yup, I couldn't agree more.
I had the same problem while using XML/XSLT in servlets.
I had to escape all the HTML characters in JavaScript etc. in the XSL files, or else the transformation would throw an error.

Perhaps we could have some kind of 'intelligent' mechanism for escaping only the 'right characters'?
 
Author and all-around good cowpoke
Posts: 13078
The Java regular expression package is your friend when doing this sort of thing. See java.util.regex.Pattern etc. - your class can have a static Pattern preset to recognize certain character sequences - for example:

It takes a bit of fiddling but works fast when you have it figured out.
Bill
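
As a rough illustration of the static Pattern idea described above (this is not the example from the original post, which is missing here), a class could precompile a character class once and reuse it for every call. The class name, the pattern, and the choice of characters are all assumptions:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names and the chosen pattern are illustrative only.
final class PatternEscaper {

    // Compiled once and reused; Pattern instances are immutable and thread-safe.
    private static final Pattern SPECIAL = Pattern.compile("[&<>]");

    static String escape(String input) {
        Matcher m = SPECIAL.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String entity;
            switch (m.group().charAt(0)) {
                case '&': entity = "&amp;"; break;
                case '<': entity = "&lt;";  break;
                default:  entity = "&gt;";  break; // only '>' is left
            }
            // quoteReplacement guards against '$' or '\' in the replacement text.
            m.appendReplacement(out, Matcher.quoteReplacement(entity));
        }
        m.appendTail(out); // copy whatever follows the last match
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("1 < 2 && 3 > 2"));
        // prints 1 &lt; 2 &amp;&amp; 3 &gt; 2
    }
}
```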
 
Ben Souther
Sheriff
Posts: 13411

Originally posted by William Brogden:
The Java regular expression package is your friend when doing this sort of thing. See java.util.regex.Pattern etc. - your class can have a static Pattern preset to recognize certain character sequences - for example:

It takes a bit of fiddling but works fast when you have it figured out.
Bill



Yes, the String object's replaceAll method that I mentioned in my earlier post takes a regular expression as its first argument.
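
A small sketch of what that means in practice (hypothetical class name). Because the first argument is a regular expression, metacharacters such as the dot need escaping to match literally, and a pattern can match whole tags. The tag-stripping pattern below is a naive assumption that will not cope with every kind of markup:

```java
// Hypothetical demo class; names are illustrative only.
final class RegexArgumentDemo {

    // Naive tag removal: deletes anything between '<' and the next '>'.
    // This is a sketch, not a robust HTML parser.
    static String stripTags(String input) {
        return input.replaceAll("<[^>]*>", "");
    }

    public static void main(String[] args) {
        // "." is a regex metacharacter that matches any character.
        System.out.println("a.b".replaceAll(".", "-"));   // prints ---
        // Escaped, it matches only a literal dot.
        System.out.println("a.b".replaceAll("\\.", "-")); // prints a-b
        System.out.println(stripTags("<b>bold</b> text")); // prints bold text
    }
}
```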