Error in HTML escaping

Wouter Oet
Saloon Keeper

Joined: Oct 25, 2008
Posts: 2700

When I post this sentence:

List<List<?>> lists = new ArrayList<List<?>>();

and the "Disable HTML in this message" checkbox is unchecked, the result is:

List<List><?>> lists = new ArrayList<List><?>>();

When the checkbox is checked, it works fine.


"Any fool can write code that a computer can understand. Good programmers write code that humans can understand." --- Martin Fowler
Please correct my English.
W. Joe Smith
Ranch Hand

Joined: Feb 10, 2009
Posts: 710
I see no difference between the two. What is everyone else seeing?


SCJA
When I die, I want people to look at me and say "Yeah, he might have been crazy, but that was one zarkin frood that knew where his towel was."
Wouter Oet
Saloon Keeper

Joined: Oct 25, 2008
Posts: 2700

I see
List [Less-than sign] List [More-than sign] [Less-than sign] ? [More-than sign][More-than sign]
while it should be:
List [Less-than sign] List [Less-than sign] ? [More-than sign][More-than sign]

I hope you can read it
W. Joe Smith
Ranch Hand

Joined: Feb 10, 2009
Posts: 710
OK, I lied. I see it now.

So many signs and no TV and no beer make Homer something...something...
Vikas Kapoor
Ranch Hand

Joined: Aug 16, 2007
Posts: 1374
I think this is a known issue. Search this forum.
Wouter Oet
Saloon Keeper

Joined: Oct 25, 2008
Posts: 2700

In my first search attempt (before I posted) I found nothing. In my second search I found this.
But using &lt; instead of < doesn't sound like a solution to this problem.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41124
    
This is a known issue with the forum software. It uses a library for sorting out the HTML, and that library gets confused by Java generics (which also use angle brackets, but have rather different notions about what constitutes well-formedness). So far, we haven't been able to get anyone to look into this :-(


Ping & DNS - my free Android networking tools app
Wouter Oet
Saloon Keeper

Joined: Oct 25, 2008
Posts: 2700

I looked at the code of the forum software. I found a lot of undocumented code and somewhat old technologies which I'm not familiar with.

I also found this:


That looks right. So I viewed ViewCommon.java:260

That also looks right, but above that method I found this:

And that is wrong because the replacement for the less-than sign is missing a semicolon.
I didn't test whether correcting this would fix our problem, but it's a start.
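
For illustration only, here is a minimal sketch of what a correct escaping helper could look like. The jforum source is not reproduced in this thread, so this is not that code: the class name `HtmlEscapeSketch` and the method `escapeHtml` are hypothetical. The point is simply that every entity, including the one for the less-than sign, must end with a semicolon.

```java
public class HtmlEscapeSketch {

    // Hypothetical helper, NOT the actual jforum method.
    // Replaces the characters that are special in HTML with their entities.
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;");  break;
                case '<': out.append("&lt;");   break; // entity ends with ';'
                case '>': out.append("&gt;");   break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The sentence from the first post in this thread.
        System.out.println(
            escapeHtml("List<List<?>> lists = new ArrayList<List<?>>();"));
    }
}
```

Text escaped this way contains no raw angle brackets, so an HTML parser further down the line has nothing to misread as a tag.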
Christophe Verré
Sheriff

Joined: Nov 24, 2005
Posts: 14687
    

Thank you. A semicolon is indeed missing. But it appears that this method is not used. The problem comes from a third-party library, not from jforum.


[My Blog]
All roads lead to JavaRanch
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41124
    
That's an interesting find, but a quick grep through the source shows that this method isn't called from anywhere (maybe not too surprising since it's called "espaceHtml" instead of "escapeHtml" - not too obvious :-)

No, the problem is somewhere within the http://htmlparser.sourceforge.net/ library, which has bugs, but is abandoned now. I recall that we fixed at least one bug in it (for which luckily a patch had been posted on SourceForge).
Wouter Oet
Saloon Keeper

Joined: Oct 25, 2008
Posts: 2700

I don't know if the method is used. Eclipse couldn't find any references. However, the method could be called from the "view", because Java code is used there and I don't think Eclipse can track that.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41124
    
Just to show that the library has other problems as well: This is the RSS feed for the Ranch Office forum (the link for which you can find on the forum home page). The entry for this topic starts with "When I post this sentence: >" - which has an extra closing angle bracket; that can be found in many RSS feed entries. It's trying to sanitize HTML by adding angle brackets where it thinks they're missing, but is thrown off by the generics notation, since that isn't nested in the same way as HTML/XML.
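
To make the failure mode concrete, here is a toy model of a "repair missing brackets" pass. It is an assumption for illustration, not the htmlparser library's actual algorithm, and the names `NaiveTagRepairSketch` and `repair` are hypothetical. The rule is: if a new `<` appears while a tag is still open, assume the previous tag forgot its `>` and insert one.

```java
public class NaiveTagRepairSketch {

    // Toy model only: treats every '<' as the start of a tag and
    // inserts a '>' whenever a tag appears to be left unclosed.
    static String repair(String text) {
        StringBuilder out = new StringBuilder();
        boolean inTag = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '<') {
                if (inTag) {
                    out.append('>'); // "fix" the tag that looks unclosed
                }
                inTag = true;
            } else if (c == '>') {
                inTag = false;
            }
            out.append(c);
        }
        if (inTag) {
            out.append('>'); // close a tag still open at end of input
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(
            repair("List<List<?>> lists = new ArrayList<List<?>>();"));
        // prints: List<List><?>> lists = new ArrayList<List><?>>();
    }
}
```

Under this rule, nested generics get an extra `>` after the inner type name, which matches the mangled output quoted in the first post. That the real library works this way is only a guess; the sketch shows why HTML-style nesting assumptions and generics notation do not mix.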
Wouter Oet
Saloon Keeper

Joined: Oct 25, 2008
Posts: 2700

I had some posts that had a > at the end. I thought I had made a typo, but now I know it was the library. I did a test run with version 1.5 of the library (currently used by jforum) and I can confirm that the library is to blame.
Wouter Oet
Saloon Keeper

Joined: Oct 25, 2008
Posts: 2700

Version 1.6 doesn't solve it and 2.0SNAPSHOT has a different architecture. If I have time for it I'll look into it tomorrow.
Wouter Oet
Saloon Keeper

Joined: Oct 25, 2008
Posts: 2700

The snapshot version doesn't have a different architecture after all; my IDE thought it would be funny to mark valid imports as invalid. Now the bad news: the snapshot version doesn't solve it either.
Now I'm going to bed. It's 2:30 AM.
Christophe Verré
Sheriff

Joined: Nov 24, 2005
Posts: 14687
    
  16

Good night
Wouter Oet
Saloon Keeper

Joined: Oct 25, 2008
Posts: 2700

I've been thinking (to be honest, I've been dreaming about the problem). A simple solution would be to make "Disable HTML in this message" checked by default, because 99.9% of posts contain no HTML.
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41124
    
Hmm ... it is true that for most HTML tags we allow, UBB alternatives exist. Still, it would be nice to fix that library for the RSS feeds.
Wouter Oet
Saloon Keeper

Joined: Oct 25, 2008
Posts: 2700

That is true. I tried to put the text in <![CDATA[ tags but that didn't help >
Wouter Oet
Saloon Keeper

Joined: Oct 25, 2008
Posts: 2700

But will my suggestion be implemented?
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41124
    
That checkbox is based on the "Allow HTML" setting in your profile, so you can turn it off by default for all of your posts. Given that -thanks to UBB- there is really not much need for HTML posting, I think it makes sense to change the profile to not allow HTML by default for new accounts, though. Most people will probably leave it turned off and never notice the difference :-)
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 41124
    
Generics code is now properly displayed, no matter what the "Disable HTML" setting is set to.
 