Hi, we are using ESAPI to validate user input in a web-based application. Currently we are having trouble validating the content of an HTML editor (such as CKEditor or TinyMCE): we get an exception saying that mixed encoding was detected. It is thrown by a method called "canonicalize".
The reason is that any HTML content can potentially contain two encodings: URL encoding (for example %20) and HTML encoding (various HTML entities such as &amp;). From the HTML point of view this is completely valid.
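To illustrate what I mean, here is a minimal sketch using only the JDK. It is not ESAPI's actual detection logic; the class `MixedEncodingDemo` and its methods are hypothetical names made up for this example. It only shows that perfectly ordinary editor output can contain both kinds of encoding at once.

```java
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of ESAPI: a rough check for the two
// encoding schemes that can coexist in ordinary HTML editor output.
class MixedEncodingDemo {
    // A percent-encoded byte such as %20
    private static final Pattern URL_ENCODED =
            Pattern.compile("%[0-9A-Fa-f]{2}");

    // A named or numeric HTML entity such as &amp; &#39; or &#x27;
    private static final Pattern HTML_ENTITY =
            Pattern.compile("&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);");

    static boolean hasUrlEncoding(String s) {
        return URL_ENCODED.matcher(s).find();
    }

    static boolean hasHtmlEntity(String s) {
        return HTML_ENTITY.matcher(s).find();
    }

    // "Mixed" here simply means both schemes appear in the same input.
    static boolean isMixed(String s) {
        return hasUrlEncoding(s) && hasHtmlEntity(s);
    }

    public static void main(String[] args) {
        // Harmless markup of the kind a rich-text editor produces
        String editorOutput =
                "<p>Tom &amp; Jerry: <a href=\"/files/my%20report.pdf\">report</a></p>";
        System.out.println(isMixed(editorOutput)); // prints: true
    }
}
```

So a link with a space in its URL plus an ampersand in the text is already enough to count as "mixed", even though nothing malicious is going on.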
Of course, there is an option to switch off the detection of mixed encoding in ESAPI. However, the ESAPI documentation says it is preferable to keep it switched on to prevent XSS attacks.
So the question is: what is the correct way to validate such content?