JSP encoding: ISO-8859 and UTF-8 differences for Spanish characters

Robert Garrido
Ranch Hand

Joined: Dec 11, 2008
Posts: 30
Hi all,


I'm writing a multi-language application and I'm using UTF-8 encoding in my JSPs:

<%@ page contentType="text/html;charset=UTF-8" language="java" %>

In order to render Spanish characters like á, é, í, ó, ú and more, I need to use HTML entities such as &#243; for ó.

If I use ISO-8859 I don't need to use entities to display Spanish characters, I just type directly from the keyboard.

The question is: why can't I put the Spanish characters directly from the keyboard in UTF-8? Why is it always necessary to escape them? Isn't ISO a subset of UTF-8?

What do I have to do to render Spanish characters directly from the keyboard without entities in UTF-8?

Thanks a lot
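
For reference on the "subset" question: ISO-8859-1 covers the first 256 Unicode code points, but its byte encoding is not a subset of UTF-8. Only the ASCII range (0x00 to 0x7F) is byte-identical; anything above that is one byte in ISO-8859-1 and two bytes in UTF-8. A minimal Java sketch (the class and method names are illustrative, not from the thread) that prints the bytes for ó in each encoding:

```java
import java.nio.charset.StandardCharsets;

public class EncodingBytes {
    // Render a byte array as space-separated uppercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String o = "\u00F3"; // ó, written as an escape so this file's own encoding does not matter
        System.out.println(hex(o.getBytes(StandardCharsets.ISO_8859_1))); // F3
        System.out.println(hex(o.getBytes(StandardCharsets.UTF_8)));      // C3 B3
        System.out.println(hex("a".getBytes(StandardCharsets.UTF_8)));    // 61 (same as ASCII)
    }
}
```

So a file saved as ISO-8859-1 but served as UTF-8 (or the reverse) will show wrong characters for ó even though plain ASCII text looks fine.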
Gamini Sirisena
Ranch Hand

Joined: Aug 05, 2008
Posts: 349
Are you saving the text in UTF-8 encoding through whatever editor you use for authoring the HTML files?
Robert Garrido
Ranch Hand

Joined: Dec 11, 2008
Posts: 30
Yes, I do. I use IntelliJ and I have set UTF-8 as the default encoding for all files.
Gamini Sirisena
Ranch Hand

Joined: Aug 05, 2008
Posts: 349
Are you using a character reference as &#<decimal code point value>; ?

What actually happens when you don't use the character reference but use the raw UTF-8 content? Does the browser display question marks or boxes? Or what actually happens?
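
The symptom usually hints at the direction of the mismatch. As a rough sketch (helper names are made up for illustration), the following Java code decodes the same character with the wrong charset in both directions and prints the resulting code points:

```java
import java.nio.charset.StandardCharsets;

public class Mojibake {
    // Bytes written as UTF-8 but read as ISO-8859-1: each non-ASCII
    // character turns into two Latin-1 characters.
    static String utf8ReadAsLatin1(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Bytes written as ISO-8859-1 but read as UTF-8: a lone high byte is
    // malformed UTF-8, so the decoder substitutes U+FFFD.
    static String latin1ReadAsUtf8(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    // Print code points instead of glyphs so the console encoding cannot interfere.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String o = "\u00F3"; // ó
        System.out.println(codePoints(utf8ReadAsLatin1(o))); // U+00C3 U+00B3
        System.out.println(codePoints(latin1ReadAsUtf8(o)));
    }
}
```

Two accented-looking characters in place of one suggest UTF-8 bytes read as a single-byte charset; question marks or boxes suggest single-byte data read as UTF-8.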
Robert Garrido
Ranch Hand

Joined: Dec 11, 2008
Posts: 30
Hi,

I use the characters directly from the keyboard, and the browser displays question marks and boxes, as you said.

Thanks!
Gamini Sirisena
Ranch Hand

Joined: Aug 05, 2008
Posts: 349
What is your browser? Could it be that there is no Unicode font capable of rendering the Spanish characters?
Eoin MacI
Greenhorn

Joined: Sep 17, 2009
Posts: 2
Gamini Sirisena wrote: What is your browser? Could it be that there is no Unicode font capable of rendering the Spanish characters?


I'm having a similar problem. Weirdly, most of the non-alphanumeric characters I'm trying to display work fine.
However, Á, Í, Ï, Ð, Á, Ý all display incorrectly.

á, é, í, ó, ú, É, Ó, Ú all display correctly.

The JSPs are encoded in UTF-8 (with BOM).
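
A UTF-8 byte order mark is the three bytes EF BB BF at the start of the file. To confirm what the editor actually saved, a small check like the one below can help; this is only a sketch, with `BomCheck` and `hasUtf8Bom` being illustrative names and the file path taken from the command line:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class BomCheck {
    // True if the data starts with the UTF-8 byte order mark EF BB BF.
    static boolean hasUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BomCheck <path-to-jsp>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(hasUtf8Bom(data) ? "UTF-8 BOM present" : "no UTF-8 BOM");
    }
}
```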

The page directive is set correctly to <%@page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8" language="java" session="true"%>

The html meta tag for Content type is set correctly to <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

In web.xml, I've set jsp encoding to be UTF-8
<jsp-config>
    <jsp-property-group>
        <url-pattern>*.jsp</url-pattern>
        <page-encoding>UTF-8</page-encoding>
    </jsp-property-group>
</jsp-config>

and I'm applying Spring's character encoding filter also
<filter>
    <filter-name>CharacterEncodingFilter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
    <init-param>
        <param-name>forceEncoding</param-name>
        <param-value>true</param-value>
    </init-param>
</filter>

<filter-mapping>
    <filter-name>CharacterEncodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
    <dispatcher>REQUEST</dispatcher>
    <dispatcher>FORWARD</dispatcher>
    <dispatcher>INCLUDE</dispatcher>
    <dispatcher>ERROR</dispatcher>
</filter-mapping>


The characters display incorrectly in Firefox, IE6 and Opera, although IE6 displays ÿ instead of the required chars, whereas FF and Opera display two characters: ?

Font is Arial - so shouldn't be a problem with the font!
Anyone have any ideas??

Thanks

Eoin
Gamini Sirisena
Ranch Hand

Joined: Aug 05, 2008
Posts: 349
Just some more input..

UTF-8 is supposed to preserve ASCII.. but not sure about ISO-8859-1..

Is Arial a Unicode font? I am not sure. On Windows, if you have Lucida Sans Unicode you should be OK on the font for Latin and other BMP ranges.

Is it possible to post a URL with this UTF-8 content so that we could check from our ends?
Eoin MacI
Greenhorn

Joined: Sep 17, 2009
Posts: 2
Gamini Sirisena wrote:Just some more input..

UTF-8 is supposed to preserve ASCII.. but not sure about ISO-8859-1..

Is Arial a Unicode font? I am not sure. On Windows, if you have Lucida Sans Unicode you should be OK on the font for Latin and other BMP ranges.

Is it possible to post a URL with this UTF-8 content so that we could check from our ends?


Actually, I have just discovered that the page itself works OK. I am also applying a SiteMesh filter, and that filter is what corrupts the character encoding.

If I disable the SiteMesh filter, it works fine!
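
One plausible explanation for why only Á, Í, Ï, Ð and Ý broke: the second UTF-8 byte of each (0x81, 0x8D, 0x8F, 0x90, 0x9D) has no character assigned in windows-1252, so a component that reads the UTF-8 output as windows-1252 and writes it back cannot preserve those bytes, while characters such as é or Ó survive the round trip. This is an assumption about the failure mode; the thread does not confirm which charset SiteMesh was using. A minimal Java sketch of that round trip (class and method names are illustrative):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Cp1252RoundTrip {
    static final Charset CP1252 = Charset.forName("windows-1252");

    // Misread UTF-8 bytes as windows-1252, write them back as windows-1252,
    // then decode the result as UTF-8 the way the browser would.
    static String roundTrip(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        String misread = new String(utf8, CP1252);   // undefined bytes become U+FFFD
        byte[] rewritten = misread.getBytes(CP1252); // U+FFFD is unmappable and becomes '?'
        return new String(rewritten, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this file independent of its own encoding.
        String[] samples = {
            "\u00C1", "\u00CD", "\u00CF", "\u00D0", "\u00DD", // Á Í Ï Ð Ý
            "\u00E1", "\u00E9", "\u00C9", "\u00D3", "\u00DA"  // á é É Ó Ú
        };
        for (String s : samples) {
            byte[] b = s.getBytes(StandardCharsets.UTF_8);
            System.out.printf("U+%04X -> %02X %02X survives=%b%n",
                    (int) s.charAt(0), b[0] & 0xFF, b[1] & 0xFF,
                    roundTrip(s).equals(s));
        }
    }
}
```

If the decorator's encoding is the culprit, the usual remedy is to make sure the decorator pages and the SiteMesh configuration declare UTF-8 as well.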
 