JSP encoding: ISO-8859 and UTF-8 differences for Spanish characters

 
Ranch Hand
Posts: 30
Hi all,


I'm writing a multi-language application and I'm using UTF-8 encoding in my JSPs:

<%@ page contentType="text/html;charset=UTF-8" language="java" %>

In order to render Spanish characters such as á, é, í, ó, ú and others, I need to use HTML entities, for example &#243; for ó.

If I use ISO-8859, I don't need entities to display Spanish characters; I just type them directly from the keyboard.

My question is: why can't I type the Spanish characters directly from the keyboard in UTF-8? Why is it always necessary to escape them? Isn't ISO a subset of UTF-8?

What do I have to do to render Spanish characters typed directly from the keyboard, without entities, in UTF-8?

Thanks a lot
 
Ranch Hand
Posts: 378
Are you saving the text in UTF-8 encoding through whatever editor you use for authoring the HTML files?
 
Robert Garrido
Ranch Hand
Posts: 30
Yes, I do. I use IntelliJ and have set UTF-8 as the default encoding for all files.
 
Gamini Sirisena
Ranch Hand
Posts: 378
Are you using a character reference as &#<decimal code point value>; ?
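For anyone unsure what a decimal character reference looks like, here is a minimal sketch in Java. The class and method names (CharRefDemo, toDecimalReferences) are made up for illustration and are not part of any library. It replaces each non-ASCII code point with &#<decimal code point value>; and leaves ASCII untouched.

```java
final class CharRefDemo {
    /** Replaces every non-ASCII code point with a decimal numeric character reference. */
    static String toDecimalReferences(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 128) {
                out.appendCodePoint(cp);                 // plain ASCII is copied as-is
            } else {
                out.append("&#").append(cp).append(';'); // e.g. U+00F3 becomes &#243;
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // \u00F3 is ó; the escape keeps this source file pure ASCII.
        System.out.println(toDecimalReferences("canci\u00F3n")); // prints canci&#243;n
    }
}
```

Because the output is pure ASCII, it is displayed the same way whichever of the encodings discussed here the page is served in, which is why the entity workaround hides the underlying encoding mismatch rather than fixing it.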

What actually happens when you don't use the character reference but use the raw UTF-8 content? Does the browser display question marks or boxes? Or what actually happens?
 
Robert Garrido
Ranch Hand
Posts: 30
Hi,

I type the characters directly from the keyboard, and the browser displays question marks and boxes, as you said.

Thanks!
 
Gamini Sirisena
Ranch Hand
Posts: 378
What is your browser? Could it be that there is no Unicode font capable of rendering the Spanish characters?
 
Greenhorn
Posts: 2

Gamini Sirisena wrote: What is your browser? Could it be that there is no Unicode font capable of rendering the Spanish characters?



I'm having a similar problem. Weirdly, most of the non-alphanumeric characters I'm trying to display work fine.
However, Á, Í, Ï, Ð, Á, Ý all display incorrectly.

á, é, í, ó, ú, É, Ó, Ú all display correctly.

The JSPs are encoded in UTF-8 (with BOM).

The @page attribute is set correctly to <%@page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8" language="java" session="true"%>

The html meta tag for Content type is set correctly to <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

In web.xml, I've set jsp encoding to be UTF-8
<jsp-config>
  <jsp-property-group>
    <url-pattern>*.jsp</url-pattern>
    <page-encoding>UTF-8</page-encoding>
  </jsp-property-group>
</jsp-config>

and I'm applying Spring's character encoding filter also
<filter>
  <filter-name>CharacterEncodingFilter</filter-name>
  <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
  <init-param>
    <param-name>encoding</param-name>
    <param-value>UTF-8</param-value>
  </init-param>
  <init-param>
    <param-name>forceEncoding</param-name>
    <param-value>true</param-value>
  </init-param>
</filter>

<filter-mapping>
  <filter-name>CharacterEncodingFilter</filter-name>
  <!-- url-pattern (not servlet-name) is the element that matches a path such as /* -->
  <url-pattern>/*</url-pattern>
  <dispatcher>REQUEST</dispatcher>
  <dispatcher>FORWARD</dispatcher>
  <dispatcher>INCLUDE</dispatcher>
  <dispatcher>ERROR</dispatcher>
</filter-mapping>


The characters display incorrectly in Firefox, IE6 and Opera, although IE6 displays ÿ instead of the required chars, whereas FF and Opera display two characters: ?

Font is Arial - so shouldn't be a problem with the font!
Anyone have any ideas??

Thanks

Eoin
 
Gamini Sirisena
Ranch Hand
Posts: 378
Just some more input.

UTF-8 is supposed to preserve ASCII, but I'm not sure about ISO-8859-1.

Is Arial a Unicode font? I am not sure. On Windows, if you have Lucida Sans Unicode, you should be OK on the font for Latin and other BMP ranges.
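On the ASCII versus ISO-8859-1 point, a small sketch in Java may help. The class name EncodingBytesDemo and the helper hex are hypothetical, chosen for illustration. ISO-8859-1 and Unicode share the same code points up to U+00FF, but only the ASCII range (up to 0x7F) has the same bytes in UTF-8; everything above that takes two bytes, so ISO-8859-1 is not a byte-level subset of UTF-8.

```java
import java.nio.charset.StandardCharsets;

final class EncodingBytesDemo {
    /** Formats bytes as space-separated uppercase hex, e.g. "C3 B3". */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String ascii = "o";
        String accented = "\u00F3"; // ó, written as an escape so the source stays ASCII

        System.out.println(hex(ascii.getBytes(StandardCharsets.ISO_8859_1)));    // 6F
        System.out.println(hex(ascii.getBytes(StandardCharsets.UTF_8)));         // 6F
        System.out.println(hex(accented.getBytes(StandardCharsets.ISO_8859_1))); // F3
        System.out.println(hex(accented.getBytes(StandardCharsets.UTF_8)));      // C3 B3
    }
}
```

So a file saved as ISO-8859-1 but served as UTF-8 (or the reverse) shows plain ASCII text correctly while accented letters break, which matches the kind of symptom described in this thread.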

Is it possible to post a URL with this UTF-8 content so that we could check from our ends?
 
Eoin MacI
Greenhorn
Posts: 2

Gamini Sirisena wrote: Just some more input.

UTF-8 is supposed to preserve ASCII, but I'm not sure about ISO-8859-1.

Is Arial a Unicode font? I am not sure. On Windows, if you have Lucida Sans Unicode, you should be OK on the font for Latin and other BMP ranges.

Is it possible to post a URL with this UTF-8 content so that we could check from our ends?



Actually, I have just discovered that it works OK, but I am applying a SiteMesh filter, which is then corrupting the character encoding.

If I disable the SiteMesh filter, it works fine!
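The pattern reported earlier (Á, Í, Ï, Ð, Ý broken while á, é, É, Ó, Ú survive) is what one would expect if some layer read the UTF-8 bytes as windows-1252 and wrote them back out with the same charset. That is only an assumption: the thread does not say which charset the SiteMesh filter fell back to. In UTF-8 the five broken letters end in the bytes 0x81, 0x8D, 0x8F, 0x90 and 0x9D, which are exactly the positions windows-1252 leaves unassigned, so they cannot round-trip. The sketch below simulates that; the class RoundTripDemo and the method survives are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class RoundTripDemo {
    /**
     * Simulates a component that reads UTF-8 bytes using the wrong charset and
     * writes them back with that same charset, then reports whether the
     * original text can still be recovered as UTF-8.
     */
    static boolean survives(String text, Charset wrong) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        String misread = new String(utf8, wrong);    // the faulty decode step
        byte[] rewritten = misread.getBytes(wrong);  // the re-encode step
        return new String(rewritten, StandardCharsets.UTF_8).equals(text);
    }

    public static void main(String[] args) {
        // Assumption: the intermediate layer used windows-1252.
        Charset cp1252 = Charset.forName("windows-1252");
        String[] samples = {
            "\u00C1", "\u00CD", "\u00CF", "\u00D0", "\u00DD", // reported as broken
            "\u00E1", "\u00E9", "\u00C9", "\u00D3", "\u00DA"  // reported as fine
        };
        for (String s : samples) {
            System.out.println(String.format("U+%04X survives: %b",
                    s.codePointAt(0), survives(s, cp1252)));
        }
    }
}
```

If this is what happened, the usual remedy would be to make the decorating layer use UTF-8 explicitly instead of a platform default, but how to configure that depends on the SiteMesh version and is not covered in this thread.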
 