CharsetDecoder

Marcel de Jong
Ranch Hand

Joined: May 27, 2002
Posts: 54
Hello,

I have a question about the CharsetDecoder. According to some searches,
the following code should work:

I am expecting a CharacterCodingException, which is what I want, but unfortunately
no exception is thrown.

Does anyone know why the exception isn't thrown, and how I can rewrite the code
so that an exception will be thrown?

BTW: The Euro sign is not part of the cp437 encoding.

Thanks in advance.
Marcel
Scott Escue
Ranch Hand

Joined: Jan 20, 2005
Posts: 34
Marcel,

I'm new to the java.nio package and it's been a while since I worked with character encodings, but I think I can offer some help.

So it looks like the decoding in your code replaces any unmappable or malformed input by default, rather than reporting it via an exception. See the snippet below for how to indicate that you want errors reported. There's also an example of this in the thread you linked to.
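A minimal sketch of a decoder configured to report errors, not necessarily the code originally posted. The class and method names (StrictDecodeSketch, decodeStrict) are made up for illustration. The demo input is deliberately malformed UTF-8, because as far as I know IBM437 assigns a character to every byte value, so a strict IBM437 decoder has nothing to reject.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper class, for illustration only.
class StrictDecodeSketch {

    // Decodes the bytes strictly: malformed or unmappable input raises a
    // CharacterCodingException instead of being silently replaced.
    static String decodeStrict(byte[] bytes, String charsetName)
            throws CharacterCodingException {
        CharsetDecoder decoder = Charset.forName(charsetName)
                .newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        CharBuffer chars = decoder.decode(ByteBuffer.wrap(bytes));
        return chars.toString();
    }

    public static void main(String[] args) {
        // 0xC3 starts a two-byte UTF-8 sequence, but 0x28 is not a valid
        // continuation byte, so this input is malformed.
        byte[] malformedUtf8 = {(byte) 0xC3, (byte) 0x28};
        try {
            System.out.println(decodeStrict(malformedUtf8, "UTF-8"));
        } catch (CharacterCodingException e) {
            System.out.println("Rejected: " + e);
        }
    }
}
```

Note that the convenience method Charset.decode(ByteBuffer) always replaces bad input, so the CharsetDecoder has to be used directly to get an exception.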


That being said, I'm not sure decode is what you're looking for. You'll notice that you still won't get an exception after making the changes above.
In fact, if you loop through the individual bytes of your param variable and print each one, you'll get an output of -128. And if you print the result of your decode, you'll get the Ç character. The code page you linked to indicates that 128 maps to the Ç character (I'm conveniently ignoring the negative sign because I don't know why it's there or how it gets handled). So it at least appears that decode is working correctly.
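On the negative sign: Java's byte type is a signed 8-bit value, so the bit pattern 0x80 prints as -128, and masking with 0xFF recovers the unsigned value 128. A small sketch of that, with made-up names (SignedByteSketch, toUnsigned, decodeByte), assuming the IBM437 charset is available in your JRE:

```java
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
class SignedByteSketch {

    // Converts a signed Java byte (-128..127) to its unsigned value (0..255).
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    // Decodes a single byte using the named charset.
    static String decodeByte(byte b, String charsetName) {
        return new String(new byte[] {b}, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        byte b = (byte) 0x80;
        System.out.println(b);              // -128
        System.out.println(toUnsigned(b));  // 128

        // Print the code point instead of the character itself, so the
        // console encoding cannot distort the result.
        String decoded = decodeByte(b, "IBM437");
        System.out.printf("U+%04X%n", (int) decoded.charAt(0)); // U+00C7
    }
}
```

U+00C7 is the C with cedilla, which matches position 128 in the code page table.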

If you're trying to determine whether the Unicode character maps to the IBM437 character set, I believe you want to try to encode the character rather than decode it. You can create a CharsetEncoder that throws exceptions in the same manner as the CharsetDecoder is created above. Using that encoder to try to encode your Unicode character to IBM437 does in fact throw an UnmappableCharacterException.
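A rough sketch of that encoder-based check. The names (StrictEncodeSketch, isEncodable) are invented for this example, and it assumes the IBM437 charset is available in your JRE:

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper class, for illustration only.
class StrictEncodeSketch {

    // Returns true if every character of text can be encoded in the charset.
    static boolean isEncodable(String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName)
                .newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            encoder.encode(CharBuffer.wrap(text));
            return true;
        } catch (CharacterCodingException e) {
            // For the Euro sign and IBM437 this is an
            // UnmappableCharacterException.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isEncodable("\u20AC", "IBM437")); // Euro sign: false
        System.out.println(isEncodable("\u00C7", "IBM437")); // C with cedilla: true
    }
}
```

If you only need the yes/no answer and not the exception, CharsetEncoder.canEncode(CharSequence) should give the same result with less code.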

I hope this helps.
[ April 18, 2007: Message edited by: Scott Escue ]
 