validating a byte array for some encoding

 
Surasak Leenapongpanit
Ranch Hand
Posts: 341
Hi all
I have a byte array that may be converted to a String with some specified encoding, like so:
String encodedChars = new String(bytes, encoding);
If the specified encoding is not supported, this throws an exception. If however there are invalid characters in the byte array, they are simply dropped from the String result - I wish I could get an exception.
How can I check that all characters in the byte array are valid for the specified encoding?
 
Lasse Koskela
author
Sheriff
Posts: 11962
Do you know how long (how many characters) the resulting String should be? That would be easy to check. Other than that, I have no clue.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
You need the java.nio.charset package in JDK 1.4+:
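A minimal sketch of that approach, assuming a CharsetDecoder configured to report errors rather than replace them. The class name StrictDecode and the method decodeStrict are made up for illustration:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper class, named for this example only.
final class StrictDecode {

    // Decodes the bytes strictly: bad input raises CharacterCodingException
    // instead of being silently substituted in the resulting String.
    static String decodeStrict(byte[] bytes, String encoding)
            throws CharacterCodingException {
        CharsetDecoder decoder = Charset.forName(encoding).newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) {
        byte[] bad = { 'a', 'b', (byte) 0x80, 'c' }; // 0x80 is not valid US-ASCII
        try {
            System.out.println(decodeStrict(bad, "US-ASCII"));
        } catch (CharacterCodingException e) {
            System.out.println("Invalid input: " + e);
        }
    }
}
```

Charset.forName still throws an unchecked exception for an unsupported or illegal encoding name, so that case stays separate from invalid byte content.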

Unfortunately the CharacterCodingException doesn't seem to include correct info about the position at which the error occurred - I keep getting "Input length = 1" even when the error isn't at the beginning of the string. I suppose you could loop through and decode each byte individually, to learn where the errors really are. But that's inelegant considering we're using nio, which is supposed to support bulk operations. Also it would be more complex if our target encoding were a variable-length encoding like UTF-8 rather than US-ASCII, since we don't know in advance how many bytes are required to make up a single char.
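One possible way around the per-byte loop is the lower-level decode(ByteBuffer, CharBuffer, boolean) method, which returns a CoderResult instead of throwing. Per the CharsetDecoder documentation, when the result is an error the bad bytes begin at the input buffer's position, so that position can serve as the offset. This also works for variable-length encodings such as UTF-8. A sketch, with FirstBadByte and firstInvalidOffset as made-up names and the 1024-char scratch buffer as an arbitrary choice:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;

// Hypothetical helper class, named for this example only.
final class FirstBadByte {

    // Returns the offset of the first byte that cannot be decoded,
    // or -1 if the whole array is valid for the given encoding.
    static int firstInvalidOffset(byte[] bytes, String encoding) {
        CharsetDecoder decoder = Charset.forName(encoding).newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(bytes);
        CharBuffer out = CharBuffer.allocate(1024); // scratch space, contents discarded
        while (true) {
            // endOfInput = true: a truncated trailing sequence counts as malformed
            CoderResult result = decoder.decode(in, out, true);
            if (result.isError()) {
                return in.position(); // bad bytes start at the buffer's position
            }
            if (result.isUnderflow()) {
                return -1; // all input consumed without error
            }
            out.clear(); // overflow: scratch buffer full, empty it and continue
        }
    }

    public static void main(String[] args) {
        byte[] bad = { 'a', 'b', (byte) 0x80, 'c' }; // 0x80 is not valid US-ASCII
        System.out.println(firstInvalidOffset(bad, "US-ASCII"));
    }
}
```

If the length of the bad sequence is also needed, result.length() provides it, which is the same number that shows up in the exception message.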
 