How to write chars in range x80 - xFF to a text file and read with java
|
Ravi Danum
Ranch Hand
Joined: Jan 13, 2009
Posts: 104
|
|
Hello,
I want to write some invalid, invisible characters to a file so I can test my code.
For example, I want to write characters in the range x80 - xFF to a text file.
How can I do this? I want to be able to read the text with Java and determine that
the characters are in the range x80 - xFF.
Thanks for any help.
-ravi
|
 |
Paul Clapham
Bartender
Joined: Oct 14, 2005
Posts: 16483
|
|
Here's the official description of the characters in that range: http://www.unicode.org/charts/PDF/U0080.pdf. As you can see, some of them are valid, some are valid and invisible (control characters), and only a couple are invalid (don't have any official description).
Now when you say you want to write them to a "text file", exactly what do you mean? A "text file" is one which contains text, so it doesn't make any sense to write data that isn't text into it. And when you write text to a file, that requires converting the (16-bit) chars to 8-bit bytes. This is done by using one of the available encodings. Usually encodings convert characters which they can't represent -- such as your "invalid invisible" characters -- to question marks, on the grounds that they aren't text.
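To make the question-mark behaviour concrete, here is a minimal sketch. It assumes US-ASCII as the target encoding, and the class and method names are made up for illustration. `String.getBytes(Charset)` substitutes the charset's replacement byte for anything it cannot encode, and for US-ASCII that byte is `?` (0x3F).

```java
import java.nio.charset.StandardCharsets;

class UnmappableDemo {
    // Encode with US-ASCII, which cannot represent anything above U+007F.
    // getBytes(Charset) silently substitutes the charset's replacement byte.
    static byte[] encodeAscii(String text) {
        return text.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // 'A', then U+0085 (a C1 control), then U+00E9 (e with acute accent)
        String text = "A\u0085\u00E9";
        for (byte b : encodeAscii(text)) {
            // Mask with 0xFF so the byte prints as an unsigned value.
            System.out.printf("0x%02X%n", b & 0xFF);
        }
    }
}
```

The two characters from the x80 - xFF range both come out as 0x3F, so the file no longer tells you which character was originally there.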
So there really doesn't seem to be any point in what you are trying to do. If you have a program which expects to work with text, then using non-text characters is a pointless sort of test. On the other hand if you aren't really processing text, but some binary format instead, then you should just work with bytes in your code instead of with characters.
|
 |
Ravi Danum
Ranch Hand
Joined: Jan 13, 2009
Posts: 104
|
|
Thank you so much for your reply.
I have to store it into a String. I would have to load the bytes into a String.
-ravi
|
 |
Jeff Verdegan
Bartender
Joined: Jan 03, 2004
Posts: 5892
|
|
Ravi Danum wrote:
Thank you so much for your reply.
I have to store it into a String. I would have to load the bytes into a String.
-ravi
You can't, at least not reliably. Depending on the charset used to decode them, some of the bytes in that range do not map to valid Unicode characters.
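Whether a given byte turns into a usable character depends on the charset used to decode it. The sketch below (hypothetical class and method names, with ISO-8859-1 and UTF-8 chosen only as two common examples) decodes single bytes from the x80 - xFF range both ways.

```java
import java.nio.charset.StandardCharsets;

class DecodeHighBytes {
    // ISO-8859-1 maps every byte value directly to the code point with the same value.
    static String decodeLatin1(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    // In UTF-8, a byte in 0x80..0xFF on its own is not a complete, valid sequence.
    // The String constructor substitutes U+FFFD (the replacement character).
    static String decodeUtf8(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] samples = { (byte) 0x80, (byte) 0xE9, (byte) 0xFF };
        for (byte b : samples) {
            byte[] single = { b };
            System.out.printf("byte 0x%02X  ISO-8859-1: U+%04X  UTF-8: U+%04X%n",
                    b & 0xFF,
                    (int) decodeLatin1(single).charAt(0),
                    (int) decodeUtf8(single).charAt(0));
        }
    }
}
```

With ISO-8859-1 each byte becomes the code point of the same value (U+0080, U+00E9, U+00FF). With UTF-8 each of these lone bytes becomes U+FFFD, so the original value is lost once it is in the String.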
|
 |
Jeff Verdegan
Bartender
Joined: Jan 03, 2004
Posts: 5892
|
|
Paul Clapham wrote:
So there really doesn't seem to be any point in what you are trying to do. If you have a program which expects to work with text, then using non-text characters is a pointless sort of test. On the other hand if you aren't really processing text, but some binary format instead, then you should just work with bytes in your code instead of with characters.
So you really need to clarify--to yourself first of all--what you're actually trying to do.
Does your code deal with text, and you need to test how it handles text that is outside the range it expects as input? Then don't worry about every byte from 0x80 to 0xFF. Worry about characters that are actual legal characters in the encodings you'll be dealing with and test some of them.
Does your code deal with text, and you need to test how it handles bytes that cannot be converted to text that it can handle? Then you don't care about chars, you care about bytes.
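For the second case, one possible approach is to decode with a `CharsetDecoder` that reports errors instead of replacing them. This is only a sketch under that assumption: the class name and the `isDecodable` helper are made up for illustration, and the charset is whatever your code expects its input to be in.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictDecode {
    // Returns true only if the whole byte array is valid input for the given charset.
    static boolean isDecodable(byte[] data, Charset charset) {
        try {
            charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            // Thrown for malformed or unmappable input when the action is REPORT.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] loneHighByte = { (byte) 0x80 };
        byte[] validUtf8 = { (byte) 0xC3, (byte) 0xA9 }; // UTF-8 encoding of U+00E9

        System.out.println(isDecodable(loneHighByte, StandardCharsets.UTF_8));      // false
        System.out.println(isDecodable(validUtf8, StandardCharsets.UTF_8));         // true
        System.out.println(isDecodable(loneHighByte, StandardCharsets.ISO_8859_1)); // true
    }
}
```

A test for "bytes that cannot be converted to text" can then feed byte arrays like `loneHighByte` to the code under test and check that it rejects them, without ever needing to put them into a String first.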
|
 |
Paul Clapham
Bartender
Joined: Oct 14, 2005
Posts: 16483
|
|
Ravi Danum wrote: I have to store it into a String. I would have to load the bytes into a String.
Whoever gave you that requirement was wrong. Strings are not containers for arbitrary bytes, they are containers for text. Use a byte array if you want a sequence of arbitrary bytes.
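A minimal sketch of the byte-array approach, assuming a temporary file and hypothetical class and method names: it writes every value from 0x80 to 0xFF, reads the file back as bytes, and checks the range. It also shows what happens if the same bytes are pushed through a String (UTF-8 is used here only as an example charset).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class HighByteFile {
    // Build the 128 byte values 0x80..0xFF in order.
    static byte[] highBytes() {
        byte[] data = new byte[128];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (0x80 + i);
        }
        return data;
    }

    // Java bytes are signed, so mask with 0xFF before comparing.
    static boolean isHighByte(byte b) {
        return (b & 0xFF) >= 0x80;
    }

    // Write the bytes to a temporary file and read them back unchanged.
    static byte[] writeAndReadBack(byte[] data) {
        try {
            Path file = Files.createTempFile("high-bytes", ".bin");
            try {
                Files.write(file, data);
                return Files.readAllBytes(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Decode the bytes as UTF-8 text, encode them again, and compare with the input.
    static boolean survivesStringRoundTrip(byte[] data) {
        String asText = new String(data, StandardCharsets.UTF_8);
        return Arrays.equals(data, asText.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] readBack = writeAndReadBack(highBytes());
        int count = 0;
        for (byte b : readBack) {
            if (isHighByte(b)) count++;
        }
        System.out.println(count + " of " + readBack.length + " bytes are in 0x80-0xFF");
        System.out.println("Survives a UTF-8 String round trip: " + survivesStringRoundTrip(readBack));
    }
}
```

Read as bytes, all 128 values come back intact. The String round trip reports false, because the malformed sequences are replaced during decoding.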
|
 |
Jeff Verdegan
Bartender
Joined: Jan 03, 2004
Posts: 5892
|
|
Paul Clapham wrote:
Ravi Danum wrote: I have to store it into a String. I would have to load the bytes into a String.
Whoever gave you that requirement was wrong. Strings are not containers for arbitrary bytes, they are containers for text. Use a byte array if you want a sequence of arbitrary bytes.
Indeed. They might as well be asking to store the color red in a double or the taste of spoiled sushi in a boolean.
|