
How to write chars in range x80 - xFF to a text file and read with java

Ravi Danum
Ranch Hand

Joined: Jan 13, 2009
Posts: 104
Hello,

I want to write some invalid, invisible characters to a file so I can test my code.

For example, I want to write characters in the range x80 - xFF to a text file.

How can I do this? I want to be able to read the text with Java and determine that they are characters in the range x80 - xFF.

Thanks for any help.

-ravi
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 16483

Here's the official description of the characters in that range: http://www.unicode.org/charts/PDF/U0080.pdf. As you can see, most of them are ordinary printable characters, some are valid but invisible (the control characters in x80 - x9F), and only a few of those control characters have no official name or description.

Now when you say you want to write them to a "text file", exactly what do you mean? A "text file" is one which contains text, so it doesn't make any sense to write data into it which isn't text. And when you write text to a file, that requires converting the (16-bit) chars to 8-bit bytes. This is done by using one of the available encodings. Usually encodings convert characters which they can't deal with -- such as your "invalid invisible" characters -- to question marks, on the grounds that they aren't text.
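
To make the encoding step concrete, here is a minimal sketch of the same three characters encoded with three different charsets. The class name EncodeDemo, the helper toHex and the sample string are made up for illustration; String.getBytes(Charset) is the standard call and substitutes the charset's replacement byte (a question mark for US-ASCII) for anything it cannot encode.

```java
import java.nio.charset.StandardCharsets;

class EncodeDemo {
    // Formats a byte array as space-separated, two-digit uppercase hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // 'A', e-acute (U+00E9) and the NEL control character (U+0085)
        String text = "A" + (char) 0xE9 + (char) 0x85;

        // US-ASCII cannot represent the last two chars, so each becomes 0x3F ('?')
        System.out.println(toHex(text.getBytes(StandardCharsets.US_ASCII)));   // 41 3F 3F
        // ISO-8859-1 maps U+0000..U+00FF straight onto the bytes 0x00..0xFF
        System.out.println(toHex(text.getBytes(StandardCharsets.ISO_8859_1))); // 41 E9 85
        // UTF-8 needs two bytes for each char above U+007F
        System.out.println(toHex(text.getBytes(StandardCharsets.UTF_8)));      // 41 C3 A9 C2 85
    }
}
```

So the bytes that end up in the file depend entirely on which encoding the writer uses, and a reader has to use the same one.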

So there really doesn't seem to be any point in what you are trying to do. If you have a program which expects to work with text, then using non-text characters is a pointless sort of test. On the other hand if you aren't really processing text, but some binary format instead, then you should just work with bytes in your code instead of with characters.
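
If raw bytes are what you are after, something along these lines would put every value from x80 to xFF into a file and check them on the way back in, with no encoding involved. This is only a sketch: the class name HighBytesFile, its two helper methods and the temp-file name are hypothetical, and it assumes Java 7 or later for java.nio.file.Files.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class HighBytesFile {
    // Builds the 128 byte values 0x80..0xFF in ascending order.
    static byte[] highBytes() {
        byte[] data = new byte[128];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (0x80 + i);
        }
        return data;
    }

    // Counts the bytes whose unsigned value lies in 0x80..0xFF.
    static int countHighBytes(byte[] data) {
        int count = 0;
        for (byte b : data) {
            // Java bytes are signed, so mask to get the unsigned value
            if ((b & 0xFF) >= 0x80) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("high-bytes", ".bin");
        Files.write(file, highBytes());             // raw bytes, no charset applied
        byte[] readBack = Files.readAllBytes(file); // again raw bytes
        System.out.println(countHighBytes(readBack)); // 128
        Files.delete(file);
    }
}
```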

Ravi Danum
Ranch Hand

Joined: Jan 13, 2009
Posts: 104

Thank you so much for your reply.

I have to store it into a String. I would have to load the bytes into a String.

-ravi
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 5892

Ravi Danum wrote:
Thank you so much for your reply.

I have to store it into a String. I would have to load the bytes into a String.

-ravi


You can't, at least not reliably. Depending on the encoding used to decode them, some of the bytes in that range do not map to valid Unicode characters.
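
A rough way to see this for yourself is to decode a few of those bytes into a String and encode them straight back. The sketch below does that with two charsets; the class name RoundTrip and the method survives are made up for illustration. With ISO-8859-1 every byte happens to map to a char, while with UTF-8 these lone bytes are malformed and the String constructor silently substitutes U+FFFD for them.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RoundTrip {
    // True if decoding the bytes to a String and encoding it back
    // with the same charset reproduces the original bytes exactly.
    static boolean survives(byte[] original, Charset cs) {
        String decoded = new String(original, cs);
        return Arrays.equals(original, decoded.getBytes(cs));
    }

    public static void main(String[] args) {
        byte[] raw = { (byte) 0x80, (byte) 0x9F, (byte) 0xFF };

        System.out.println(survives(raw, StandardCharsets.ISO_8859_1)); // true
        System.out.println(survives(raw, StandardCharsets.UTF_8));      // false

        // Under UTF-8 the first byte was replaced by U+FFFD (decimal 65533)
        System.out.println((int) new String(raw, StandardCharsets.UTF_8).charAt(0)); // 65533
    }
}
```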
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 5892

Paul Clapham wrote:
So there really doesn't seem to be any point in what you are trying to do. If you have a program which expects to work with text, then using non-text characters is a pointless sort of test. On the other hand if you aren't really processing text, but some binary format instead, then you should just work with bytes in your code instead of with characters.


So you really need to clarify--to yourself first of all--what you're actually trying to do.

Does your code deal with text, and you need to test how it handles text that is outside the range it expects as input? Then don't worry about every byte from 0x80 to 0xFF. Worry about characters that are actual legal characters in the encodings you'll be dealing with and test some of them.

Does your code deal with text, and you need to test how it handles bytes that cannot be converted to text that it can handle? Then you don't care about chars, you care about bytes.
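
For that second case, one possible approach is to run the bytes through a strict decoder that reports bad input instead of quietly replacing it. This is a sketch under assumptions: the class name StrictDecode, the method isValid and the sample data are hypothetical, while CharsetDecoder and CodingErrorAction.REPORT are the standard java.nio.charset API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictDecode {
    // Returns true only if the whole array decodes cleanly in the given charset.
    static boolean isValid(byte[] data, Charset cs) {
        try {
            cs.newDecoder()
              .onMalformedInput(CodingErrorAction.REPORT)      // throw instead of replacing
              .onUnmappableCharacter(CodingErrorAction.REPORT)
              .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "cafe" with an e-acute, properly encoded as UTF-8
        byte[] good = ("caf" + (char) 0xE9).getBytes(StandardCharsets.UTF_8);
        // The same word with a single 0xE9 byte, as ISO-8859-1 would write it
        byte[] bad = { 'c', 'a', 'f', (byte) 0xE9 };

        System.out.println(isValid(good, StandardCharsets.UTF_8));     // true
        System.out.println(isValid(bad, StandardCharsets.UTF_8));      // false
        System.out.println(isValid(bad, StandardCharsets.ISO_8859_1)); // true
    }
}
```

Test data for this kind of check is just a byte array (or a file written from one), which is why chars never enter into it.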
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 16483

Ravi Danum wrote:
I have to store it into a String. I would have to load the bytes into a String.


Whoever gave you that requirement was wrong. Strings are not containers for arbitrary bytes, they are containers for text. Use a byte array if you want a sequence of arbitrary bytes.
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 5892

Paul Clapham wrote:
Ravi Danum wrote:
I have to store it into a String. I would have to load the bytes into a String.


Whoever gave you that requirement was wrong. Strings are not containers for arbitrary bytes, they are containers for text. Use a byte array if you want a sequence of arbitrary bytes.


Indeed. They might as well be asking to store the color red in a double or the taste of spoiled sushi in a boolean.
 