
How to write chars in range x80 - xFF to a text file and read with java

 
Ravi Danum
Ranch Hand
Hello,

I want to write some invalid, invisible characters to a file so I can test my code.

For example, I want to write characters in the range x80 - xFF to a text file.

How can I do this? I want to be able to read the text with Java and determine that they are characters in the range x80 - xFF.

Thanks for any help.

-ravi
 
Paul Clapham
Marshal
Here's the official description of the characters in that range: http://www.unicode.org/charts/PDF/U0080.pdf. As you can see, some of them are valid and visible, some are valid but invisible (control characters), and only a few have no official name of their own.

Now when you say you want to write them to a "text file", exactly what do you mean? A "text file" is one which contains text, so it doesn't make any sense to write data into it which isn't text. And when you write text to a file, that requires converting the (16-bit) chars to 8-bit bytes. This is done by using one of the available encodings. Encoders usually convert characters which they can't represent, which depending on the encoding may include your "invalid invisible" characters, to question marks, on the grounds that they aren't text.
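To make the char-to-byte conversion concrete, here is a minimal sketch. The class name EncodingDemo and the helper hex are made up for this illustration; the only library call it relies on is String.getBytes(Charset), which substitutes the charset's replacement byte for characters it cannot represent.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original thread.
class EncodingDemo {
    // Encode one char with the given charset and render the bytes as hex.
    static String hex(char c, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : String.valueOf(c).getBytes(cs)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        char c = '\u00E9'; // e with acute accent, inside the 0x80 - 0xFF range
        System.out.println("ISO-8859-1: " + hex(c, StandardCharsets.ISO_8859_1)); // E9
        System.out.println("UTF-8:      " + hex(c, StandardCharsets.UTF_8));      // C3 A9
        System.out.println("US-ASCII:   " + hex(c, StandardCharsets.US_ASCII));   // 3F, i.e. '?'
    }
}
```

The same char becomes one byte, two bytes, or a question mark depending on the encoding, so the encoding has to be chosen explicitly on both the writing and the reading side.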

So there really doesn't seem to be any point in what you are trying to do. If you have a program which expects to work with text, then using non-text characters is a pointless sort of test. On the other hand if you aren't really processing text, but some binary format instead, then you should just work with bytes in your code instead of with characters.
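If the data really is binary, a sketch along these lines writes every byte value from 0x80 to 0xFF to a file and checks the range after reading it back. The class and method names (HighBytesDemo, highBytes, countHigh) and the temp file prefix are assumptions for the example, not anything from the original code under test.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the original thread.
class HighBytesDemo {
    // Build the 128 byte values 0x80 through 0xFF.
    static byte[] highBytes() {
        byte[] data = new byte[0x80];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) (0x80 + i);
        }
        return data;
    }

    // Java bytes are signed, so mask with 0xFF before comparing.
    static int countHigh(byte[] data) {
        int count = 0;
        for (byte b : data) {
            int v = b & 0xFF;
            if (v >= 0x80) { // after masking, v is never above 0xFF
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("high-bytes", ".bin");
        try {
            Files.write(file, highBytes());          // no encoding involved
            byte[] read = Files.readAllBytes(file);
            System.out.println(countHigh(read) + " of " + read.length
                    + " bytes are in the range 0x80 - 0xFF");
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

No charset appears anywhere here, which is the point: bytes go out and come back unchanged.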

 
Ravi Danum
Ranch Hand
Posts: 165
1

Thank you so much for your reply.

I have to store it into a String. I would have to load the bytes into a String.

-ravi
 
Jeff Verdegan
Bartender
Posts: 6109
6

Ravi Danum wrote:
Thank you so much for your reply.

I have to store it into a String. I would have to load the bytes into a String.

-ravi



You can't, at least not in general. In many encodings (UTF-8 or windows-1252, for example), some of the bytes in that range do not map to valid Unicode characters.
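Whether a given byte maps to a character depends on the charset used to decode it. A rough sketch of the difference, assuming the JDK's standard UTF-8 and ISO-8859-1 charsets (the class name DecodeDemo and the helper isValidUtf8 are invented for this example): a strict UTF-8 decoder rejects these bytes on their own, while ISO-8859-1 maps every byte to the code point with the same value.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original thread.
class DecodeDemo {
    // Strict decode: report malformed input instead of silently replacing it.
    static boolean isValidUtf8(byte[] data) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] data = { (byte) 0x80, (byte) 0x9F, (byte) 0xFF };

        System.out.println("valid UTF-8? " + isValidUtf8(data)); // false

        // ISO-8859-1 decodes each byte to the char with the same numeric value.
        String s = new String(data, StandardCharsets.ISO_8859_1);
        for (char ch : s.toCharArray()) {
            System.out.println(String.format("U+%04X", (int) ch));
        }

        // Encoding with the same charset gives the original bytes back.
        byte[] back = s.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("round trip ok? " + Arrays.equals(data, back)); // true
    }
}
```

This only shows that the bytes survive a round trip through a String under ISO-8859-1. It does not make them meaningful text, and a byte array is still the more natural container for arbitrary bytes.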
 
Jeff Verdegan
Bartender
Posts: 6109
6

Paul Clapham wrote:
So there really doesn't seem to be any point in what you are trying to do. If you have a program which expects to work with text, then using non-text characters is a pointless sort of test. On the other hand if you aren't really processing text, but some binary format instead, then you should just work with bytes in your code instead of with characters.



So you really need to clarify--to yourself first of all--what you're actually trying to do.

Does your code deal with text, and you need to test how it handles text that is outside the range it expects as input? Then don't worry about every byte from 0x80 to 0xFF. Worry about characters that are actual legal characters in the encodings you'll be dealing with and test some of them.
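For the text case, one possible test fixture is sketched below. It writes a string containing the chars U+0080 through U+00FF to a temporary file with an explicit charset, reads it back with the same charset, and counts the chars in that range. Latin1RoundTrip, sample, countInRange and roundTrip are names made up for this sketch, and ISO-8859-1 is just one encoding in which all of these chars are representable.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the original thread.
class Latin1RoundTrip {
    // "abc" followed by every char from U+0080 to U+00FF.
    static String sample() {
        StringBuilder sb = new StringBuilder("abc");
        for (char c = 0x80; c <= 0xFF; c++) {
            sb.append(c);
        }
        return sb.toString();
    }

    // Count the chars whose value lies in 0x80 - 0xFF.
    static int countInRange(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch >= 0x80 && ch <= 0xFF) {
                count++;
            }
        }
        return count;
    }

    // Write the text to a temp file and read it back, using the same charset both ways.
    static String roundTrip(String text, Charset cs) {
        try {
            Path file = Files.createTempFile("latin1", ".txt");
            try {
                Files.write(file, text.getBytes(cs));
                return new String(Files.readAllBytes(file), cs);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String read = roundTrip(sample(), StandardCharsets.ISO_8859_1);
        System.out.println(countInRange(read) + " chars in the range 0x80 - 0xFF");
    }
}
```

Reading the file back with a different charset than the one used to write it would change the result, which may itself be a case worth testing.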

Does your code deal with text, and you need to test how it handles bytes that cannot be converted to text that it can handle? Then you don't care about chars, you care about bytes.
 
Paul Clapham
Marshal
Posts: 28193
95

Ravi Danum wrote:I have to store it into a String. I would have to load the bytes into a String.



Whoever gave you that requirement was wrong. Strings are not containers for arbitrary bytes, they are containers for text. Use a byte array if you want a sequence of arbitrary bytes.
 
Jeff Verdegan
Bartender
Posts: 6109
6

Paul Clapham wrote:

Ravi Danum wrote:I have to store it into a String. I would have to load the bytes into a String.



Whoever gave you that requirement was wrong. Strings are not containers for arbitrary bytes, they are containers for text. Use a byte array if you want a sequence of arbitrary bytes.



Indeed. They might as well be asking to store the color red in a double or the taste of spoiled sushi in a boolean.
 