
Load Unicode File Content into JTextArea

 
Daniel Palombo
Greenhorn
Posts: 10
Hello Ranchers,
I have a big problem with a little JTextArea.

When I set text like this in a JTextArea, everything looks fine:



But when I try to load this text from a file I get nonsense (square characters).

You may ask: how do you get this text into a file? I copy and paste it from the JTextArea. Then (in Windows XP) I create a new text document with the default Notepad. The Unicode characters show up fine there as well. When I save it, I get the choice of ANSI, Unicode, Unicode Big Endian or UTF-8. I choose Unicode.

To load the text into my JTextArea I do this:


Hmm, that's too simple I guess.
So, what might the magical lines be to get this right?

Thank you for your help!
Daniel Palombo
 
Randall Kippen
Greenhorn
Posts: 15
You should have something like:

InputStreamReader isr = new InputStreamReader(new FileInputStream(file), charset);

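where charset names the same encoding you chose in Notepad when saving.
If you don't specify a charset, Java uses the platform default, which on a Western Windows XP system is typically windows-1252 (close to ISO-8859-1).
The encoding used to write the file and the one used to read it then don't match, and the text turns into gibberish.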
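
As a rough illustration of the line above, here is a minimal sketch that reads a whole file with an explicit charset. The class name `CharsetFileReader` and the method `readFile` are made up for this example and are not from the original post.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

// Hypothetical helper class; the name is illustrative only.
class CharsetFileReader {

    // Reads the whole file as text, decoding its bytes with the named charset.
    static String readFile(File file, String charsetName) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), charsetName))) {
            char[] buffer = new char[4096];
            int count;
            // Copy decoded characters until the end of the stream is reached.
            while ((count = reader.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
        }
        return text.toString();
    }
}
```

With a JTextArea this would be used along the lines of `textArea.setText(CharsetFileReader.readFile(file, "UTF-16"))`. Notepad's "Unicode" option writes UTF-16 little-endian with a byte order mark, and Java's "UTF-16" charset reads that mark to pick the byte order, so "UTF-16" should be a reasonable choice for such a file.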
 
Daniel Palombo
Greenhorn
Posts: 10
Yeepyyy!
It works!

I begin to feel comfortable here

There is one thing that bothers me a little: it is me (or the hard-coded program) that has to know which charset the file content is in. It would be nicer if I could 'automagically' figure out the charset of the file and pass it to the InputStreamReader ... (because I don't really know whether it's in Unicode, Big Endian, Little Endian etc.)

Thank you very much for your help! Really!!

 
Randall Kippen
Greenhorn
Posts: 15

Originally posted by Daniel Palombo:
There is one thing that bothers me a little: it is me (or the hard-coded program) that has to know which charset the file content is in. It would be nicer if I could 'automagically' figure out the charset of the file and pass it to the InputStreamReader ... (because I don't really know whether it's in Unicode, Big Endian, Little Endian etc.)

Thank you very much for your help! Really!!


If you are using fairly standard UTF encodings, then I think the following code should work. However, there is nothing that I know of that will detect every encoding. Also, when detecting the encoding, you will probably want to use a PushbackInputStream.
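
The code from this post is not preserved in the thread as shown here. Purely as an illustration of the idea (look at the byte order mark, then push back whatever was not part of it), a minimal sketch could look like the following. The class `BomSniffer`, the method `openReader`, and the fallback parameter are assumptions made for this example, not the original poster's code, and only the UTF-8 and UTF-16 marks are checked.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;

// Hypothetical helper; names are illustrative only.
class BomSniffer {

    // Returns a Reader whose charset is chosen from the byte order mark (BOM).
    // If no known BOM is found, the given fallback charset is used instead.
    static Reader openReader(InputStream in, String fallbackCharset) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = 0;
        // Read up to three bytes; a single read() may return fewer.
        while (n < head.length) {
            int r = pushback.read(head, n, head.length - n);
            if (r == -1) {
                break;
            }
            n += r;
        }

        String charset = fallbackCharset;
        int bomLength = 0;
        if (n >= 3 && (head[0] & 0xFF) == 0xEF && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            charset = "UTF-8";      // EF BB BF
            bomLength = 3;
        } else if (n >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            charset = "UTF-16LE";   // FF FE (Notepad's "Unicode")
            bomLength = 2;
        } else if (n >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            charset = "UTF-16BE";   // FE FF (Notepad's "Unicode Big Endian")
            bomLength = 2;
        }

        // Push back the bytes that were read but are not part of the BOM,
        // so the reader starts at the first real character.
        if (n > bomLength) {
            pushback.unread(head, bomLength, n - bomLength);
        }
        return new InputStreamReader(pushback, charset);
    }
}
```

A file without a byte order mark (for example Notepad's "ANSI" option, or UTF-8 written by other tools) falls through to the fallback charset, so this is a heuristic for BOM-marked files rather than general encoding detection.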

 
Brian Cole
Author
Ranch Hand
Posts: 903
Originally posted by Daniel Palombo:
You may ask: how do you get this text into a file? I copy and paste it from the JTextArea. Then (in Windows XP) I create a new text document with the default Notepad. The Unicode characters show up fine there as well. When I save it, I get the choice of ANSI, Unicode, Unicode Big Endian or UTF-8. I choose Unicode.


You may have wanted to just choose UTF-8 here. Everything
else may have just worked, depending on the details.

see http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
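
One of the details that can matter with the UTF-8 route: Notepad puts a byte order mark at the start of UTF-8 files, and Java's UTF-8 decoder passes it through as the character U+FEFF instead of dropping it. A minimal sketch that reads a UTF-8 file and strips that leading character, assuming Java 7 or later for `java.nio.file` (the class name `Utf8FileReader` is made up for this example):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; the name is illustrative only.
class Utf8FileReader {

    // Reads the file as UTF-8 and drops a leading byte order mark, if any.
    static String readUtf8(Path path) throws IOException {
        String text = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        // The UTF-8 BOM (EF BB BF) decodes to the single character U+FEFF.
        if (text.startsWith("\uFEFF")) {
            text = text.substring(1);
        }
        return text;
    }
}
```

The result can then be handed to the text area with something like `textArea.setText(Utf8FileReader.readUtf8(path))`.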
 
Daniel Palombo
Greenhorn
Posts: 10
@Randall Kippen
Your code example is just what brings me nearer to paradise!

@Brian Cole
I think I have to deepen my knowledge about this topic ... Thank you for the good link

Thanks again