Working with the Hebrew alphabet in JTextArea

Chris Kimball
Ranch Hand

Joined: Apr 23, 2012
Posts: 32

On a Mac OS X system, my application takes input from a file, from the keyboard, or from text pasted into a JTextArea. The input is processed from the characters of the JTextArea's javax.swing.text.Document. Everything works fine until Hebrew characters (U+0590 to U+05FF in Unicode) are entered.

Pasted Unicode Hebrew (from Firefox) looks correct, but each character taken from the Document comes out as a single byte, 0x3F. When the JTextArea is filled from an input file, via a character stream, the Hebrew characters are replaced by nonsense. Typing on a Hebrew keyboard into the JTextArea produces the correct appearance, but the values returned from the Document are still short, 8-bit values.

I've read somewhere that some AWT components aren't really Unicode compatible, but that Swing components are.

Would you please point me towards a solution?


Chris Kimball

Paul Clapham

Joined: Oct 14, 2005
Posts: 19973

The '3F' character is a question mark. This suggests to me that at some point in the process your characters are undergoing an encoding to bytes, using a charset which doesn't support those Hebrew characters.
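
To illustrate the point above, here is a minimal sketch of how encoding Hebrew text to bytes can produce 0x3F. The class name, the helper `hex`, and the sample word are hypothetical and chosen only for this example; ISO-8859-1 stands in for whatever charset the real application ends up using, which the thread does not identify.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the name and the sample text are illustrative only.
final class QuestionMarkDemo {

    // Formats each byte as two uppercase hex digits, separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Shin, lamed, vav, final mem (U+05E9 U+05DC U+05D5 U+05DD).
        String hebrew = "\u05E9\u05DC\u05D5\u05DD";

        // ISO-8859-1 has no Hebrew letters, so each char becomes '?' (0x3F).
        System.out.println(hex(hebrew.getBytes(StandardCharsets.ISO_8859_1)));
        // prints "3F 3F 3F 3F"

        // UTF-16BE keeps the full 16-bit value of every char.
        System.out.println(hex(hebrew.getBytes(StandardCharsets.UTF_16BE)));
        // prints "05 E9 05 DC 05 D5 05 DD"

        // UTF-8 also round-trips, using two bytes per Hebrew letter.
        System.out.println(hex(hebrew.getBytes(StandardCharsets.UTF_8)));
        // prints "D7 A9 D7 9C D7 95 D7 9D"
    }
}
```

The no-argument `getBytes()` uses the platform default charset, so its result depends on the machine. Passing an explicit charset removes that dependency.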

I don't see any place in your description which suggests a conversion from chars to bytes, but then I can't exactly tell what the process is. I started to write a test program (although I don't have a Mac) but then I realized I didn't know exactly what you were doing with the Document. So could you post a small test program which demonstrates the problem?
Chris Kimball
Ranch Hand

Joined: Apr 23, 2012
Posts: 32

Thanks for your observation. I thought the '?' appeared because the system font didn't support Hebrew characters. My app shouldn't convert chars to bytes; HOWEVER, the analysis I gave foolishly used new String(char[] x).getBytes() rather than getBytes("UTF-16"). Once I used getBytes("UTF-16"), the Hebrew characters from the JTextArea were recognizable.

I'm now searching for a char to bytes operation in my software. Thanks for the tip that '?' is more than a font failure indication.
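
For the file-input case from the first post, one possible explanation is that the Reader decodes the file with a charset other than the one the file was written in. That is only an assumption, since the thread never shows the reading code. The sketch below uses a hypothetical class `DocumentCharsDemo`, an in-memory byte array in place of a real file, and ISO-8859-1 as a stand-in for the mismatched charset. It also prints the chars held by a `PlainDocument` directly, with no conversion to bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.PlainDocument;

// Hypothetical demo class; names and sample data are illustrative only.
final class DocumentCharsDemo {

    // Decodes the bytes with the given charset and loads the text into a Document.
    static Document load(byte[] fileBytes, Charset charset)
            throws IOException, BadLocationException {
        StringBuilder text = new StringBuilder();
        try (Reader in = new InputStreamReader(new ByteArrayInputStream(fileBytes), charset)) {
            int c;
            while ((c = in.read()) != -1) {
                text.append((char) c);
            }
        }
        Document doc = new PlainDocument();
        doc.insertString(0, text.toString(), null);
        return doc;
    }

    // Lists the UTF-16 code units held by the Document as 4-digit hex values.
    static String codeUnits(Document doc) throws BadLocationException {
        String text = doc.getText(0, doc.getLength());
        StringBuilder sb = new StringBuilder();
        for (char ch : text.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%04X", (int) ch));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a UTF-8 file containing shin and lamed.
        byte[] fileBytes = "\u05E9\u05DC".getBytes(StandardCharsets.UTF_8);

        // Matching charset: the Document holds the real Hebrew code units.
        System.out.println(codeUnits(load(fileBytes, StandardCharsets.UTF_8)));
        // prints "05E9 05DC"

        // Mismatched charset: each byte becomes its own unrelated char.
        System.out.println(codeUnits(load(fileBytes, StandardCharsets.ISO_8859_1)));
        // prints "00D7 00A9 00D7 009C"
    }
}
```

Inspecting the Document's chars as numbers, rather than through `getBytes()`, separates what the Document actually stores from what a later encoding step does to it.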
