
How to print Russian alphabet from console?

 
Siegfried Heintze
Ranch Hand
Posts: 403
The following Perl program works when I run it from the urxvt-X console under
Cygwin/X on Microsoft Windows XP:

LC_CTYPE=en_US.UTF-8 urxvt-X.exe&
perl -wle "binmode STDOUT, q[:utf8]; print chr() for 0x410 .. 0x430;"

This little one-liner prints the Russian alphabet in Cyrillic. With some
slight modification it will also print a lot of other alphabets,
including Hebrew, Chinese and Japanese.

It does not work with cmd.exe because apparently cmd.exe cannot deal with
UTF-8.

Can someone help me translate it into Groovy? I would not expect it to work
from cmd.exe with Groovy, but I am hopeful it will work with urxvt-X!

Thanks,
Siegfried
 
Matthew Taylor
Rancher
Posts: 110
Well, I can help you a bit. As for getting urxvt-X to print the characters to the console, you are on your own. What I can help with is creating a range of hex values for you to work with and passing each one to a command on the command line:



This 'hexRange' will get you a Groovy Range object containing the hex values you want. You can pass them to a command-line script by calling execute() on a GString that contains your command, as shown.
[ October 01, 2008: Message edited by: Matthew Taylor ]
 
Siegfried Heintze
Ranch Hand
Posts: 403
I don't think I follow your logic here. Are you spawning the console shell for each letter? I don't think that is what I want.

I need to know the name of the function that is the counterpart to Perl's "binmode", the one that tells the output codec that we want to emit UTF-8 (or UTF-16, I don't care), so that the Java (or Groovy) System.out object converts its 16-bit char values to UTF-8 (or, in the case of UTF-16, I believe no conversion is necessary).

I think that, by default, the codec in System.out takes the 16-bit char values and converts them to ASCII, since cmd.exe is brain damaged.
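
For what it's worth, the closest thing to "binmode" that I know of in plain Java is wrapping System.out in a java.io.PrintStream that is given an explicit encoding name. Here is a minimal sketch under that assumption; the class name Utf8Console and the helper utf8() are made up for illustration, and whether the glyphs actually show up still depends on the terminal and its font.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class Utf8Console {
    // Wrap any stream so that chars are encoded as UTF-8 on the way out.
    // (Hypothetical helper; the PrintStream constructor itself is standard.)
    static PrintStream utf8(OutputStream raw) {
        try {
            return new PrintStream(raw, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every JVM is required to support UTF-8, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8(System.out);
        // Same code point range as the Perl one-liner: U+0410 .. U+0430.
        for (int cp = 0x410; cp <= 0x430; cp++) {
            out.println((char) cp);
        }
    }
}
```

Since Groovy can call Java classes directly, the same PrintStream wrapping should carry over to a Groovy script.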

(Although I did discover the "chcp 1251" command for cmd.exe, which lets me print the Russian alphabet using a Python program. I have to remember to use the non-default Lucida Console font.)
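
For the "chcp 1251" route, the program presumably has to emit windows-1251 bytes rather than UTF-8. A small sketch of what that encoding step looks like in Java, assuming the JVM in use ships the windows-1251 charset (the class name Cp1251Bytes and the method encode() are invented for this example):

```java
import java.nio.charset.Charset;

class Cp1251Bytes {
    // Encode text as windows-1251, the code page that "chcp 1251" selects.
    static byte[] encode(String text) {
        return text.getBytes(Charset.forName("windows-1251"));
    }

    public static void main(String[] args) {
        // First three capital letters of the Cyrillic block: U+0410, U+0411, U+0412.
        StringBuilder hex = new StringBuilder();
        for (byte b : encode("\u0410\u0411\u0412")) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        // Each letter becomes a single byte in this code page.
        System.out.println(hex.toString().trim());
    }
}
```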

Perhaps this question is more appropriate for a Java forum?

Thanks!
Siegfried
 
Marc Peabody
pie sneak
Sheriff
Posts: 4727
Off the top of my head, I'm not sure how to get the System.out console in Java to print Russian characters.

You can always dump it to a Swing window very easily though. Try dropping the following code into a GroovyConsole window and executing it.
 
Anthony Nassar
Greenhorn
Posts: 1
To generate the Cyrillic alphabet:
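
One way to do this in plain Java, as a minimal sketch rather than a definitive version: build a string from a range of code points. The class name CyrillicAlphabet and the method range() are hypothetical. Note that U+0410..U+042F covers the 32 basic capital letters and U+0430..U+044F the lowercase ones; the letter Yo sits outside both ranges, at U+0401 and U+0451.

```java
class CyrillicAlphabet {
    // Concatenate every code point from first to last, inclusive.
    static String range(int first, int last) {
        StringBuilder sb = new StringBuilder();
        for (int cp = first; cp <= last; cp++) {
            sb.appendCodePoint(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String upper = range(0x410, 0x42F); // basic capital letters
        String lower = range(0x430, 0x44F); // basic small letters
        // What appears on screen depends on the console encoding and font.
        System.out.println(upper);
        System.out.println(lower);
    }
}
```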



Java (and therefore Groovy) does recognize the case of Cyrillic characters, so Character#isUpperCase() and isLowerCase() both work, as do /\p{Lu}/ and /\p{Ll}/. /\p{Cyrillic}/, unfortunately, does not work.
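
A quick way to try those checks, sketched in Java. The class name CyrillicChecks and the helper inCyrillicBlock() are made up for the example. The block form \p{InCyrillic} is what java.util.regex documents for Unicode blocks, so it may be a usable substitute for the \p{Cyrillic} spelling.

```java
class CyrillicChecks {
    // True when every char of s falls in the Unicode "Cyrillic" block (U+0400..U+04FF).
    static boolean inCyrillicBlock(String s) {
        return s.matches("\\p{InCyrillic}+");
    }

    public static void main(String[] args) {
        char upper = '\u0416'; // CYRILLIC CAPITAL LETTER ZHE
        char lower = '\u0436'; // CYRILLIC SMALL LETTER ZHE

        // Case checks through java.lang.Character.
        System.out.println(Character.isUpperCase(upper));
        System.out.println(Character.isLowerCase(lower));

        // The same checks through regex general categories.
        System.out.println(String.valueOf(upper).matches("\\p{Lu}"));
        System.out.println(String.valueOf(lower).matches("\\p{Ll}"));

        // Block membership via the documented In<Block> syntax.
        System.out.println(inCyrillicBlock("\u0416\u0436"));
    }
}
```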

The Groovy console has trouble with Cyrillic characters, so be careful; what you see might not be what you're getting.
 