How do I read a file containing non-english text?

Vasudhaiv Naresh
Ranch Hand

Joined: May 13, 2005
Posts: 57
Hi All,
I have two text files containing non-English text (Hindi, for instance), and I have to compare their contents. Can anybody help me with how to do this in Java?
Thanks,
Naresh
Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 42935
Why would comparing non-English text be any different than comparing English text? Java uses Unicode internally, so once the text is in memory, it's all the same anyway.
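To illustrate the point, here is a minimal sketch of comparing two files as text. It assumes you already know which encoding each file was saved in (Java does not detect it for you), and the class and method names `TextFileCompare`, `readText`, and `sameText` are made up for this example.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

class TextFileCompare {

    // Decode the whole file into a String using the given charset.
    // The charset is an assumption about how the file was saved.
    static String readText(Path file, Charset charset) throws IOException {
        return new String(Files.readAllBytes(file), charset);
    }

    // Compare the decoded text of two files. Each file may use a
    // different encoding; once decoded, both are ordinary Java Strings.
    static boolean sameText(Path a, Charset charsetA, Path b, Charset charsetB)
            throws IOException {
        return readText(a, charsetA).equals(readText(b, charsetB));
    }
}
```

A call might look like `TextFileCompare.sameText(Paths.get("a.txt"), StandardCharsets.UTF_8, Paths.get("b.txt"), StandardCharsets.UTF_16)`, where the file names are placeholders. This reads each file fully into memory, so it is only suitable for files of moderate size.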
Jesper de Jong
Java Cowboy
Saloon Keeper

Joined: Aug 16, 2005
Posts: 14435

The char data type in Java is a 16-bit Unicode (UTF-16) code unit. It can hold Hindi characters as well as English (Latin-1) characters. There should be no difference in handling these character sets.

How exactly do you need to compare the files? Do you just have to check if they are exactly the same or not? If that's the case, you don't need to worry about character encodings at all; you can just read the files byte by byte and compare the bytes.
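A byte-by-byte check along those lines could be sketched as follows. The class and method names `ByteFileCompare` and `sameBytes` are made up for this example, and it assumes both files can simply be opened and read from disk.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class ByteFileCompare {

    // Returns true only if both files contain exactly the same bytes.
    // No character decoding happens here, so the encoding is irrelevant.
    static boolean sameBytes(Path a, Path b) throws IOException {
        // Files of different length cannot be identical.
        if (Files.size(a) != Files.size(b)) {
            return false;
        }
        try (InputStream inA = new BufferedInputStream(Files.newInputStream(a));
             InputStream inB = new BufferedInputStream(Files.newInputStream(b))) {
            int byteA;
            while ((byteA = inA.read()) != -1) {
                if (byteA != inB.read()) {
                    return false; // first mismatch found
                }
            }
            // Both streams must be exhausted at the same point.
            return inB.read() == -1;
        }
    }
}
```

Note that this treats two files holding the same text in different encodings (say, UTF-8 and UTF-16) as different, because their bytes differ.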


Vasudhaiv Naresh
Ranch Hand

Joined: May 13, 2005
Posts: 57
Thanks, that helps me.
 