Different OS, same file, different results

Alex Armenteros
Ranch Hand

Joined: May 05, 2010
Posts: 73
I'm developing an XML file parser and I'm having problems with characters like Ñ or Á.

On Windows, it reads the file correctly.

On CentOS 5, the special characters are read wrong.


I've tried changing the encoding with UltraEdit, with no result.

Thanks in advance
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    

How are you determining that they're wrong? If you're looking at them in a text editor or console window, there's a decent chance that you've got the right character, but your viewer isn't displaying it correctly, either because it's not using the right encoding or because the font doesn't include a glyph for that character. Here's one way to know for sure what you're getting:
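
A minimal sketch of that kind of check, assuming a hypothetical helper named `codePoints` (the name and the sample string are illustrative, not from the thread). It prints the numeric value of each char, so the result does not depend on the console's encoding or font:

```java
class CharDump {
    // Hypothetical helper: returns the decimal value of each char in s,
    // separated by spaces. Values above 127 are where encoding problems show up.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append((int) s.charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // \u00D1 is Ñ, written as an escape so the encoding of this
        // source file cannot affect the test.
        String sample = "A\u00D1O";
        System.out.println(codePoints(sample)); // prints "65 209 79"
    }
}
```

Calling the same helper on a String right before it is written and again right after it is read back gives two lists of numbers that can be compared directly.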



Presumably you're writing and reading lines at a time in a loop, but the same thing still applies. Pick a couple of troublesome characters from the Strings, and print out their int values before writing and after reading. If you're not seeing the same values, there's a problem with your code, so post an SSCCE that shows what you're doing and your expected and actual output. If you do see the same values, then the problem is just with the viewer you're using.
Alex Armenteros
Ranch Hand

Joined: May 05, 2010
Posts: 73
Yeah, I forgot to mention that after parsing the XML file I insert the values into an Oracle DB.

With the process running on Windows, Ñ and accented characters are shown correctly in the DB;
running on CentOS, they are not.
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    

Everything I said still applies. You need to look at the int values of the characters to determine whether the data is getting corrupted or if it's just your viewer that's not displaying them correctly.
Alex Armenteros
Ranch Hand

Joined: May 05, 2010
Posts: 73
As you said, Mr. Verdegan, the bytes are different.


Windows

> 209

CentOS

< 195
< 145

This is a file reading issue, because if I do this:


In both cases 209 is printed.
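
Those two outputs match the two common encodings of Ñ (U+00D1): a single byte 209 in ISO-8859-1 or windows-1252, and the byte pair 195 145 in UTF-8. That suggests, though the thread does not confirm it, that some step relies on the platform default charset, which often differs between a Windows machine and a CentOS one. A small sketch that reproduces the numbers; `unsignedBytes` is a hypothetical helper:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodingBytes {
    // Hypothetical helper: encodes s with the given charset and returns
    // each byte as an unsigned value (0..255) for easy comparison.
    static int[] unsignedBytes(String s, Charset cs) {
        byte[] raw = s.getBytes(cs);
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = raw[i] & 0xFF; // Java bytes are signed; mask to 0..255
        }
        return out;
    }

    public static void main(String[] args) {
        String enye = "\u00D1"; // Ñ
        // prints [209]
        System.out.println(Arrays.toString(unsignedBytes(enye, StandardCharsets.ISO_8859_1)));
        // prints [195, 145]
        System.out.println(Arrays.toString(unsignedBytes(enye, StandardCharsets.UTF_8)));
        // The default is what no-argument calls such as getBytes() or
        // new FileReader(file) fall back to; it varies by platform.
        System.out.println(Charset.defaultCharset());
    }
}
```

Running the last line on both machines shows whether their default charsets differ.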
Alex Armenteros
Ranch Hand

Joined: May 05, 2010
Posts: 73
Finally I got it to work in the most absurd and dirty way I could think of...
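
For comparison, a less hacky route is usually to name the charset explicitly wherever bytes are turned into characters, instead of relying on the platform default. A minimal sketch, assuming the input is ISO-8859-1 (an assumption; the thread never states the file's actual encoding) and using a hypothetical `decode` helper:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ExplicitCharsetRead {
    // Hypothetical helper: decodes the first line of the given bytes using
    // an explicitly named charset, so the result is the same on every OS.
    static String decode(byte[] bytes, Charset cs) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(bytes), cs))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] latin1 = { (byte) 0xD1 };            // Ñ as stored in ISO-8859-1
        byte[] utf8 = { (byte) 0xC3, (byte) 0x91 }; // Ñ as stored in UTF-8

        // Both print 209 because each input is decoded with its own charset.
        System.out.println((int) decode(latin1, StandardCharsets.ISO_8859_1).charAt(0));
        System.out.println((int) decode(utf8, StandardCharsets.UTF_8).charAt(0));
    }
}
```

With a real file, the `ByteArrayInputStream` would be replaced by a `FileInputStream`. The charset passed in has to match how the file was actually saved, otherwise the same kind of corruption comes back.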



 