
Character Encoding and Crontab Issue

Nikhil Lanjewar
Greenhorn

Joined: Jan 21, 2009
Posts: 15
Hi,

I'm trying to develop an interceptor for my RSS Parser. The parser can read a few standard RSS feeds, but it fails on dirty RSS, e.g. feeds coming from an IIS environment where the character encoding is not declared explicitly. My Java code first fetches the RSS feed from a source URL and then fixes the dirty parts so that my RSS Parser can recognise the feed. It works well when run manually, either from Eclipse or from the shell. However, it produces different results when run as a cron job.

I found that the ISO charsets encode characters differently from UTF-8, so special characters are treated differently in each. My program encodes everything uniformly to UTF-8. But when the same program is run from a crontab, each special character is replaced by a question mark ("?").

I wonder what it is lacking when run as a cron job. Am I missing something? Is there an environment variable that needs to be set before running my Java code, or something like that?
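
For readers following along, the "?" symptom can be reproduced in isolation. The sketch below is not the poster's code; the class and method names are made up for illustration. It only shows that Java substitutes "?" when a string is encoded with a charset that cannot represent some of its characters, which is one plausible way the question marks could appear.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original program.
public class QuestionMarkDemo {

    // Encode the text to bytes with the given charset, then decode those
    // bytes with the same charset. Characters the charset cannot represent
    // are replaced during encoding by the charset's replacement byte.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String sample = "caf\u00e9"; // "cafe" with an acute accent on the e

        // UTF-8 can represent every character, so nothing is lost.
        System.out.println(sample.equals(roundTrip(sample, StandardCharsets.UTF_8)));

        // US-ASCII cannot represent U+00E9, so it comes back as '?'.
        System.out.println("caf?".equals(roundTrip(sample, StandardCharsets.US_ASCII)));
    }
}
```

The comparisons are printed as booleans rather than as raw text because the console itself may not be able to display the accented character.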
Martijn Verburg
author
Bartender

Joined: Jun 24, 2003
Posts: 3268

Hi Nikhil,

What environment does the crontab run in, e.g. which user, with what LANG settings, etc.? You may find it's running with a language/encoding that doesn't support UTF-8.
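
One way to check this from inside the JVM is to print what the process actually sees. This is a minimal sketch, assuming a hypothetical class named EncodingReport; it uses only standard calls (Charset.defaultCharset, System.getProperty, System.getenv). Running it once from the shell and once from the cron job should show whether the two environments differ.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic class, not part of the original program.
public class EncodingReport {

    // Collect the encoding-related settings visible to this JVM.
    // Environment variables that are not set are reported as "null".
    static String describe() {
        StringBuilder sb = new StringBuilder();
        sb.append("default charset = ").append(Charset.defaultCharset().name()).append('\n');
        sb.append("file.encoding   = ").append(System.getProperty("file.encoding")).append('\n');
        sb.append("LANG            = ").append(System.getenv("LANG")).append('\n');
        sb.append("LC_ALL          = ").append(System.getenv("LC_ALL")).append('\n');
        sb.append("user.name       = ").append(System.getProperty("user.name")).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Cron often starts jobs with a very small environment, so a LANG that is set in an interactive shell may show up as null here.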


Nikhil Lanjewar
Greenhorn

Joined: Jan 21, 2009
Posts: 15
Hi Martijn,

I'm running the code as an altogether different user from the one I used for testing it manually. This user has exclusive rights on the directories it needs to modify. Can you please elaborate on the LANG settings regarding language/encoding, and how to check them?
Nikhil Lanjewar
Greenhorn

Joined: Jan 21, 2009
Posts: 15
Bull's Eye!!

Zillion thanks to Martijn!
Setting the LANG environment variable to "en_US.UTF-8" worked like a charm... I had an intuition that some environment variable had not been set, but could not figure out the culprit. Thanks for pointing it out.
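
Setting LANG fixes the environment the job runs in. A complementary approach, which is not what was done in this thread, is to name the charset explicitly wherever the code converts between bytes and text, so the result no longer depends on the platform default. The sketch below assumes a hypothetical class ExplicitUtf8 and a temporary file standing in for the real feed output.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example class, not part of the original program.
public class ExplicitUtf8 {

    // Write text as UTF-8 regardless of the JVM's default charset.
    static void writeUtf8(Path file, String text) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            writer.write(text);
        }
    }

    // Read the whole file back, decoding the bytes as UTF-8.
    static String readUtf8(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    // Write the text to a temporary file and read it back.
    static String roundTripThroughTempFile(String text) {
        try {
            Path tmp = Files.createTempFile("feed", ".xml");
            try {
                writeUtf8(tmp, text);
                return readUtf8(tmp);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String title = "<title>caf\u00e9</title>";
        // Prints true whatever LANG is set to, because no default charset is involved.
        System.out.println(title.equals(roundTripThroughTempFile(title)));
    }
}
```

The same idea applies to any InputStreamReader, OutputStreamWriter, or String.getBytes call in the feed-fixing code: passing the charset explicitly avoids relying on LANG at all.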
Martijn Verburg
author
Bartender

Joined: Jun 24, 2003
Posts: 3268

No worries! I had similar issues when doing some Japanese FTP server integration; that's when I re-read Joel's article.
 