ANTLR failing parsing of java file on accented character

Maulin Vasavada
Ranch Hand

Joined: Nov 04, 2001
Posts: 1873
Hi all,

I am using ANTLR parser 2.7.5 and it fails when I have accented characters in the file.

e.g.


The error I get is,
parser exception: Test1.java:17:33: unexpected char: '''

If I remove the accented characters from the file, it parses fine, so I believe these characters are causing the problem.

Does anybody have any idea about how to resolve this issue?

I tried Google, but it has not been much help so far.

Regards
Maulin
Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24187

The "native2ascii" program in the JDK will translate 8-bit Java to 7-bit ASCII Java by turning those accented characters into \uxxxx escapes. You could run the code through that first.
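To illustrate the transformation described above, here is a minimal Java sketch of the same idea: every character outside 7-bit ASCII is replaced by its backslash-u escape. The class and method names (`AsciiEscaper`, `escape`) are made up for this example; this is not the native2ascii implementation, only an approximation of its effect on Java source text.

```java
// Hypothetical helper, not part of the JDK or ANTLR.
final class AsciiEscaper {

    private AsciiEscaper() {
    }

    /** Returns the text with every char above 0x7F written as a six-character escape. */
    static String escape(String source) {
        StringBuilder out = new StringBuilder(source.length());
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c < 0x80) {
                out.append(c); // plain ASCII passes through unchanged
            } else {
                // Backslash, 'u', then four lowercase hex digits of the UTF-16 code unit.
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The literal below contains an accented e (U+00E9).
        System.out.println(escape("String s = \"caf\u00e9\";"));
    }
}
```

Running `main` prints `String s = "caf\u00e9";`, which a Java compiler reads as the same string literal as the accented original.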


Maulin Vasavada
Ranch Hand

Joined: Nov 04, 2001
Posts: 1873
Hi Ernest

Thanks a lot. I will try it and let you know the outcome.

Regards
Maulin
Maulin Vasavada
Ranch Hand

Joined: Nov 04, 2001
Posts: 1873
Hi Ernest,

It works! Thanks.

Does this mean that developers should use Unicode escapes while writing code to avoid this issue (so that somebody else who wants to parse these files doesn't have to use native2ascii)?

Also, this solution is not really feasible for me: I process a great many files programmatically, and running native2ascii on each file before parsing would take too long (even though I would cache the converted files). I will need to see whether I can change something in the ANTLR-generated code instead.

Regards,
Maulin
[ July 05, 2005: Message edited by: Maulin Vasavada ]
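One way to avoid a separate native2ascii pass over every file, sketched here under assumptions, is to do the escaping in memory while the file is being read. The `AsciiEscapingReader` class below is hypothetical (it is not part of ANTLR or the JDK): it wraps any `java.io.Reader` and hands out backslash-u escapes in place of non-ASCII characters. It assumes the generated lexer can be constructed from a `Reader`, and that the underlying reader was opened with the file's actual encoding.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical wrapper: escapes non-ASCII characters on the fly.
final class AsciiEscapingReader extends FilterReader {

    // Remaining characters of an escape that has been started but not fully returned.
    private final StringBuilder pending = new StringBuilder();

    AsciiEscapingReader(Reader in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        if (pending.length() > 0) {
            char next = pending.charAt(0);
            pending.deleteCharAt(0);
            return next;
        }
        int c = super.read();
        if (c < 0x80) {
            return c; // end of stream (-1) or plain ASCII
        }
        String escape = String.format("\\u%04x", c);
        pending.append(escape, 1, escape.length()); // queue everything after the backslash
        return escape.charAt(0);
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int count = 0;
        while (count < len) {
            int c = read();
            if (c < 0) {
                break;
            }
            buf[off + count++] = (char) c;
        }
        return (count == 0 && len > 0) ? -1 : count;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = 0;
        while (skipped < n && read() >= 0) {
            skipped++;
        }
        return skipped;
    }

    @Override
    public boolean markSupported() {
        return false; // mark/reset would lose track of the pending escape
    }

    /** Demo helper: reads everything from the reader into a String. */
    static String drain(Reader in) {
        try {
            StringBuilder out = new StringBuilder();
            char[] buf = new char[4];
            int n;
            while ((n = in.read(buf, 0, buf.length)) != -1) {
                out.append(buf, 0, n);
            }
            return out.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Reader source = new StringReader("char c = '\u00e9';");
        // With a real grammar, the wrapped reader would be passed to the generated
        // lexer instead, e.g. new JavaLexer(new AsciiEscapingReader(fileReader)),
        // where JavaLexer is an assumed name for the generated class.
        System.out.println(drain(new AsciiEscapingReader(source)));
    }
}
```

Because each escaped character occupies six columns instead of one, column numbers in parser error messages will no longer match the original file on lines that contained accented characters.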