ANTLR failing parsing of java file on accented character

Maulin Vasavada
Ranch Hand

Joined: Nov 04, 2001
Posts: 1865
Hi all,

I am using the ANTLR 2.7.5 parser, and it fails when the Java file I am parsing contains accented characters.

e.g.


The error I get is,
parser exception: Test1.java:17:33: unexpected char: '''

If I remove the accented characters from the file, it parses fine, so I believe it's these characters that are causing the problem.

Does anybody have any idea about how to resolve this issue?

I tried Google, but it has not been much help so far.

Regards
Maulin


Ernest Friedman-Hill
author and iconoclast
Marshal

Joined: Jul 08, 2003
Posts: 24054
    
  13

The "native2ascii" program in the JDK will translate 8-bit Java to 7-bit ASCII Java by turning those accented characters into \uxxxx escapes. You could run the code through that first.
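To illustrate the transformation described above, here is a minimal sketch of the same idea in plain Java: every character outside 7-bit ASCII is replaced by a \uxxxx escape. The class and method names (`AsciiEscaper`, `escapeNonAscii`) are made up for this example, and it is only an approximation of what native2ascii does, not a reimplementation of the tool.

```java
public class AsciiEscaper {

    // Replaces every character outside 7-bit ASCII with a six-character
    // Unicode escape (backslash, 'u', four hex digits). ASCII is copied as-is.
    // Hypothetical helper written for illustration only.
    public static String escapeNonAscii(String source) {
        StringBuilder out = new StringBuilder(source.length());
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c < 128) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The accented e is written as an escape here so that this example
        // compiles regardless of the encoding the file is saved in.
        String line = "String name = \"caf\u00e9\";";
        System.out.println(escapeNonAscii(line));
    }
}
```

Because the Java compiler treats such an escape exactly like the character it stands for, the escaped source should compile to the same program as the original.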


Maulin Vasavada
Ranch Hand

Joined: Nov 04, 2001
Posts: 1865
Hi Ernest

Thanks a lot. I will try it and let you know the outcome.

Regards
Maulin
Maulin Vasavada
Ranch Hand

Joined: Nov 04, 2001
Posts: 1865
Hi Ernest,

It works! Thanks.

Does this mean that developers should write these Unicode escapes in their code to avoid this issue, so that somebody else who wants to parse the files doesn't have to run native2ascii first?

Also, this solution is not really feasible for me. I process a large number of files programmatically, and running native2ascii on each file before parsing would make processing take too long (I would cache the converted files, but still). I will need to see whether I can handle this in the ANTLR-generated code instead.

Regards,
Maulin
[ July 05, 2005: Message edited by: Maulin Vasavada ]
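One way to avoid a separate native2ascii pass per file, sketched under some assumptions: do the escaping in memory while the file is being read, and hand the resulting Reader to the lexer. The `EscapingReader` class below is hypothetical and written only for this illustration. It assumes the generated lexer can be constructed from a `java.io.Reader`, and that you know the encoding the source files were saved in.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical wrapper: passes ASCII through unchanged and expands every
// other character into a backslash-u escape as the text is read.
public class EscapingReader extends Reader {
    private final Reader in;
    private final StringBuilder pending = new StringBuilder(); // escape being emitted
    private int pos = 0;                                       // next index in pending

    public EscapingReader(Reader in) {
        this.in = in;
    }

    @Override
    public int read() throws IOException {
        if (pos < pending.length()) {
            return pending.charAt(pos++);      // still emitting an escape
        }
        int c = in.read();
        if (c < 128) {
            return c;                          // plain ASCII, or -1 at end of stream
        }
        pending.setLength(0);
        pending.append(String.format("\\u%04x", c));
        pos = 0;
        return pending.charAt(pos++);
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int n = 0;
        while (n < len) {
            int c = read();
            if (c == -1) {
                break;
            }
            buf[off + n++] = (char) c;
        }
        return n == 0 ? -1 : n;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    // Small helper for the demo: drains a reader into a String.
    public static String readAll(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        while ((n = r.read(buf, 0, buf.length)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // For a real file you would wrap something like
        //   new BufferedReader(new InputStreamReader(new FileInputStream(file), "ISO-8859-1"))
        // where the encoding name is an assumption about your sources, and
        // pass the EscapingReader to the generated lexer's constructor.
        Reader r = new EscapingReader(new StringReader("String s = \"na\u00efve\";"));
        System.out.println(readAll(r));
    }
}
```

One consequence to keep in mind: each escaped character becomes six characters, so column numbers in parser error messages would refer to the escaped text rather than to the original file.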