javacc - communicating from lexer to parser

Dipesh Khakhkhar

Joined: Feb 20, 2007
Posts: 1

I am trying out the following but am not able to print where exactly tokens are matching. I am using jjtree to create a grammar and parser file.

I want to set a String or (preferably) a StringBuffer in the lexer and use it later in
the parser. For example, my grammar file looks like this:

< DateTime: ("TIMESTAMP " ("'") ) <DATE> (" " <TIME> ? ("'")
| <DATE> (" " <TIME> ?


| < #DATE:
| <MEDMONTH> " " <SHORTDAY1> ", " <YEAR>
| <LONGDAY> ", " <LONGMONTH> " " <SHORTDAY2> ", " <YEAR>
| < #TIME:
| <HOURSHORT1> ":" <MINUTE> ":" <SECOND> " " <AMPM>
| <HOURSHORT1> ":" <MINUTE> ":" <SECOND> " " <AMPM>
< #YEAR:
> ["0"-"9"] ["0"-"9"] ["0"-"9"] ["0"-"9"]

| ["1"] ["0"-"2"]
{ sb.append("M"); System.out.println("Matching short month 1 "+sb.toString());}

| ["1"] ["0"-"2"]
{ sb.append("MM"); System.out.println("Matching short month 2 "+sb.toString()); }
"Jan" | "Feb" | "Mar" | "Apr" | "May" | "Jun" | "Jul" | "Aug" | "Sep" | "Oct" | "Nov" | "Dec"
{ sb.append("MMM"); System.out.println("Matching medium month "+sb.toString()); }
"January" | "February" | "March" | "April" | "May" | "June" | "July" | "August" | "September" | "October" | "November" | "December"
{ sb.append("MMMM"); System.out.println("Matching long month "+sb.toString()); }

and so on. (keeping it short here)

As I have shown above, I would like to store the pattern which is matching. Later I want to use this in parser.

There are a few problems here.
1) First, nothing is printed by the print statements I have added, although I can see those statements in the switch-case code of the generated file. Am I doing something wrong while printing?

2) How do I pass something that I set in the lexer to the parser? I need both the input string and the pattern it matches so that I can create a Date object directly.

3) Even if I succeed with what I have tried above, I am not able to get the date separators. And when I tried to save something in the StringBuffer (sb) inside the internal tokens (such as < #DATE >), it was not allowed.

I hope I am able to explain my query here.

Can somebody please tell me where I have to set my variables so that I can use
them in the parser to find the matched token?

Any help will be appreciated.

Thank you.
Campbell Ritchie

Joined: Oct 13, 2005
Posts: 36478
Welcome to the Ranch.

Please use the code button to stop this-> ; ) changing to a smiley.

I have done a little bit of lex-ing and parsing.
What you have written out is not a grammar in BNF (Backus-Naur Form). You can't parse "January" | "February". That is the job of the lexer.

The older lexers worked in C and had a return value which represented the type of token you have just analysed. So MONTH_NAME might be one. They also have a global variable which is a union, either between a primitive type (eg an int) or a pointer which represents a String token (char *).
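
As a rough illustration of that older convention, here is a minimal sketch, written in Java rather than C to match the rest of this thread. Everything in it is made up for the example: the class OldStyleLexer, the token codes MONTH_NAME, NUMBER and EOF, and the field yylval, which stands in for the C union. It only shows the idea of "return an int type, leave the value in a shared variable".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: the scanner returns only an int token type, and the
// semantic value travels through a shared field (the role a union plays in C).
final class OldStyleLexer {
    static final int EOF = 0, MONTH_NAME = 1, NUMBER = 2;

    // Stand-in for the C union: holds a String or an Integer depending on the token.
    Object yylval;

    private static final Pattern TOKEN = Pattern.compile(
        "\\s*(?:(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)|(\\d+))");

    private final Matcher matcher;
    private final int end;
    private int pos = 0;

    OldStyleLexer(String input) {
        this.matcher = TOKEN.matcher(input);
        this.end = input.length();
    }

    // Returns the token type; the caller then reads yylval for the value.
    int nextToken() {
        matcher.region(pos, end);
        if (!matcher.lookingAt()) {
            yylval = null;
            return EOF; // end of input, or text this sketch does not recognise
        }
        pos = matcher.end();
        if (matcher.group(1) != null) {
            yylval = matcher.group(1);              // String value
            return MONTH_NAME;
        }
        yylval = Integer.valueOf(matcher.group(2)); // Integer value
        return NUMBER;
    }
}
```

The caller has to know from the returned type whether yylval currently holds a String or an Integer, which is the weakness the object-based approach avoids.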

The object-oriented lexers written in Java take a different approach, returning an object of class Symbol with four values:-
  • type: an int which represents the terminal type of the current token.
  • value: an Object representing the semantic content of the token (e.g. new Integer(Integer.parseInt(yytext())) or new String(yytext()))
  • left: an int representing the line number (I think)
  • right: an int representing (I think) column number.
Note that left and right DO NOT mean the same as they would in TAC (three-address code).
In the event of something like + you would . . . You have to cast the value back when it reaches the parser.
You will find lots more details in this well-written manual for a Lexer, JFlex.
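
To make the casting step concrete, here is a small sketch of such a token object and of the parser side casting the value back. It is only modelled loosely on the four fields listed above: the names SimpleSymbol, MonthValue and ParserSide are invented for the example and are not the class from any particular library. The value here carries both the matched text and the date pattern it matched, which is the pairing asked about in the original question.

```java
// Hypothetical sketch of a lexer-to-parser token object; class and field
// names are illustrative, not taken from a specific library.
final class SimpleSymbol {
    static final int MONTH_NAME = 1, NUMBER = 2;

    final int type;     // terminal type of the token
    final Object value; // semantic content, cast back by the parser
    final int left;     // position information (line number in this sketch)
    final int right;    // position information (column number in this sketch)

    SimpleSymbol(int type, Object value, int left, int right) {
        this.type = type;
        this.value = value;
        this.left = left;
        this.right = right;
    }
}

// Value object carrying the matched text together with the pattern it matched.
final class MonthValue {
    final String text;    // e.g. "January"
    final String pattern; // e.g. "MMMM"

    MonthValue(String text, String pattern) {
        this.text = text;
        this.pattern = pattern;
    }
}

final class ParserSide {
    // The token type tells the parser which cast is safe for the value.
    static String describe(SimpleSymbol s) {
        switch (s.type) {
            case SimpleSymbol.MONTH_NAME: {
                MonthValue m = (MonthValue) s.value;
                return m.text + " matched " + m.pattern;
            }
            case SimpleSymbol.NUMBER: {
                Integer n = (Integer) s.value;
                return "number " + n;
            }
            default:
                throw new IllegalArgumentException("unknown token type " + s.type);
        }
    }
}
```

A lexer action would build something like new SimpleSymbol(SimpleSymbol.MONTH_NAME, new MonthValue("January", "MMMM"), line, column), and the parser would recover both pieces through the cast.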
I hope this is of some help.

