File APIs for Java Developers
Manipulate DOC, XLS, PPT, PDF and many others from your application.
The moose likes Java in General and the fly likes Using regex in a Recursive Descent Parser Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login

Win a copy of Android Security Essentials Live Lessons this week in the Android forum!
JavaRanch » Java Forums » Java » Java in General
Bookmark "Using regex in a Recursive Descent Parser" Watch "Using regex in a Recursive Descent Parser" New topic

Using regex in a Recursive Descent Parser

Ryan Carter-Reich

Joined: Jun 16, 2010
Posts: 3
First post.

Anyway, I've got the following grammar I'm trying to write an RDP for:

<prog> -> ["prog" [id] ][ "(" {id = (id | number) ")"] "{"<sl>"}"
<sl> -> <s> {";" <s> } ";"
<s> -> id = <expr> | <expr>
<expr> -> <term> {("+" |"-") <term>}
<term> -> <pow> {("*"|"/") <pow>}
<pow> -> <fact> {"^" <fact>}
<fact> -> "(" <expr>")" | id | number | trig "(" expr ")"

I posted this on DevShed yesterday, but it doesn't look like anyone knew what to do.

The grammar's EBNF notation. "trig" is a terminal with lexemes "sin" "cos" or "tan". "id" is a terminal that's any character followed my any number of numbers or characters, and number is any signed number.

Here's my progress on this so far:

<expr>, <term>, <pow> are all completed methods, since they're basically the same thing. Ex:

Driver.nextToken() pulls the next token, and Driver.match compares it to what's expected.

I'm having a plethora of issues with this one, so I'll just list them, and any help is greatly appreciated:

With <prog>, here's what I have not:

p() in this one parses and checks for the string "prog"

In prog I don't know how to look that sequence of ids. I don't know whether the loop should be in the switch or out of it or I should just make a new method and loop it, or just recurse. Also, should I recurse after I call id in the 'p' case?
In <sl> (not posted because there's nothing to do):
How should I determine whether a semi-colon is the last in the line or if there's an <s> after it?
And finally, in <fact> how do I differentiate between an id taht starts with a trig token and a plan trig token?

I know regex goes in here somewhere. I'm already going to use it to see if the next character is an number, then loop in a number method, etc.

I know that's a lot of questions, but thanks in advance to any help.
David Newton

Joined: Sep 29, 2008
Posts: 12617

I'm not sure how a regex is involved; how are you doing your lookahead?
Campbell Ritchie

Joined: Oct 13, 2005
Posts: 38075
What does the nextToken() method return?
Ryan Carter-Reich

Joined: Jun 16, 2010
Posts: 3
David Newton wrote:I'm not sure how a regex is involved; how are you doing your lookahead?

The only way I implement it at the moment is in <fact>, it looks while the next character matches the pattern for a digit (though this could have been done without regex). I'm not sure where else to put it, but I was told that it might be helpful.

And apologies about nextToken(); it was late and I wasn't thinking. In my original file, I input the text, then scan it, removing comments, etc, and then parse each character into an ArrayList. getNextToken() updates the nextToken by getting the first character from that ArrayList. Right now I have it removing that index, but I think I might need to just have it iterate over the list.
subject: Using regex in a Recursive Descent Parser
Similar Threads
urgent help--- fileformatting
Numeric value convert to word format
JavaCC trace code...