
Help Improve Regex Parsing

Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
A while back I posted a problem with recursive syntax and regex. I couldn't do what I really wanted so I'm doing what feels like a bit of a hack. I got regex to find the innermost macro call in a nested expression. In a loop I replace the innermost with the results of the macro call and search again. Next time I find the new innermost call.

Here's an example of nested macros. The page macro returns information about a page, such as its name, size and date. The linksto macro returns a list of reverse links.

${linksto ${page name}}

This works, but I'm keen to know if somebody sees a better way. I can post more code if you really want to exercise this stuff but there's a bit of a framework growing under it.
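For readers following along, the loop described above might look roughly like the sketch below. It is not the code from the original post: the class name, the pattern, and the callMacro stub (which returns made-up values for page and linksto) are all assumptions for illustration. The pattern matches a ${...} call whose body contains no '$', '{' or '}', which makes it an innermost call.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class InnermostMacroExpander {
    // A ${...} call whose body has no '$', '{' or '}' cannot contain another call,
    // so any match is an innermost call.
    private static final Pattern INNERMOST = Pattern.compile("\\$\\{([^${}]*)\\}");

    // Hypothetical stand-in for the real macro dispatch; the return values are invented.
    static String callMacro(String call) {
        String[] parts = call.trim().split("\\s+", 2);
        String name = parts[0];
        String args = parts.length > 1 ? parts[1] : "";
        if (name.equals("page")) {
            return "HomePage";
        }
        if (name.equals("linksto")) {
            return "[links to " + args + "]";
        }
        return "[unknown macro " + name + "]";
    }

    // Replace the first innermost call with its result, then search the new text again.
    // Note: if a macro result itself contains ${...}, it is expanded too, so a macro
    // that keeps returning macro calls would loop forever.
    static String expand(String input) {
        String text = input;
        Matcher m = INNERMOST.matcher(text);
        while (m.find()) {
            text = text.substring(0, m.start())
                    + callMacro(m.group(1))
                    + text.substring(m.end());
            m = INNERMOST.matcher(text);
        }
        return text;
    }

    public static void main(String[] args) {
        // prints "[links to HomePage]" with the stub macros above
        System.out.println(expand("${linksto ${page name}}"));
    }
}
```

On the first pass only ${page name} matches, because the outer call still contains a '$'. Once it has been replaced, the outer call becomes the innermost one and is expanded on the next pass.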

A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
David Harkness
Ranch Hand

Joined: Aug 07, 2003
Posts: 1646
If doing the replacements becomes a bottleneck, you could write a fairly simple and efficient parser to do the same thing, processing the String into a StringBuffer top-down.

With the iterative regex solution, each replacement builds a complete modified copy of the input String: 10 replacements in a 1k String require roughly 10k of new, albeit temporary, Strings. If, however, you will only be doing a few replacements in small Strings, it may not be worth the bother.

It would be a good exercise though.
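As a rough idea of what such a parser could look like, here is a minimal recursive-descent sketch. It is one possible reading of the suggestion, not a definitive implementation: the class name and the callMacro stub (with invented return values) are assumptions, and it uses StringBuilder, the unsynchronized counterpart of StringBuffer. The input is scanned once, left to right, and each nested call is collected into its own small buffer instead of copying the whole String per replacement.

```java
final class TopDownMacroParser {
    private final String src;
    private int pos = 0;

    private TopDownMacroParser(String src) {
        this.src = src;
    }

    // Hypothetical stand-in for the real macro dispatch; the return values are invented.
    static String callMacro(String call) {
        String[] parts = call.trim().split("\\s+", 2);
        String name = parts[0];
        String args = parts.length > 1 ? parts[1] : "";
        if (name.equals("page")) {
            return "HomePage";
        }
        if (name.equals("linksto")) {
            return "[links to " + args + "]";
        }
        return "[unknown macro " + name + "]";
    }

    static String expand(String input) {
        TopDownMacroParser parser = new TopDownMacroParser(input);
        StringBuilder out = new StringBuilder(input.length());
        parser.parseText(out, false);
        return out.toString();
    }

    // Copies characters into out, expanding each ${...} as it is met.
    // When insideMacro is true, returns after consuming the closing '}'.
    private void parseText(StringBuilder out, boolean insideMacro) {
        while (pos < src.length()) {
            char c = src.charAt(pos);
            if (c == '$' && pos + 1 < src.length() && src.charAt(pos + 1) == '{') {
                pos += 2;                       // skip "${"
                StringBuilder call = new StringBuilder();
                parseText(call, true);          // nested calls are expanded into 'call'
                out.append(callMacro(call.toString()));
            } else if (c == '}' && insideMacro) {
                pos++;                          // consume the closing brace
                return;
            } else {
                out.append(c);
                pos++;
            }
        }
        if (insideMacro) {
            throw new IllegalArgumentException("unterminated macro call");
        }
    }

    public static void main(String[] args) {
        // prints "[links to HomePage]" with the stub macros above
        System.out.println(expand("${linksto ${page name}}"));
    }
}
```

One behavioral difference from the regex loop is worth noting: here a macro's result is appended as-is and never rescanned, so any ${...} in the returned text is left unexpanded. This sketch also rejects an unterminated call with an exception, where the regex loop would simply leave it in place.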
Ilja Preuss

Joined: Jul 11, 2001
Posts: 14112
Stan, that seems to be a good simple solution to me. If it doesn't show up as a performance problem, I'd simply keep it that way. (If someone has a better idea, I'd be interested too, of course!)

The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus