File APIs for Java Developers
Manipulate DOC, XLS, PPT, PDF and many others from your application.
The moose likes Java in General and the fly likes Regex Query Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Java » Java in General
Bookmark "Regex Query" Watch "Regex Query" New topic

Regex Query

Dave Hewy
Ranch Hand

Joined: Aug 21, 2003
Posts: 93
I'm trying to parse a set of SQL statements, which have already have comments etc stripped off. I was using a really simple regex just to split each statment by the semicolon I know is between each one, but some more complex SQL contain semicolons as part of Oracle TRANSLATE operations, these are in single quotes, but how can i make my expression ignore only these?

Dummy example...

So this should provide 3 matches and ignore the semicolon embedded in the second SQL statement.


Stan James
(instanceof Sidekick)
Ranch Hand

Joined: Jan 29, 2003
Posts: 8791
If you always have newline after the semicolon as shown, you can split on ;$ instead of ;. You have to compile pattern with an option MULTILINE.

Otherwise I'd try some pattern that allows '.*;.*' before the terminating ;

Do you have an interactive RegEx tool? There's one that plugs into Eclipse, and I have used RegExCoach a couple times. That kind of thing is helpful to evolve expressions without a whole edit-compile-test cycle.

A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Dave Hewy
Ranch Hand

Joined: Aug 21, 2003
Posts: 93
thanks for your reply, I can't guarantee that ; will always be followed by a line break so cannot use that as the trigger to split the data.

I do have the Eclipse plugin, so I'll have to try a few things out with that.

Thanks anyway

Jim Yingst

Joined: Jan 30, 2000
Posts: 18671
Let's see... Finding a quoted sting is a bit complex, because a SQL string literal can contain two consecutive quotes, as an escape mechanism for including a quote within the literal (the SQL equivalent of Java's '\'.) Here's a regex to find a single-quoted string, using Pattern.COMMENTS to allow comments and whitespace, for readability:

This uses posessive matching (the ++ and *+), a java.util.regex feature not (yet) available in other languages. It reduces to:

Now we can use this pattern to build a larger pattern for the whole statement:


Hope that helps...
[ December 02, 2004: Message edited by: Jim Yingst ]

"I'm not back." - Bill Harding, Twister
Dave Hewy
Ranch Hand

Joined: Aug 21, 2003
Posts: 93

that's exactly what I want, it works like a charm.

Many thanks

I agree. Here's the link:
subject: Regex Query
It's not a secret anymore!