Regex Query

 
Dave Hewy
Ranch Hand
Posts: 93
Hi,
I'm trying to parse a set of SQL statements which have already had comments etc. stripped out. I was using a really simple regex to split the statements on the semicolon I know sits between each one, but some of the more complex SQL contains semicolons as part of Oracle TRANSLATE operations. These semicolons are inside single quotes. How can I make my expression ignore only these?

Dummy example...



So this should provide 3 matches and ignore the semicolon embedded in the second SQL statement.

Thanks

Dave.
 
Stan James
(instanceof Sidekick)
Ranch Hand
Posts: 8791
If you always have a newline after the semicolon as shown, you can split on ;$ instead of ;. You have to compile the pattern with the MULTILINE option so that $ matches at the end of each line rather than only at the end of the input.

Otherwise I'd try some pattern that allows '.*;.*' before the terminating ;

Do you have an interactive RegEx tool? There's one that plugs into Eclipse, and I have used RegExCoach a couple times. That kind of thing is helpful to evolve expressions without a whole edit-compile-test cycle.
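
As a rough illustration of the ;$ suggestion above, here is a minimal sketch. The class name, method name and sample SQL are made up for the example, and it only works under the stated assumption that every terminating semicolon is the last character on its line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: splits a script on semicolons that end a line.
class LineEndSplitter {
    // With MULTILINE, '$' matches before each line terminator and at end of input.
    private static final Pattern STATEMENT_END =
            Pattern.compile(";$", Pattern.MULTILINE);

    static List<String> split(String script) {
        List<String> statements = new ArrayList<>();
        for (String piece : STATEMENT_END.split(script)) {
            String trimmed = piece.trim();   // drop the newline left at the start of each piece
            if (!trimmed.isEmpty()) {
                statements.add(trimmed);
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        // Made-up sample input; the quoted ';' is not at the end of a line, so it is not a split point.
        String script = "SELECT 1 FROM dual;\n"
                + "SELECT TRANSLATE(col, ';', ',') FROM t;\n"
                + "DELETE FROM t;";
        for (String s : split(script)) {
            System.out.println("[" + s + "]");
        }
    }
}
```

A semicolon inside quotes that happens to fall at the end of a line would still be treated as a split point, so this only helps when the line-break assumption really holds.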
 
Dave Hewy
Ranch Hand
Posts: 93
Thanks for your reply. I can't guarantee that a ; will always be followed by a line break, so I cannot use that as the trigger to split the data.

I do have the Eclipse plugin, so I'll have to try a few things out with that.

Thanks anyway

Dave.
 
Jim Yingst
Wanderer
Sheriff
Posts: 18671
Let's see... Finding a quoted string is a bit complex, because a SQL string literal can contain two consecutive quotes as an escape mechanism for including a quote within the literal (the SQL equivalent of Java's \' escape). Here's a regex to find a single-quoted string, using Pattern.COMMENTS to allow comments and whitespace, for readability:

This uses possessive matching (the ++ and *+), a java.util.regex feature not (yet) available in other languages. It reduces to:

Now we can use this pattern to build a larger pattern for the whole statement:

or:

Hope that helps...
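
For reference, a minimal sketch of the approach described above: a pattern for a single-quoted SQL literal (with '' as the embedded-quote escape) nested inside a larger pattern for a whole statement, written with Pattern.COMMENTS and possessive quantifiers. The class name, method name and sample SQL are made up for the example, and the pattern is an illustration of the idea rather than necessarily the exact expression from this post. It assumes comments have already been stripped and that every quote is properly closed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: finds statements terminated by ';', ignoring ';' inside quotes.
class SqlStatementSplitter {
    // Pattern.COMMENTS lets the regex carry whitespace and # comments.
    private static final Pattern STATEMENT = Pattern.compile(
            "(                         # group 1: the statement text\n"
          + "  (?:\n"
          + "      [^';]++             # a run with no quote and no semicolon\n"
          + "    | '(?:[^']++|'')*+'   # or a whole quoted literal, where two quotes escape one\n"
          + "  )++\n"
          + ")\n"
          + ";                         # the terminating semicolon\n",
            Pattern.COMMENTS);

    static List<String> split(String script) {
        List<String> statements = new ArrayList<>();
        Matcher m = STATEMENT.matcher(script);
        while (m.find()) {
            String body = m.group(1).trim();   // statement without its trailing ';'
            if (!body.isEmpty()) {
                statements.add(body);
            }
        }
        return statements;
    }

    public static void main(String[] args) {
        // Made-up sample input with semicolons and a doubled quote inside literals.
        String script = "SELECT 1 FROM dual; "
                + "SELECT TRANSLATE(col, ';', ',') FROM t; "
                + "UPDATE t SET c = 'it''s;ok';";
        for (String s : split(script)) {
            System.out.println("[" + s + "]");
        }
    }
}
```

Because the outer alternation never consumes a bare semicolon or a bare quote, the possessive quantifiers do not change what is matched; they just stop the engine from backtracking into text that has already been classified.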
[ December 02, 2004: Message edited by: Jim Yingst ]
 
Dave Hewy
Ranch Hand
Posts: 93
Jim,

That's exactly what I want; it works like a charm.

Many thanks

Dave.
 