
regex between boundary tags

 
Charles Knell
Greenhorn
Posts: 25
I have a text file captured from a PL/SQL query. It contains 482 XML documents with identical structures. I'd like to merge these into a single document. The root element tag of each document is "<ROWSET>". I'm looking for a regular expression that will match "<ROWSET>[any number of characters here]</ROWSET>". The expression should match each of the 482 documents and not grab all the text between the opening <ROWSET> of the first document and closing </ROWSET> of the 482nd document.

Reading and muddling has not so far produced the results I want, so I'm asking for your help.

Thanks.
 
Alan Moore
Ranch Hand
Posts: 262
The simple answer is: If the documents are very large, you might find that regex to be too slow. Here's a faster version:
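For readers following along, here is a minimal sketch of the two approaches, assuming Python's `re` module (the thread does not say which regex flavor is in use). Both patterns and the names `LAZY`, `UNROLLED`, and `split_documents` are illustrative assumptions, not the expressions from the reply above. The first pattern uses a lazy quantifier so each match stops at the nearest `</ROWSET>`; the second avoids the dot entirely by consuming runs of non-`<` characters, which usually means less work per character on large inputs. Neither handles a `<ROWSET>` nested inside another `<ROWSET>`.

```python
import re

# Lazy version: DOTALL lets "." cross line breaks, and ".*?" stops at the
# first closing tag instead of running on to the last one in the file.
LAZY = re.compile(r"<ROWSET>.*?</ROWSET>", re.DOTALL)

# Unrolled version: consume runs of non-"<" characters, and step over a "<"
# only when it does not begin the closing </ROWSET> tag.
UNROLLED = re.compile(r"<ROWSET>[^<]*(?:<(?!/ROWSET>)[^<]*)*</ROWSET>")


def split_documents(text, pattern=UNROLLED):
    """Return each <ROWSET>...</ROWSET> block found in text, in order."""
    return pattern.findall(text)


# Hypothetical two-document sample standing in for the 482-document capture.
sample = (
    "<ROWSET>\n<ROW><ID>1</ID></ROW>\n</ROWSET>\n"
    "<ROWSET>\n<ROW><ID>2</ID></ROW>\n</ROWSET>\n"
)

docs = split_documents(sample)
print(len(docs))
```

A greedy `.*` in place of `.*?` would return a single match spanning from the first `<ROWSET>` to the last `</ROWSET>`, which is the behavior the original question is trying to avoid.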
 