I need to write a Java method that removes multiple occurrences of a node (and its contents) from an XML document supplied as a String.
Here's a sample of the XML
I need to remove all occurrences of the element "OLifEExtension" and its contents. I've written a fairly simple method, given below. It works, but it is very inefficient and takes a long time if the XML is large (>= 10 MB).
I've also tried regular expressions but can't figure out one that works. I've tried the following:
None of the above regular expressions works. Instead of matching just the first "OLifEExtension" element, each one matches everything between the first opening "OLifEExtension" tag and the last closing "OLifEExtension" tag.
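For anyone reading later: matching from the first opening tag to the last closing tag is what a greedy quantifier such as `.*` typically does. Below is a minimal sketch of a reluctant (`.*?`) pattern in Java. It assumes that "OLifEExtension" elements are never nested inside each other and never appear in self-closing form. The class name, method name, sample XML, and the `VendorCode` attribute are made up for illustration and are not taken from the original post.

```java
import java.util.regex.Pattern;

class RemoveExtensionRegex {
    // (?s) lets '.' match line breaks as well.
    // '.*?' is reluctant: each match stops at the NEAREST closing tag,
    // not at the last closing tag in the document.
    private static final Pattern EXTENSION =
            Pattern.compile("(?s)<OLifEExtension\\b[^>]*>.*?</OLifEExtension>");

    // Returns the XML with every OLifEExtension element (and its contents) removed.
    // Assumption: the elements are not nested and not self-closing.
    static String removeExtensions(String xml) {
        return EXTENSION.matcher(xml).replaceAll("");
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not the XML from the original post.
        String xml = "<Party><OLifEExtension VendorCode=\"1\"><A>x</A></OLifEExtension>"
                + "<Name>Bob</Name><OLifEExtension>\n<B/>\n</OLifEExtension></Party>";
        System.out.println(removeExtensions(xml)); // prints <Party><Name>Bob</Name></Party>
    }
}
```

If the elements can be nested or self-closing, a regular expression like this one is likely to remove too much or too little, and an XML parser would be the safer tool.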
Can anyone please tell me a more efficient way of doing this or kindly provide me with a regular expression that will do the job for me?
Many many thanks in advance. [ December 14, 2008: Message edited by: Tausif Farooqi ]
Thanks for the suggestion, Bill, but the problem is that I can't assume the XML will be properly formatted, as it's coming from an external source. I can try putting line breaks between every adjacent ">" and "<", then try what you've suggested and see if it makes a difference.
Hi Bill, you were right about the String concatenation part! I changed the method to this:
And it runs nearly 400 times faster than the previous method! Thanks for the help!
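The revised method itself did not survive in this copy of the thread. As a rough illustration only, and not necessarily what was actually posted, here is one common way to avoid repeated String concatenation: scan with `indexOf` and copy the parts to keep into a single `StringBuilder`. The class and method names are hypothetical, and the sketch assumes the elements are not nested and that no other element name starts with "OLifEExtension".

```java
class RemoveExtensionScan {
    private static final String OPEN = "<OLifEExtension";
    private static final String CLOSE = "</OLifEExtension>";

    // Copies everything outside OLifEExtension elements into one StringBuilder,
    // so the input is traversed once instead of building a new String per removal.
    static String removeExtensions(String xml) {
        StringBuilder out = new StringBuilder(xml.length());
        int pos = 0; // start of the next chunk to keep
        while (true) {
            int start = xml.indexOf(OPEN, pos);
            if (start < 0) {
                break; // no more opening tags
            }
            int end = xml.indexOf(CLOSE, start);
            if (end < 0) {
                break; // unbalanced tag: leave the remainder untouched
            }
            out.append(xml, pos, start);   // keep the text before the element
            pos = end + CLOSE.length();    // skip the element and its contents
        }
        out.append(xml, pos, xml.length()); // keep whatever follows the last element
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample input, not the XML from the original post.
        String xml = "<a><OLifEExtension>x</OLifEExtension><b/></a>";
        System.out.println(removeExtensions(xml)); // prints <a><b/></a>
    }
}
```

Building the result with `s = s.substring(...) + s.substring(...)` in a loop copies the whole document on every removal, which is why it slows down so badly on multi-megabyte input; appending to one `StringBuilder` copies each character only once.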
I’ve looked at a lot of different solutions, and in my humble opinion Aspose is the way to go. Here’s the link: http://aspose.com