Do you know how to search a file? Basically search through the file, looking for the stuff you want to remove (or the first line you want to keep). You'll need to create a temporary file to copy the contents you want to keep from the original, and then when you're done, write the new stuff to the original file (or write to a new file).
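The temporary-file approach described above could look roughly like this. It is only a sketch: the class name `FileFilterSketch`, the method name, and the "drop any line containing this text" rule are all made up for illustration, and it assumes the file is UTF-8 text that can be processed line by line.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class FileFilterSketch {

    // Copies every line that does NOT contain `unwanted` into a temporary file,
    // then replaces the original file with the temporary one.
    static void removeLinesContaining(Path original, String unwanted) throws IOException {
        // Create the temp file next to the original so the final move stays on one filesystem.
        Path dir = original.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "filtered", ".tmp");

        try (BufferedReader in = Files.newBufferedReader(original, StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(temp, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (!line.contains(unwanted)) {
                    out.write(line);
                    out.newLine();
                }
            }
        }

        // Overwrite the original with the filtered copy.
        Files.move(temp, original, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

If you would rather keep the original untouched, skip the final `Files.move` and write straight to a new file instead.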
That's not a problem. I already have the file, and its content is in a string. Now I just need to create a regular expression to remove it using the .replaceAll() function, but I don't know how to write that regex.
Maksim, you are correct that using a regular expression is the best way to approach this. Whenever I use regular expressions, I start out small and make sure my regular expression does what I expect at each step.
For example, can you write a regular expression to:
1) Remove <head>?
2) Remove <head>...</head>?
3) Remove <body withABunchOfAttributes>?
4) Remove </body>?
5) Combine steps 2-4? (Hint: you need to use grouping parens for this one if you want to do it in one regular expression.)
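One possible way to work through those steps in Java is sketched below. The class name `StepByStepRegex` and the exact patterns are illustrative guesses, not the only answer, and they assume lowercase tags and a `<head>` tag with no attributes.

```java
import java.util.regex.Pattern;

final class StepByStepRegex {

    // Step 1: just the literal opening tag.
    static final Pattern HEAD_OPEN = Pattern.compile("<head>");

    // Step 2: the whole head element. DOTALL lets '.' match line breaks,
    // and the reluctant '*?' stops at the first </head>.
    static final Pattern HEAD_ELEMENT = Pattern.compile("<head>.*?</head>", Pattern.DOTALL);

    // Step 3: an opening body tag with any number of attributes.
    static final Pattern BODY_OPEN = Pattern.compile("<body\\b[^>]*>");

    // Step 4: the closing body tag.
    static final Pattern BODY_CLOSE = Pattern.compile("</body>");

    // Step 5: steps 2-4 combined with alternation inside grouping parens.
    static final Pattern COMBINED =
            Pattern.compile("(<head>.*?</head>|<body\\b[^>]*>|</body>)", Pattern.DOTALL);

    // Removes the head element and the body tags, keeping everything else.
    static String strip(String html) {
        return COMBINED.matcher(html).replaceAll("");
    }
}
```

Testing each pattern on a small sample string before combining them makes it much easier to see which piece is misbehaving.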
This sounds like a strange requirement. Do you really want to remove all the HTML rather than just the head and body tags? In particular do you want the <html> and <table> tags present?
Also, take a look at the Pattern.DOTALL flag since you are matching across multiple lines. I know about this flag, use it frequently and still manage to forget it on my first shot most of the time.
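A small demonstration of why the flag matters, assuming a made-up three-line sample string (the class name `DotAllDemo` is just for illustration). By default `.` does not match line terminators, so the same pattern fails without the flag:

```java
import java.util.regex.Pattern;

final class DotAllDemo {

    // Returns true if the head element can be found in `text` using the given flags.
    static boolean matchesAcrossLines(String text, int flags) {
        return Pattern.compile("<head>.*</head>", flags).matcher(text).find();
    }

    public static void main(String[] args) {
        String text = "<head>\n<title>t</title>\n</head>";
        System.out.println(matchesAcrossLines(text, 0));              // prints false
        System.out.println(matchesAcrossLines(text, Pattern.DOTALL)); // prints true
    }
}
```

The embedded flag `(?s)` at the start of the pattern has the same effect, which is handy when you pass the pattern as a plain string to String.replaceAll().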
Maksim, are you trying to delete everything between the head tags? (I think that's what you are trying to accomplish, but the regular expression is way too complicated for that, so I second-guessed my understanding.)
This matches everything between the head tags regardless of what is in between:
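A minimal sketch of such a pattern, not necessarily the exact expression intended here. It assumes lowercase `<head>` tags without attributes; the class and method names are hypothetical.

```java
final class HeadContent {

    // (?s) turns on DOTALL inline so '.' also matches line breaks;
    // the reluctant '.*?' stops at the first closing tag.
    private static final String BETWEEN_HEAD = "(?s)(<head>).*?(</head>)";

    // Deletes whatever sits between the head tags but keeps the tags themselves.
    static String clearHead(String html) {
        return html.replaceAll(BETWEEN_HEAD, "$1$2");
    }

    // Deletes the head tags together with everything between them.
    static String removeHead(String html) {
        return html.replaceAll("(?s)<head>.*?</head>", "");
    }
}
```

Here `$1` and `$2` in the replacement refer back to the two captured tags, so only the content in between is dropped.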