| Author |
Trying to remove JavaScript contents within script tags?
|
steve labar
Ranch Hand
Joined: Sep 10, 2008
Posts: 55
|
|
I'm trying to remove the JavaScript inside the script tags in an HTML file. I have no problem removing the script tags together with everything inside them. However, I'd like to keep the script tags and remove only the JavaScript between them. I tried taking group(1), which is the contents, and running replace(group(1), ""), but it did not work consistently. matcher.replaceAll works very well, but the script tags are part of my match. I thought putting the script tags in non-capturing groups would help, so that I could then call matcher.replaceAll, but they are still in the match.
Any ideas?
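
For reference, a non-capturing group still takes part in the overall match, and replaceAll replaces the whole match, so the tags disappear either way. One possible way around this is to capture the opening and closing tags and put them back with "$1$2" in the replacement. This is only a minimal sketch: the class name ScriptBodyRegex and the method stripScriptBodies are made up for illustration, and the pattern assumes no attribute value contains a '>' character. A regex is not a full HTML parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not code from this thread.
class ScriptBodyRegex {
    // Group 1 = opening tag, group 2 = closing tag.
    // The lazy ".*?" between them is the script body and is not captured.
    // DOTALL lets '.' match line breaks inside multi-line scripts.
    private static final Pattern SCRIPT = Pattern.compile(
            "(<script\\b[^>]*>).*?(</script\\s*>)",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static String stripScriptBodies(String html) {
        Matcher m = SCRIPT.matcher(html);
        // "$1$2" writes both tags back, so only the body is dropped.
        // replaceAll handles every script element in the input, not just the first.
        return m.replaceAll("$1$2");
    }

    public static void main(String[] args) {
        String html = "<p>a</p><script type=\"text/javascript\">alert(1);</script>"
                + "<p>b</p><SCRIPT>\nvar x = 2;\n</SCRIPT>";
        System.out.println(stripScriptBodies(html));
    }
}
```

Lookbehind and lookahead could also keep the tags out of the match, but Java's lookbehind needs a bounded length, which does not fit an opening tag with arbitrary attributes, so the capture-and-restore form is used here.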
|
 |
Sebastian Janisch
Ranch Hand
Joined: Feb 23, 2009
Posts: 1183
|
|
I think there is no need to use regular expressions in this case. Unnecessary overhead.
|
|
 |
steve labar
Ranch Hand
Joined: Sep 10, 2008
Posts: 55
|
|
What if there are multiple scripts in the file? This would only get the first occurrence. So, you think using Java regex is costly time-wise?
|
 |
Sebastian Janisch
Ranch Hand
Joined: Feb 23, 2009
Posts: 1183
|
|
The regex engine is pretty heavyweight, so I tend to avoid it whenever possible.
As for your question, yes, it only strips out the first occurrence.
But you can simply loop over it until sb.indexOf("<script") is -1.
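
A rough sketch of such a loop, under some assumptions: the class name ScriptBodyIndexOf and the method stripScriptBodies are hypothetical, and the tags are assumed to be lowercase because indexOf is case-sensitive. Since the script tags stay in the buffer here, a plain sb.indexOf("<script") would keep finding the first tag forever, so the sketch uses the two-argument indexOf and searches from just past the previous closing tag.

```java
// Hypothetical helper, not code from this thread.
class ScriptBodyIndexOf {
    static String stripScriptBodies(String html) {
        StringBuilder sb = new StringBuilder(html);
        int from = 0; // where the next search starts
        while (true) {
            int open = sb.indexOf("<script", from);
            if (open < 0) break;                  // no more script elements
            int bodyStart = sb.indexOf(">", open);
            if (bodyStart < 0) break;             // malformed opening tag
            bodyStart++;                          // first char after '>'
            int close = sb.indexOf("</script", bodyStart);
            if (close < 0) break;                 // unclosed script element
            sb.delete(bodyStart, close);          // drop only the body
            // After the delete, the closing tag begins at bodyStart;
            // continue searching after it.
            from = bodyStart + "</script".length();
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(stripScriptBodies(
                "<script>a();</script>x<script src=\"y.js\">b();</script>"));
    }
}
```

If the tags were removed along with their contents, the simpler termination test on sb.indexOf("<script") being -1 would be enough, because each pass would shrink the number of matches.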
|
 |