String searching

Ben Roy
Ranch Hand

Joined: Nov 01, 2000
Posts: 70
I'm trying to implement a language filter on a message board. Essentially I'll have a list of words in a database that are flagged as inappropriate. I need to search the incoming string from the post for those words. I only need to detect if any of them exist. Of course, I can just search on each individual word with a while loop, but that seems really inefficient. Is there any way I can search a string for ANY of say 10 different words?
Barry Andrews
Ranch Hand

Joined: Sep 05, 2000
Posts: 523

I can't think of any other way to do this. If the string is not too long and you're only looking for 10 or so words, this processing should not take long. The indexOf() method is pretty fast.
Does anyone else have ideas?

Art Metzer
Ranch Hand

Joined: Oct 31, 2000
Posts: 241
Hi, everyone.
Barry, two things: your code will only do the "do something" if all of the taboo words exist in the string; I think Ben wants to not post the message if any of the naughty words exists in the string.
Second, even if Ben changes the &&'s to ||'s, Java will still have to do an indexOf() on every word in the list every time, even if the first one in that list invalidates the message. Could Ben do something like this?
1. Put the list of sought words in an array of Strings.
2. Set up a boolean, badWordFound, to false. Set up an int counter to zero.
3. Go through a while loop: While badWordFound is false, AND the array still has Strings to look for in it, loop.
4. If the array's word at position "counter" is found in the message string, set badWordFound = true.
5. Increment the counter
6. **End of while piece of code**
7. if (badWordFound) do something else postMessage()
Possible, no?
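The steps above can be sketched roughly as follows. This is only an illustration of the loop as described: the class name, method name, and the placeholder words are assumptions, and in a real filter the array would be filled from the database.

```java
// Hypothetical class name; a sketch of the numbered steps, not a finished filter.
class BadWordFilter {

    // Returns true as soon as any word from the list occurs in the message.
    static boolean containsBadWord(String message, String[] badWords) {
        boolean badWordFound = false;
        int counter = 0;
        // Loop while nothing has been found AND there are still words to check.
        while (!badWordFound && counter < badWords.length) {
            // indexOf() returns -1 when the word does not occur in the message.
            if (message.indexOf(badWords[counter]) >= 0) {
                badWordFound = true;
            }
            counter++;
        }
        return badWordFound;
    }

    public static void main(String[] args) {
        String[] badWords = {"darn", "heck"}; // placeholder words for illustration
        String message = "well, heck";
        if (containsBadWord(message, badWords)) {
            System.out.println("Message blocked");
        } else {
            System.out.println("Message posted");
        }
    }
}
```

Two caveats: indexOf() is case-sensitive, so the message and the words would likely need to be lower-cased first. Also, Java's || operator short-circuits, so a hand-written chain of indexOf() checks joined with || would stop at the first hit as well; the loop mainly helps because the word list comes from a database and its length isn't fixed in the code.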

[This message has been edited by Art Metzer (edited December 05, 2001).]
Ben Roy
Ranch Hand

Joined: Nov 01, 2000
Posts: 70
That is essentially what I have implemented for now. Turns out the really interesting part of the problem is in filtering out legitimate words that contain naughty ones. Like glass, bass, mass, etc. All of those words are ok, but under the filters we've discussed here the posts would be blocked.
Ben Roy
Ranch Hand

Joined: Nov 01, 2000
Posts: 70
At first I was thinking of finding the word, then checking the char before and after it to see if they were spaces. But then... if I just checked for " " + myNaughtyWord + " " in the first place, I could save a lot of yucky mucking around.
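One possible variation on the space-padding idea, offered as a sketch rather than what was actually implemented: padding with spaces misses a word at the very start or end of the post, or one followed by punctuation. A regular expression with word boundaries (\b) covers those cases and also checks for any of the words in a single pass. The class and method names and the placeholder words are assumptions; java.util.regex requires Java 1.4 or later.

```java
import java.util.regex.Pattern;

// Hypothetical class name; a sketch of whole-word matching with one regex.
class WholeWordFilter {

    // Builds one pattern matching any listed word, but only as a whole word.
    static Pattern buildPattern(String[] badWords) {
        if (badWords.length == 0) {
            // An empty alternation would match at every word boundary.
            throw new IllegalArgumentException("word list must not be empty");
        }
        StringBuilder alternation = new StringBuilder();
        for (int i = 0; i < badWords.length; i++) {
            if (i > 0) {
                alternation.append('|');
            }
            // Pattern.quote treats the word literally, even if it contains regex characters.
            alternation.append(Pattern.quote(badWords[i]));
        }
        // \b is a word boundary; CASE_INSENSITIVE also catches "Heck" and "HECK".
        return Pattern.compile("\\b(?:" + alternation + ")\\b", Pattern.CASE_INSENSITIVE);
    }

    static boolean containsBadWord(String message, Pattern pattern) {
        // find() stops at the first match anywhere in the message.
        return pattern.matcher(message).find();
    }

    public static void main(String[] args) {
        Pattern pattern = buildPattern(new String[] {"darn", "heck"}); // placeholder words
        System.out.println(containsBadWord("Oh heck!", pattern));           // flagged
        System.out.println(containsBadWord("The checker passed", pattern)); // not flagged
    }
}
```

Compiling the pattern once and reusing it for every incoming post avoids rebuilding it each time. The boundary check assumes the flagged words begin and end with letters or digits.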
Cindy Glass
"The Hood"

Joined: Sep 29, 2000
Posts: 8521
Originally posted by Ben Roy:
Like glass


[This message has been edited by Cindy Glass (edited December 06, 2001).]

"JavaRanch, where the deer and the Certified play" - David O'Meara