I don't think I understand how Regular Expression works!

JiaPei Jen
Ranch Hand

Joined: Nov 19, 2000
Posts: 1309
I have been trying to use regular expressions to search a document for a given piece of text.
The search does not really involve a "pattern". For example, when the user gives the literal string "Bill", I am supposed to find out whether "Bill" appears in the source document. Therefore, I use "Bill" as both the pattern and the input provided by the user.

This Java program outputs "Found A Match" to the console when the source document contains the string "Bill".
However, it still outputs "Found A Match" when the source document contains no "Bill" at all.
How should I apply the regular expression package to achieve my purpose? Should I even use regular expressions for this?
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Usually when you're posting a question that's just an extension of a previous question, it's better to append your post to the previous conversation, so that we can see what's gone before and have the complete context. In this case, that was an earlier thread. However, it now seems easier to close the old thread and continue here, so that's what I'm doing...
The main problem is still the same as it was:
quote: therefore, I use "Bill" as both the pattern and the input provided by the user.
This makes no sense. You're searching "Bill" to see if it contains the substring "Bill". Yes, it does. But what about the file you were planning on reading? You created a reader:

and later you close it:

In between, it seems you need to actually do something with this reader, and use it to form the userInput that you search with the Matcher.
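
To illustrate the point, here is a minimal sketch, not the original program. It assumes the document can be read as plain text; the class name ReaderSearchSketch and the helpers readAll and containsLiteral are made up for this example, and Pattern.quote is my own addition so that the search term is always treated as literal text. The main method contrasts matching the term against itself (always a match) with matching it against what the reader actually delivers.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReaderSearchSketch {

    // Hypothetical helper: reads everything the reader delivers into one String.
    static String readAll(Reader source) {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            char[] buffer = new char[4096];
            int count;
            while ((count = reader.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    // True if searchTerm occurs literally anywhere in the text from the reader.
    static boolean containsLiteral(Reader source, String searchTerm) {
        // Pattern.quote makes the term literal text rather than regex syntax.
        Pattern pattern = Pattern.compile(Pattern.quote(searchTerm));
        // The Matcher is built over the document text, not over the search term.
        Matcher matcher = pattern.matcher(readAll(source));
        return matcher.find();
    }

    public static void main(String[] args) {
        String searchTerm = "Bill";
        // Matching the term against itself succeeds no matter what the document says.
        System.out.println(Pattern.compile(searchTerm).matcher(searchTerm).find());
        // Matching against the document text gives the real answer.
        System.out.println(containsLiteral(new StringReader("No such name here."), searchTerm));
    }
}
```

A StringReader stands in for the file here; a FileReader on the real document would plug into containsLiteral the same way.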
[ December 11, 2002: Message edited by: Jim Yingst ]

"I'm not back." - Bill Harding, Twister
JiaPei Jen
Ranch Hand

Joined: Nov 19, 2000
Posts: 1309
I changed the Java program according to my understanding of your reply, and I decided to read the source document line by line. Nonetheless, the program still outputs "Found A Match" even though I know there is no match in the source document.
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671

You're reading in a line, and checking to see if it's null - but you're not remembering the value after this line. It's forgotten. The Matcher is still searching userInput, which is still equal to "Bill". Try putting the new line in the userInput variable so it can be used later:
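
A rough sketch of that kind of loop follows. It is not the poster's program: the class name LineSearchSketch and the method countMatchingLines are invented for illustration, and a StringReader stands in for the real file. The important part is that the result of readLine() is stored in userInput before the Matcher is created.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineSearchSketch {

    // Counts the lines that contain at least one match of the pattern.
    static int countMatchingLines(Reader source, Pattern pattern) {
        int matchingLines = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String userInput;
            // Assign inside the loop condition so the line just read is kept.
            while ((userInput = reader.readLine()) != null) {
                // A fresh Matcher is needed for each new line of input.
                Matcher matcher = pattern.matcher(userInput);
                if (matcher.find()) {
                    System.out.println("Found A Match");
                    matchingLines++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return matchingLines;
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("Bill");
        Reader source = new StringReader("Hello Bill\nnothing here\nBill again");
        System.out.println(countMatchingLines(source, pattern));
    }
}
```

Note that this version reports a line once even if it contains several matches; looping on matcher.find() would count every occurrence instead.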
[ December 11, 2002: Message edited by: Jim Yingst ]
JiaPei Jen
Ranch Hand

Joined: Nov 19, 2000
Posts: 1309
The code is getting close to being right. Now:
When there is no match in the source document, the code outputs "Found A Match" twice to the console.
When there is one match in the source document, the code outputs "Found A Match" three times.
When there are two matches, it outputs "Found A Match" four times.
When there are three matches, it outputs "Found A Match" five times.
And so on.
I just wonder where those two extra "Found A Match" lines come from.
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
The fact that you're reading a .doc file is suspicious - if it was generated by MS Word, it is not a normal text file, and its contents may be very different than you expect. I'd recommend inserting a line like
inside the while loop, to see what you've really read.
Additionally, you can make use of additional methods of Matcher to get more information about exactly where each match has been found:

Studying the output of these print statements may give you clues about what's really going on...
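
As a sketch of this kind of diagnostic output (not the exact statements from the original post): the class name MatchDiagnosticsSketch and the method describeMatches are made up for the example. It prints the line that was really read, then uses the Matcher methods group(), start() and end() to report what was matched and where.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchDiagnosticsSketch {

    // Builds one report per match: the matched text plus its start and end index.
    static List<String> describeMatches(String line, Pattern pattern) {
        List<String> reports = new ArrayList<>();
        Matcher matcher = pattern.matcher(line);
        while (matcher.find()) {
            // start() is the index of the first matched character,
            // end() is the index just past the last matched character.
            reports.add("match \"" + matcher.group() + "\" at "
                    + matcher.start() + "-" + matcher.end());
        }
        return reports;
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("Bill");
        String line = "Hi Bill, bye Bill";
        // Brackets make leading or trailing junk in the line visible.
        System.out.println("Line read: [" + line + "]");
        for (String report : describeMatches(line, pattern)) {
            System.out.println(report);
        }
    }
}
```

If the file is not plain text, the "Line read" output tends to show unexpected characters right away, which explains matches that seem to come from nowhere.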
[ December 12, 2002: Message edited by: Jim Yingst ]
JiaPei Jen
Ranch Hand

Joined: Nov 19, 2000
Posts: 1309
Jim, thank you very, very much for your help.
My mistake: I should not have used an MS Word-generated document for the test.
I made a real .txt file, and the program ran without any problem.
Thank you for your support during the past two days.
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
You're welcome.
 