how to match in java

Raj kalaria
Ranch Hand

Joined: Sep 08, 2005
Posts: 72
Hi,

I have a question.

I have a string, say ABC, that holds dynamic values like
creation_date
modified_date
object_name

I want to distinguish between strings that contain "date" and strings that do not.

How do I do it? Something like this:

if (ABC.contains("date")) {
    System.out.println("it contains date");
} else {
    System.out.println("it does not contain date");
}
Raj kalaria
Ranch Hand

Joined: Sep 08, 2005
Posts: 72
Hi,

I am using this, but I don't know why "(?i).*" is used:

if (("r_creation_date").matches("(?i).*date*")) {
    System.out.println("matches");
} else {
    System.out.println("Not matches");
}

But this code also gives me "matches" for r_creation_dat (without the e).

So is it wrong?

And why do we use "(?i)" in matches("(?i).*date*")?

I would appreciate any help.
Tad Dicks
Ranch Hand

Joined: Nov 16, 2004
Posts: 264
Sorry if I'm misunderstanding, but you want to match on the word "date"?

Then I would use the following regular expression in your match:
^.*date.*$
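
A small sketch of the difference, in case it helps (the class and method names are made up for illustration). In (?i).*date*, the trailing * applies only to the e, so that pattern really asks for "dat" followed by zero or more e's, which is why r_creation_dat was accepted. The (?i) prefix only switches on case-insensitive matching, so it is kept here.

```java
class DatePatternDemo {
    // Pattern from the question: the final '*' quantifies only the 'e'.
    static boolean looseMatch(String s) {
        return s.matches("(?i).*date*");
    }

    // Pattern suggested above, with (?i) kept for case-insensitive matching.
    static boolean containsDate(String s) {
        return s.matches("(?i)^.*date.*$");
    }

    public static void main(String[] args) {
        System.out.println(looseMatch("r_creation_dat"));    // true
        System.out.println(containsDate("r_creation_dat"));  // false
        System.out.println(containsDate("r_creation_date")); // true
        System.out.println(containsDate("MODIFIED_DATE"));   // true
        System.out.println(containsDate("object_name"));     // false
    }
}
```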



-Tad
Raj kalaria
Ranch Hand

Joined: Sep 08, 2005
Posts: 72
hi

Thanks
Christopher Elkins
Ranch Hand

Joined: Oct 26, 2004
Posts: 45
Tad, just for my own education, is this correct?

The regular expression ^.*date.*$

equates to

"go to the beginning of the string match any character zero or more times until you reach the string 'date' then match any character zero or more times until the end of the string is reached"


Christopher Elkins, SCJP Java 2 Platform
Stefan Wagner
Ranch Hand

Joined: Jun 02, 2003
Posts: 1923


should be enough.


would work too, since there aren't sophisticated rules involved, like in

[ October 13, 2005: Message edited by: Stefan Wagner ]

http://home.arcor.de/hirnstrom/bewerbung
Alan Moore
Ranch Hand

Joined: May 06, 2004
Posts: 262
That's the end result, but what really happens is a little more complex. The first dot-star (".*") originally consumes all the (non-line-separator) characters in the target. Then the regex engine starts backing off, trying to match the literal sequence "date". Once it finds that, the second dot-star gobbles up the remaining characters.

It's important to understand about backtracking because, when no match is possible, the regex engine has to try every possibility before giving up. If there are a lot of dot-stars or other indeterminate components in the regex, that can easily add up to millions or billions of possibilities--and an effectively hung application.
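
One way to watch the backing-off happen is to put the first dot-star in a capturing group and see what it ends up holding. This is only a sketch: the class name, the helper method and the sample input are invented for the demonstration. The reluctant form (.*?) is included for contrast; it starts with nothing and grows instead of starting with everything and shrinking.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Returns what group 1 captured, or null if the whole input does not match.
    static String leadingPart(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "date_and_date";
        // Greedy: grabs everything, then backs off to the LAST "date".
        System.out.println("[" + leadingPart("(.*)date.*", input) + "]");  // [date_and_]
        // Reluctant: grabs nothing, then expands to the FIRST "date".
        System.out.println("[" + leadingPart("(.*?)date.*", input) + "]"); // []
    }
}
```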

You know what they say: with great power comes great screwupability. Or something like that.
Tad Dicks
Ranch Hand

Joined: Nov 16, 2004
Posts: 264
I'm hardly a master of regexps (I just learned the usefulness and difference of lazy vs. greedy matching). But yeah, ^ means the beginning of a line, $ means the end of a line, and .* is any character any number of times.
The regular expression ^.*date.*$
equates to
"go to the beginning of the string match any character zero or more times until you reach the string 'date' then match any character zero or more times until the end of the string is reached"


If there are a lot of dot-stars or other indeterminate components in the regex, that can easily add up to millions or billions of possibilities--and an effectively hung application


I've never witnessed that using a regexp, but I don't doubt the possibility. Is there a way to specify "find the first match and then quit"?

Something I've always had issues with in doing regexps is that I normally want to search an entire file, and the line breaks screw it up. I've read through the Sun API docs, which mention setting multiline matching, but I must be missing something because it still doesn't work as expected. It would be convenient if ^ matched the beginning of the file, $ matched EOF, and the typical \r\n or \n matched line breaks, etc.

To get around this I've taken to reading a file line by line and appending the lines to a StringBuffer, but I can't help but wonder what is going to happen if I run it against a document with 1000 or more pages. Has anyone done anything similar and have any tips? Java out-of-heap-space errors suck.
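
For reference, one possible approach, sketched under the big assumption that a match never needs to span a line break: test each line as it is read and stop at the first hit, so nothing accumulates in memory. The class and method names are hypothetical, and a StringReader stands in for a real file here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LineScanner {
    // Returns the first line in which the pattern is found, or null if there is none.
    static String firstMatchingLine(BufferedReader reader, Pattern pattern) throws IOException {
        Matcher m = pattern.matcher(""); // one Matcher, reused for every line
        String line;
        while ((line = reader.readLine()) != null) {
            if (m.reset(line).find()) {
                return line; // stop at the first match
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(
                new StringReader("object_name\nmodified_date\ncreation_date"));
        System.out.println(firstMatchingLine(in, Pattern.compile("date"))); // modified_date
    }
}
```

If a match can span lines, this does not apply and the whole text (or a sliding window of it) has to be available to the matcher.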
Alan Moore
Ranch Hand

Joined: May 06, 2004
Posts: 262
Normally, the '.' metacharacter matches any character except a line separator, but you can make it match those as well by compiling the regex with the DOTALL flag set. The MULTILINE flag causes the '^' and '$' anchors to match at the beginning and end of logical lines as well as the beginning and end of the whole input.
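
Here is a minimal sketch of what the two flags change. The class name, method names and sample input are invented for the example; Pattern.DOTALL and Pattern.MULTILINE are the actual constants in java.util.regex.Pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FlagsDemo {
    static final String INPUT = "first line\ncreation_date\nlast line";

    // Whole-input match: '.' crosses line separators only when DOTALL is set.
    static boolean matchesWholeInput(int flags) {
        return Pattern.compile(".*date.*", flags).matcher(INPUT).matches();
    }

    // First line containing "date", or null. '^' and '$' anchor at
    // individual lines only when MULTILINE is set.
    static String firstDateLine(int flags) {
        Matcher m = Pattern.compile("^.*date.*$", flags).matcher(INPUT);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(matchesWholeInput(0));               // false
        System.out.println(matchesWholeInput(Pattern.DOTALL));  // true
        System.out.println(firstDateLine(0));                   // null
        System.out.println(firstDateLine(Pattern.MULTILINE));   // creation_date
    }
}
```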
Akshay Kiran
Ranch Hand

Joined: Aug 18, 2005
Posts: 220
hi Stefan,

Originally posted by Stefan Wagner:
should be enough.

I don't think this one works, because .* would consume the whole line, and it will not get an opportunity to match the word "date" in the input. Try it; it will only return the whole text.

Instead, I would suggest that you zoom to the first d, and then see if it's a "date" or something else.


which simply means: consume all the characters before you encounter a d and then see if it is a "date"; if not, move on to the next d, and so on...

One more thing: the .* at the end of the regex is unnecessary. I don't see any reason why we need something more than date....
If it matches date, then it will also match dateblahblahblah, which makes no sense when all you need is "date".


"It's not enough that we do our best; sometimes we have to do what's required."
-- Sir Winston Churchill
Alan Moore
Ranch Hand

Joined: May 06, 2004
Posts: 262
Akshay, Stefan's suggestion does work, and both the dot-stars (".*") are necessary. This is because we're using the matches() method to perform the test. Most regex tools define "a match" to mean that the regex describes any substring of the target string, but matches() returns true only if the regex describes the whole string. So, although we're only interested in the substring "date", we have to use the dot-stars to gobble up all the text before and after it. (There's also a find() method that looks for "a match" in the traditional sense, but you'll only find it in the class java.util.regex.Matcher.)
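
To make the distinction concrete, here is a short sketch (the class and method names are hypothetical). With find() the regex can be just "date", because it only has to describe some substring. For a fixed word like this, String.contains() does the same job with no regex at all, though it is case-sensitive.

```java
import java.util.regex.Pattern;

class FindVsMatches {
    private static final Pattern DATE = Pattern.compile("date", Pattern.CASE_INSENSITIVE);

    // find() succeeds if the regex describes ANY substring of the input.
    static boolean containsDate(String s) {
        return DATE.matcher(s).find();
    }

    public static void main(String[] args) {
        // matches() needs the regex to describe the WHOLE string.
        System.out.println("r_creation_date".matches("date"));   // false
        System.out.println("r_creation_date".matches(".*date.*")); // true
        System.out.println(containsDate("r_creation_date"));     // true
        System.out.println(containsDate("R_CREATION_DATE"));     // true
        System.out.println(containsDate("object_name"));         // false
        // Plain substring test, no regex involved.
        System.out.println("r_creation_date".contains("date"));  // true
    }
}
```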

As for why the regex works despite the greediness of the first dot-star, see my first reply above.

Tad, could you be more specific about your problems matching within files? What exactly are you trying to do?
 