Question About Regex Chapter-6 K&B

Been Zaidi
Greenhorn

Joined: Mar 21, 2012
Posts: 8

Hi.
I am reading Chapter 6 of the Kathy and Bert book (K&B), and I am on the topic of quantifiers. The book gives an example that uses
a regular expression to find all file names starting with proj1. Here is the source string:


"proj3.txt,proj1sched.pdf,proj1,proj2,proj1.java"


The regular expression given to find these names is



The book states that the key part of the expression is matching zero or more occurrences of a character that is not a comma.
It doesn't use a character class like \w, so how can the expression find zero or more occurrences of characters that are not a comma?
Can someone please elaborate?

Best Regards,


Ben, My Blog
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18989

Hint: Are there any word characters that are not also "not a comma"?


Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
Philip Thamaravelil
Ranch Hand

Joined: Feb 09, 2006
Posts: 99
Been,
You don't need to specify \w. You match the rest of the file name by matching "not a comma",

so that when the expression reaches the "," it stops matching.


* - Zero or more instances of the preceding element (a character, a character class, or a group).

() - Groups the enclosed expression and stores what it matched.

[ ] - A character class: a set of characters, any one of which matches a single character. A leading ^, as in [^,], negates the set.
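
The three pieces above can be put together in a short sketch. The pattern `proj1([^,])*` is an assumption reconstructed from the description in this thread (literal proj1, then zero or more non-comma characters inside a group), and the class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Proj1Finder {
    // Assumed pattern: literal "proj1", then zero or more characters that are not a comma.
    static final Pattern PROJ1 = Pattern.compile("proj1([^,])*");

    // Returns every match, each prefixed with the index where it starts.
    static List<String> findAll(String source) {
        List<String> hits = new ArrayList<>();
        Matcher m = PROJ1.matcher(source);
        while (m.find()) {
            // group() with no argument is the whole matched text
            hits.add(m.start() + " " + m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String source = "proj3.txt,proj1sched.pdf,proj1,proj2,proj1.java";
        // Prints:
        // 10 proj1sched.pdf
        // 25 proj1
        // 37 proj1.java
        for (String hit : findAll(source)) {
            System.out.println(hit);
        }
    }
}
```

Each match runs up to the next comma or the end of the string, and proj3.txt and proj2 are skipped because they do not contain the literal proj1.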

Make sense?

Cheers,
Philip
Been Zaidi
Greenhorn

Joined: Mar 21, 2012
Posts: 8

Dear Philip,

Your post was very helpful. So basically the key is this expression. When we write



It basically means that the character can be anything except a comma. It can be a digit, a letter, or anything else, but it must not be a
comma. And * means zero or more occurrences. Please correct me if I am wrong.
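
A small sketch of that reading, assuming the piece in question is the negated class `[^,]` followed by `*`. It compares the class with `\w` to show why `\w` would be too narrow for file names; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class NotCommaDemo {
    // Zero or more characters, none of which is a comma.
    static final Pattern NOT_COMMA_RUN = Pattern.compile("[^,]*");
    // Zero or more word characters (letters, digits, underscore).
    static final Pattern WORD_RUN = Pattern.compile("\\w*");

    // True when the whole input contains no comma (the empty string counts).
    static boolean isCommaFree(String s) {
        return NOT_COMMA_RUN.matcher(s).matches();
    }

    // True when the whole input consists of word characters only.
    static boolean isWordOnly(String s) {
        return WORD_RUN.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isCommaFree("sched.pdf")); // true: '.' is not a comma
        System.out.println(isWordOnly("sched.pdf"));  // false: '.' is not a word character
        System.out.println(isCommaFree(""));          // true: * allows zero occurrences
        System.out.println(isCommaFree("a,b"));       // false: contains a comma
    }
}
```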

Thanks,
Been
Philip Thamaravelil
Ranch Hand

Joined: Feb 09, 2006
Posts: 99
Terrific! Glad it helped. You are correct. Note that this expression stores the matched value so you can access it later, but if you only need to match the expression, this works as well:
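
As a hedged sketch of that difference, assume the two forms are `proj1([^,])*` (with the group) and `proj1[^,]*` (without it). Both are reconstructed from the descriptions in this thread rather than quoted, and the class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupOrNoGroup {
    // Assumed pattern with a capturing group around the negated class.
    static final Pattern WITH_GROUP = Pattern.compile("proj1([^,])*");
    // Same match, but nothing is stored beyond the whole match.
    static final Pattern WITHOUT_GROUP = Pattern.compile("proj1[^,]*");

    // Returns the first whole match in source, or null if there is none.
    static String firstMatch(Pattern p, String source) {
        Matcher m = p.matcher(source);
        return m.find() ? m.group() : null;
    }

    // Number of capturing groups declared in the pattern.
    static int groupCount(Pattern p) {
        return p.matcher("").groupCount();
    }

    public static void main(String[] args) {
        String source = "proj3.txt,proj1sched.pdf,proj1,proj2,proj1.java";
        System.out.println(firstMatch(WITH_GROUP, source));    // proj1sched.pdf
        System.out.println(firstMatch(WITHOUT_GROUP, source)); // proj1sched.pdf
        System.out.println(groupCount(WITH_GROUP));            // 1
        System.out.println(groupCount(WITHOUT_GROUP));         // 0
    }
}
```

Both patterns find the same text; the only difference is whether a numbered group is available afterwards.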






Been Zaidi
Greenhorn

Joined: Mar 21, 2012
Posts: 8

Dear Philip,

Thanks. I have one last point of confusion. When we use (), it stores the matching value. Can you elaborate on this a little,
both with the parentheses and without them?

Thanks a ton,
Been
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18989


Parens define groups in a regular expression -- and you can actually fetch the sub-match (the match within the parens) by using the group() method call. Group zero is the whole matched string, while group 1, group 2, etc., are determined by the parens.

Having said that, groups which are followed by a quantifier don't work well IMO. In this example, there is only one sub group (which is group 1), and it will hold only the last match (the character right before the comma or the end of the string). There isn't really a good way to get all the submatches using groups that are followed by quantifiers.
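
A minimal sketch of this behaviour, assuming the pattern under discussion is `proj1([^,])*`. The second pattern, `proj1([^,]*)`, moves the quantifier inside the parens so that group 1 holds the whole tail; that variant is not from the thread and is shown only as one common workaround. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LastIterationGroup {
    // Assumed pattern: the group is repeated, so it keeps only its last iteration.
    static final Pattern QUANTIFIED_GROUP = Pattern.compile("proj1([^,])*");
    // Variant: the repetition is inside the group, so it keeps the whole tail.
    static final Pattern GROUP_AROUND_RUN = Pattern.compile("proj1([^,]*)");

    // Returns { group(0), group(1) } for the first match, or null if nothing matches.
    static String[] groups(Pattern p, String input) {
        Matcher m = p.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(0), m.group(1) };
    }

    public static void main(String[] args) {
        String[] a = groups(QUANTIFIED_GROUP, "proj1sched.pdf,proj2");
        System.out.println(a[0] + " / " + a[1]); // proj1sched.pdf / f

        String[] b = groups(QUANTIFIED_GROUP, "proj1,proj2");
        System.out.println(b[0] + " / " + b[1]); // proj1 / null (group never ran)

        String[] c = groups(GROUP_AROUND_RUN, "proj1sched.pdf,proj2");
        System.out.println(c[0] + " / " + c[1]); // proj1sched.pdf / sched.pdf
    }
}
```

When the quantified group matches zero times, group(1) returns null, so callers that read it should check for that case.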

Henry

Been Zaidi
Greenhorn

Joined: Mar 21, 2012
Posts: 8

Dear Henry,

Thanks for the clarification. It helped me understand the concept.

Thanks,
Been
 