| Author |
Usage of Regular expressions
|
Nicholas Cheung
Ranch Hand
Joined: Nov 07, 2003
Posts: 4982
|
|
Hi Max, I am wondering whether RE is really that useful for string matching. While I was working on my SCJD, I read your SCJD book, in which you explain how RE can help in searching data records. I tried to adopt the method, but it seems to me that there are some limitations, or maybe I am too new to RE. The case I encountered is a record (or a row of data) that contains more than one data element separated by delimiters, in the following format: NAME:ADDRESS:TEL:ZIP:COUNTRY If I use RE and, for example, the NAME in row 1 has the same value as the ADDRESS in row 2, then a search for that NAME may return both rows 1 and 2, because row 2 also contains the value. I know this search problem would not appear if the data were in a DB, but in such cases, can I tell RE which field (or location) the pattern should be matched against? Nick
|
SCJP 1.2, OCP 9i DBA, SCWCD 1.3, SCJP 1.4 (SAI), SCJD 1.4, SCWCD 1.4 (Beta), ICED (IBM 287, IBM 484, IBM 486), SCMAD 1.0 (Beta), SCBCD 1.3, ICSD (IBM 288), ICDBA (IBM 700, IBM 701), SCDJWS, ICSD (IBM 348), OCP 10g DBA (Beta), SCJP 5.0 (Beta), SCJA 1.0 (Beta), MCP(70-270), SCBCD 5.0 (Beta), SCJP 6.0, SCEA for JEE5 (in progress)
|
 |
Jeroen Wenting
Ranch Hand
Joined: Oct 12, 2000
Posts: 5093
|
|
If you want to match only the part before the first separator (colon in your example) you could either create an RE that stops matching on reaching a colon or (which is what I'd probably do) use split(":") on the String and match only on the indexed item you are interested in. String splitS[] = "NAME:ADDRESS:TEL:ZIP:COUNTRY".split(":") would yield a String array containing ["NAME","ADDRESS","TEL","ZIP","COUNTRY"] so if you want to match the name to a string you want you would just match splitS[0]. If you also wanted to match the telephone number separately you could do that by matching splitS[2] instead. Your RE (which you may not need, you might be able to use contains() or equals() depending on what you're matching for) would be a lot cleaner and simpler this way.
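A minimal sketch of the split approach described above. The class name, method name, and sample records are invented for illustration; only split(":") and the comparison against one indexed field come from the post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitFieldSearch {
    // Returns the records whose field at fieldIndex equals the wanted value.
    static List<String> findByField(List<String> records, int fieldIndex, String wanted) {
        List<String> hits = new ArrayList<>();
        for (String record : records) {
            // A limit of -1 keeps trailing empty fields, so indexes stay stable.
            String[] fields = record.split(":", -1);
            if (fieldIndex < fields.length && fields[fieldIndex].equals(wanted)) {
                hits.add(record);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical records: "Smith" is the NAME in the first row
        // and the ADDRESS in the second.
        List<String> records = Arrays.asList(
                "Smith:12 Baker Street:555-0100:10001:US",
                "Jones:Smith:555-0101:10002:US");
        // Only the first record has "Smith" in field 0.
        System.out.println(findByField(records, 0, "Smith"));
    }
}
```

Because the comparison is made against a single array element, a value that happens to appear in another field of the same record cannot cause a false hit.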
|
42
|
 |
Max Habibi
town drunk ( and author)
Sheriff
Joined: Jun 27, 2002
Posts: 4118
|
|
Originally posted by Nicholas Cheung: Hi Max, I know such searching problem will not appear if it is in the DB format, but in such cases, can I specify to RE that which pattern (or location) to be matched? Nick
Hi Nick, Glad to hear from you again. To answer your question: yes, there's a mechanism just for this. Actually, there are four, and they're under the general topic of lookarounds. They're not that difficult, but they're not completely trivial either, because they match position rather than existence. And yes, I go over them in the book. If you provide a few example records, we can work backwards together, and try to see how they come to be. Deal? All best, M [ April 14, 2004: Message edited by: Max Habibi ]
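For reference, the four lookarounds in java.util.regex are positive lookahead (?=X), negative lookahead (?!X), positive lookbehind (?<=X), and negative lookbehind (?<!X). The sketch below is not taken from the book; it uses the two negative forms to require that a value occupy a whole colon-delimited field, and the class name, method name, and sample strings are assumptions made for illustration.

```java
import java.util.regex.Pattern;

class LookaroundDemo {
    // (?<![^:]) asserts the match is NOT preceded by a non-colon character,
    //           i.e. it starts at the beginning of the record or right after ':'.
    // (?![^:])  asserts the match is NOT followed by a non-colon character,
    //           i.e. it ends at the end of the record or right before ':'.
    // Both assertions test a position and consume no characters.
    static boolean hasWholeField(String record, String value) {
        Pattern p = Pattern.compile("(?<![^:])" + Pattern.quote(value) + "(?![^:])");
        return p.matcher(record).find();
    }

    public static void main(String[] args) {
        System.out.println(hasWholeField("NAME_1:LONDON:X", "NAME_1"));  // true
        System.out.println(hasWholeField("NAME_10:LONDON:X", "NAME_1")); // false
    }
}
```

Pattern.quote is used so that any regex metacharacters in the searched value are treated literally.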
|
Java Regular Expressions
|
 |
Nicholas Cheung
Ranch Hand
Joined: Nov 07, 2003
Posts: 4982
|
|
Hi Max, Assume the following is the record format: NAME:LOCATION:FIELD_1:FIELD_2:...:FIELD_N For example, we have the following records: In such a case, if we use the most generic search (checking whether the string "NAME_1" exists anywhere in a record), both records will be returned. However, I may only want to find records whose NAME is "NAME_1", so in fact only the 1st record should be returned. In addition, the NAME does not have a fixed length. So how can RE support this kind of search? Thanks Max. Nick
|
 |
Tarun Ramakrishna Elankath
Greenhorn
Joined: Mar 27, 2002
Posts: 27
|
|
Originally posted by Nicholas Cheung: Assume the following is the record format: NAME:LOCATION:FIELD_1:FIELD_2:...:FIELD_N For example, we have the following records:
If you happen to be certain that NAME always occurs as the first field, then why don't you use a (^) anchor in your regex, which will anchor the pattern to match only at the beginning of the string?
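A small sketch of the anchor idea, assuming NAME is always the first field. The sample records are hypothetical, as are the class and method names.

```java
import java.util.regex.Pattern;

class AnchorDemo {
    // ^ pins the match to the start of the record; (?::|$) requires the name
    // to be followed by the field separator or the end of the record, so
    // "NAME_1" does not match a record that starts with "NAME_10".
    static boolean nameIs(String record, String name) {
        Pattern p = Pattern.compile("^" + Pattern.quote(name) + "(?::|$)");
        return p.matcher(record).find();
    }

    public static void main(String[] args) {
        System.out.println(nameIs("NAME_1:LONDON:A", "NAME_1")); // true
        System.out.println(nameIs("NAME_2:NAME_1:B", "NAME_1")); // false
    }
}
```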
|
 |
Pradeep bhatt
Ranch Hand
Joined: Feb 27, 2002
Posts: 8876
|
|
|
Is Java RE based on Unix RE?
|
Groovy
|
 |
Jim Yingst
Wanderer
Sheriff
Joined: Jan 30, 2000
Posts: 18670
|
|
Yes, specifically it's very close to Perl (which incorporated most everything available in other Unix tools and added to it). Java's regex is a little different, as described in the Pattern API (see "Comparison to Perl 5"). Possessive quantifiers are the most useful feature added by Java; expect them to appear in future versions of Perl and other languages. [ April 15, 2004: Message edited by: Jim Yingst ]
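To illustrate what a possessive quantifier does, here is a small sketch (class and method names are made up). A possessive quantifier such as .*+ behaves like the greedy .* except that it never gives back characters it has consumed.

```java
class PossessiveDemo {
    // Greedy: .* first takes the whole input, then backtracks until "foo" fits.
    static boolean greedy(String s) {
        return s.matches(".*foo");
    }

    // Possessive: .*+ takes the whole input and refuses to backtrack,
    // so nothing is left for "foo" and the match fails.
    static boolean possessive(String s) {
        return s.matches(".*+foo");
    }

    public static void main(String[] args) {
        System.out.println(greedy("xfoo"));     // true
        System.out.println(possessive("xfoo")); // false
    }
}
```

The practical use is usually the reverse of this demonstration: when backtracking cannot possibly help, a possessive quantifier lets a non-matching input fail quickly.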
|
"I'm not back." - Bill Harding, Twister
|
 |
vasu maj
Ranch Hand
Joined: Jul 12, 2001
Posts: 393
|
|
Max, You are the author of the SCJD book! The chapter on threads in the book is the best on threads that I have read anywhere. Great work indeed. Thanks, Vasu
|
What a wonderful world!
|
 |
Nicholas Cheung
Ranch Hand
Joined: Nov 07, 2003
Posts: 4982
|
|
Hi Tarun, This is just an example. What I am really wondering is whether there is any convenient way to search inside a text file regardless of the field's position, i.e. if what I want to compare is the 3rd field, not the 1st, the ^ pattern will not be useful. I thought about this before, and in the end I used a scoring scheme to do the generic search in the SCJD assignment. However, I would really like to know how RE can achieve this. Nick
|
 |
Max Habibi
town drunk ( and author)
Sheriff
Joined: Jun 27, 2002
Posts: 4118
|
|
Originally posted by vasu maj: Max, You are the author of the SCJD book! The chapter on threads in the book is the best on threads that I have read anywhere. Great work  indeed. Thanks, Vasu
Thanks Vasu M
|
 |
Max Habibi
town drunk ( and author)
Sheriff
Joined: Jun 27, 2002
Posts: 4118
|
|
Hi Nick, There are probably 3 million ways to do this, but the following sounds like the sort of solution you wanted.
step 1. create a tmp.records file on your c: drive, consisting of the following
FIELD_R1_1:FIELD_R1_2:NEW_YORK:FIELD_R1_4:FIELD_R1_5:FIELD_R1_6:FIELD_R1_7:FIELD_R1_8
FIELD_R2_1:NEW_YORK:FIELD_R2_3:FIELD_R2_4:FIELD_R2_5:FIELD_R2_6:FIELD_R2_7:FIELD_R2_8
FIELD_R3_1:FIELD_R3_2:FIELD_R3_3:FIELD_R3_4:NEW_YORK:FIELD_R3_6:FIELD_R3_7:FIELD_R3_8
FIELD_R4_1:FIELD_R4_2:FIELD_R4_3:NEW_YORK:FIELD_R4_5:FIELD_R4_6:FIELD_R4_7:FIELD_R4_8
NEW_YORK:FIELD_R5_2:FIELD_R5_3:FIELD_R5_4:FIELD_R5_5:FIELD_R5_6:FIELD_R5_7:FIELD_R5_8
FIELD_R6_1:FIELD_R6_2:FIELD_R6_3:FIELD_R6_4:FIELD_R6_5:FIELD_R6_6:FIELD_R6_7:NEW_YORK
then the code...
HTH, M [ April 16, 2004: Message edited by: Max Habibi ]
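One way such a search could be written against the records above, offered as a rough sketch rather than Max's original code. It builds a pattern that skips a given number of fields before matching the value; the class name, method names, and the zero-based fieldIndex parameter are assumptions, and the records are held in memory instead of being read from tmp.records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class FieldRegexSearch {
    // The six sample records from the post, kept in memory for the sketch.
    static final List<String> RECORDS = Arrays.asList(
            "FIELD_R1_1:FIELD_R1_2:NEW_YORK:FIELD_R1_4:FIELD_R1_5:FIELD_R1_6:FIELD_R1_7:FIELD_R1_8",
            "FIELD_R2_1:NEW_YORK:FIELD_R2_3:FIELD_R2_4:FIELD_R2_5:FIELD_R2_6:FIELD_R2_7:FIELD_R2_8",
            "FIELD_R3_1:FIELD_R3_2:FIELD_R3_3:FIELD_R3_4:NEW_YORK:FIELD_R3_6:FIELD_R3_7:FIELD_R3_8",
            "FIELD_R4_1:FIELD_R4_2:FIELD_R4_3:NEW_YORK:FIELD_R4_5:FIELD_R4_6:FIELD_R4_7:FIELD_R4_8",
            "NEW_YORK:FIELD_R5_2:FIELD_R5_3:FIELD_R5_4:FIELD_R5_5:FIELD_R5_6:FIELD_R5_7:FIELD_R5_8",
            "FIELD_R6_1:FIELD_R6_2:FIELD_R6_3:FIELD_R6_4:FIELD_R6_5:FIELD_R6_6:FIELD_R6_7:NEW_YORK");

    // ^(?:[^:]*:){n} skips exactly n complete fields from the start of the line;
    // the value must then be followed by ':' or the end of the line.
    static Pattern fieldPattern(int fieldIndex, String value) {
        return Pattern.compile(
                "^(?:[^:]*:){" + fieldIndex + "}" + Pattern.quote(value) + "(?::|$)");
    }

    // Returns the lines whose field at the zero-based fieldIndex equals value.
    static List<String> matchingLines(List<String> lines, int fieldIndex, String value) {
        Pattern p = fieldPattern(fieldIndex, value);
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            if (p.matcher(line).find()) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // 3rd field (index 2): only the first record qualifies.
        System.out.println(matchingLines(RECORDS, 2, "NEW_YORK"));
    }
}
```

To search the actual file, the list of lines could be obtained with something like java.nio.file.Files.readAllLines and passed to matchingLines unchanged.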
|
 |
Nicholas Cheung
Ranch Hand
Joined: Nov 07, 2003
Posts: 4982
|
|
Thanks a lot, Max. Your example is exactly what I wanted to know. In fact, I feel Java RE is quite similar to Perl's RE syntax. Is that really the approach Sun is taking? BTW, as vasu said, your SCJD book is really great! I used your book as the basis of my SCJD assignment; however, I bought it at 70% off in a bookstore that was having a moving clearance. I hope you don't mind. In fact, as with Kathy and Bert, I hope you guys can write more books on the cert. areas, so that we can prepare for our exams more easily. Nick [ April 16, 2004: Message edited by: Max Habibi ]
|
 |
Max Habibi
town drunk ( and author)
Sheriff
Joined: Jun 27, 2002
Posts: 4118
|
|
Originally posted by Nicholas Cheung: Thanks a lot, Max. Your example is exactly what I want to know. In fact, I feel Java RE is quite similar to PERL's RE syntax. Does this really the way SUN is doing?
That's the way it seems to me too
Originally posted by Nicholas Cheung: BTW, as vasu said, your SCJD book is really great! I used your book as my basis of SCJD assignment, however, I bought your book with 70% off in a bookstore that was performing moving clearance. I hope you dont mind.
Are you kidding? I love the fact that people are finding the book useful
Originally posted by Nicholas Cheung: In fact, like Kathy and Berts, I hope you guys can write more books on the cert. areas, so that we can prepare for our exams more easily. Nick
Funny you should say that: stay tuned M
|
 |