Usage of Regular expressions

Nicholas Cheung
Ranch Hand

Joined: Nov 07, 2003
Posts: 4982
Hi Max,
I am wondering whether RE is really that useful for string matching.
While I was working on my SCJD, I read your SCJD book, in which you explain how RE can help in searching data records.
I have tried to adopt the method, but it seems to me that there are some limitations, or maybe I am too new to RE. The case I encountered is when a record (or a row of data) contains more than one data element, separated by delimiters, in the following format:
NAME:ADDRESS:TEL:ZIP:COUNTRY
If I use RE and, for example, the NAME in row 1 has the same value as the ADDRESS in row 2, then searching for that NAME may return both rows 1 and 2, because row 2 also contains that value.
I know this searching problem would not appear if the data were in a DB, but in such cases, can I tell RE which pattern (or location) to match?
Nick


SCJP 1.2, OCP 9i DBA, SCWCD 1.3, SCJP 1.4 (SAI), SCJD 1.4, SCWCD 1.4 (Beta), ICED (IBM 287, IBM 484, IBM 486), SCMAD 1.0 (Beta), SCBCD 1.3, ICSD (IBM 288), ICDBA (IBM 700, IBM 701), SCDJWS, ICSD (IBM 348), OCP 10g DBA (Beta), SCJP 5.0 (Beta), SCJA 1.0 (Beta), MCP(70-270), SCBCD 5.0 (Beta), SCJP 6.0, SCEA for JEE5 (in progress)
Jeroen Wenting
Ranch Hand

Joined: Oct 12, 2000
Posts: 5093
If you want to match only the part before the first separator (colon in your example) you could either create an RE that stops matching on reaching a colon or (which is what I'd probably do) use split(":") on the String and match only on the indexed item you are interested in.
String splitS[] = "NAME:ADDRESS:TEL:ZIP:COUNTRY".split(":") would yield a String array containing ["NAME","ADDRESS","TEL","ZIP","COUNTRY"] so if you want to match the name to a string you want you would just match splitS[0].
If you also wanted to match the telephone number separately you could do that by matching splitS[2] instead.
Your RE (which you may not need, you might be able to use contains() or equals() depending on what you're matching for) would be a lot cleaner and simpler this way.
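The split-based approach described above can be sketched as follows. This is a minimal illustration, assuming fields never contain a colon; the class name `SplitFieldSearch`, the helper `fieldEquals`, and the sample rows are hypothetical and not from the thread.

```java
class SplitFieldSearch {
    // Hypothetical helper: true when the field at fieldIndex equals value exactly.
    static boolean fieldEquals(String record, int fieldIndex, String value) {
        // A limit of -1 keeps trailing empty fields, so field indexes stay stable.
        String[] fields = record.split(":", -1);
        return fieldIndex >= 0
                && fieldIndex < fields.length
                && fields[fieldIndex].equals(value);
    }

    public static void main(String[] args) {
        // Made-up rows: row2 carries row1's NAME in its ADDRESS field.
        String row1 = "SMITH:1 MAIN ST:555-0100:10001:US";
        String row2 = "JONES:SMITH:555-0101:10002:US";
        System.out.println(fieldEquals(row1, 0, "SMITH")); // true
        System.out.println(fieldEquals(row2, 0, "SMITH")); // false
        System.out.println(fieldEquals(row2, 1, "SMITH")); // true
    }
}
```

Because only one field is compared, a value that happens to appear in another column no longer produces a false hit.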


42
Max Habibi
town drunk
( and author)
Sheriff

Joined: Jun 27, 2002
Posts: 4118
Originally posted by Nicholas Cheung:
Hi Max,
I know this searching problem would not appear if the data were in a DB, but in such cases, can I tell RE which pattern (or location) to match?
Nick

Hi Nick,
Glad to hear from you again. To answer your question: yes, there's a mechanism just for this. Actually, there are four, and they're under the general topic of lookarounds. They're not that difficult, but they're not completely trivial either, because they match position rather than existence. And yes, I go over them in the book.
If you provide a few example records, we can work backwards together, and try to see how they come to be. Deal?
All best,
M
[ April 14, 2004: Message edited by: Max Habibi ]

Java Regular Expressions
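The four lookarounds mentioned above are positive lookahead `(?=X)`, negative lookahead `(?!X)`, positive lookbehind `(?<=X)` and negative lookbehind `(?<!X)`. A minimal sketch of how two of them could be used to match a value only when it is a complete colon-delimited field is shown below. The class `LookaroundDemo`, the helper `hasWholeField`, and the sample records are hypothetical, and the sketch assumes single-line records.

```java
import java.util.regex.Pattern;

class LookaroundDemo {
    // Hypothetical helper: does the record contain value as a complete field?
    static boolean hasWholeField(String record, String value) {
        // (?<![^:]) asserts the match is NOT preceded by a non-colon character,
        //           i.e. it starts at the beginning of the record or right after a colon.
        // (?![^:])  asserts the match is NOT followed by a non-colon character,
        //           i.e. it ends at the end of the record or right before a colon.
        // Both assertions test a position and consume no characters.
        Pattern p = Pattern.compile("(?<![^:])" + Pattern.quote(value) + "(?![^:])");
        return p.matcher(record).find();
    }

    public static void main(String[] args) {
        System.out.println(hasWholeField("A:NEW_YORK:B", "NEW_YORK"));    // true
        System.out.println(hasWholeField("A:NEW_YORKER:B", "NEW_YORK")); // false
        System.out.println(hasWholeField("A:OLD_NEW_YORK", "NEW_YORK")); // false
    }
}
```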
Nicholas Cheung
Ranch Hand

Joined: Nov 07, 2003
Posts: 4982
Hi Max,
Assume the following is the record format:
NAME:LOCATION:FIELD_1:FIELD_2:...:FIELD_N
For example, we have the following records:

In such a case, if we use the most generic search (checking whether the string "NAME_1" exists anywhere in the record), both records will be returned.
However, I may only want to find the records whose NAME is "NAME_1", so in fact only the 1st record should be returned. In addition, note that the NAME does not have a fixed length.
Thus, how can RE support such searching?
Thanks Max.
Nick
Tarun Ramakrishna Elankath
Greenhorn

Joined: Mar 27, 2002
Posts: 27
Originally posted by Nicholas Cheung:
Assume the following is the record format:
NAME:LOCATION:FIELD_1:FIELD_2:...:FIELD_N
For example, we have the following records:

If you are certain that NAME always occurs as the first field, then why don't you use a (^) anchor in your regex, which will anchor the pattern so that it matches only at the beginning of the string?
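A minimal sketch of the anchor idea, assuming NAME is always the first field and is terminated by a colon. The class `AnchorDemo`, the helper `nameIs`, and the sample records are hypothetical stand-ins, since the records from the original post are not shown here.

```java
import java.util.regex.Pattern;

class AnchorDemo {
    // Hypothetical helper: true when the first field of the record is exactly name.
    static boolean nameIs(String record, String name) {
        // ^ pins the match to the start of the record; the trailing colon
        // stops NAME_1 from also matching a longer name such as NAME_10.
        Pattern p = Pattern.compile("^" + Pattern.quote(name) + ":");
        return p.matcher(record).find();
    }

    public static void main(String[] args) {
        System.out.println(nameIs("NAME_1:LONDON:X", "NAME_1"));  // true
        System.out.println(nameIs("NAME_2:NAME_1:X", "NAME_1"));  // false
        System.out.println(nameIs("NAME_10:LONDON:X", "NAME_1")); // false
    }
}
```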
Pradeep bhatt
Ranch Hand

Joined: Feb 27, 2002
Posts: 8898

Are Java's REs based on the Unix REs?


Groovy
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
Yes, specifically it's very close to Perl (which incorporated most everything available in other unix tools and added to it). Java's regex is a little different, as described in the Pattern API (see "Comparison to Perl 5"). Possessive quantifiers are the most useful feature added by Java - expect it to appear in future versions of Perl and other languages.
[ April 15, 2004: Message edited by: Jim Yingst ]
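As a small illustration of the possessive quantifiers mentioned above: a possessive quantifier such as `a*+` takes as many characters as it can and never gives any back, whereas the greedy `a*` backtracks when the rest of the pattern fails. The class and method names below are hypothetical.

```java
class PossessiveDemo {
    // Greedy: a* first takes every 'a', then gives one back so the final 'a' can match.
    static boolean greedy(String s) {
        return s.matches("a*a");
    }

    // Possessive: a*+ takes every 'a' and refuses to backtrack,
    // so nothing is left for the final 'a' and the match fails.
    static boolean possessive(String s) {
        return s.matches("a*+a");
    }

    public static void main(String[] args) {
        System.out.println(greedy("aaa"));     // true
        System.out.println(possessive("aaa")); // false
    }
}
```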

"I'm not back." - Bill Harding, Twister
vasu maj
Ranch Hand

Joined: Jul 12, 2001
Posts: 395
Max,
You are the author of the SCJD book! The chapter on threads in the book is the best on threads that I have read anywhere. Great work indeed.
Thanks,
Vasu


What a wonderful world!
Nicholas Cheung
Ranch Hand

Joined: Nov 07, 2003
Posts: 4982
Hi Tarun,
This is just an example. In fact, I am wondering whether there is any convenient way of searching inside a text file regardless of the position of the field.
i.e. if what I want to compare is the 3rd field, not the 1st field, the ^ pattern will not be useful.
I have thought about this before, and in the end I used a scoring scheme to do the generic search in the SCJD assignment. However, I would really like to know how RE can achieve this.
Nick
Max Habibi
town drunk
( and author)
Sheriff

Joined: Jun 27, 2002
Posts: 4118
Originally posted by vasu maj:
Max,
You are the author of the SCJD book! The chapter on threads in the book is the best on threads that I have read anywhere. Great work indeed.
Thanks,
Vasu

Thanks Vasu
M
Max Habibi
town drunk
( and author)
Sheriff

Joined: Jun 27, 2002
Posts: 4118
Hi Nick,
There are probably 3 million ways to do this, but the following sounds like the sort of solution you wanted.
step 1. create a tmp.records file on your c: drive, consisting of the following

FIELD_R1_1:FIELD_R1_2:NEW_YORK:FIELD_R1_4:FIELD_R1_5:FIELD_R1_6:FIELD_R1_7:FIELD_R1_8
FIELD_R2_1:NEW_YORK:FIELD_R2_3:FIELD_R2_4:FIELD_R2_5:FIELD_R2_6:FIELD_R2_7:FIELD_R2_8
FIELD_R3_1:FIELD_R3_2:FIELD_R3_3:FIELD_R3_4:NEW_YORK:FIELD_R3_6:FIELD_R3_7:FIELD_R3_8
FIELD_R4_1:FIELD_R4_2:FIELD_R4_3:NEW_YORK:FIELD_R4_5:FIELD_R4_6:FIELD_R4_7:FIELD_R4_8
NEW_YORK:FIELD_R5_2:FIELD_R5_3:FIELD_R5_4:FIELD_R5_5:FIELD_R5_6:FIELD_R5_7:FIELD_R5_8
FIELD_R6_1:FIELD_R6_2:FIELD_R6_3:FIELD_R6_4:FIELD_R6_5:FIELD_R6_6:FIELD_R6_7:NEW_YORK

then the code...

HTH,
M
[ April 16, 2004: Message edited by: Max Habibi ]
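For records like the ones listed above, one possible way to search a specific field with a regular expression is sketched below. It is an illustration only, not necessarily the code originally posted in this thread, and it assumes fields never contain a colon. The class `FieldRegexSearch` and its methods are hypothetical, and the records are passed as strings rather than read from the tmp.records file.

```java
import java.util.regex.Pattern;

class FieldRegexSearch {
    // Hypothetical helper: builds a pattern that matches value only in the
    // field at the given zero-based index.
    static Pattern fieldPattern(int fieldIndex, String value) {
        // ^(?:[^:]*:){n} skips exactly n complete fields from the start of the
        // record, because [^:]* can never run across a colon.
        // (?::|$) requires the value to end at a colon or at the end of the record.
        return Pattern.compile(
                "^(?:[^:]*:){" + fieldIndex + "}" + Pattern.quote(value) + "(?::|$)");
    }

    static boolean matchesField(String record, int fieldIndex, String value) {
        return fieldPattern(fieldIndex, value).matcher(record).find();
    }

    public static void main(String[] args) {
        String r1 = "FIELD_R1_1:FIELD_R1_2:NEW_YORK:FIELD_R1_4:FIELD_R1_5:FIELD_R1_6:FIELD_R1_7:FIELD_R1_8";
        String r2 = "FIELD_R2_1:NEW_YORK:FIELD_R2_3:FIELD_R2_4:FIELD_R2_5:FIELD_R2_6:FIELD_R2_7:FIELD_R2_8";
        // Search the 3rd field (index 2) for NEW_YORK.
        System.out.println(matchesField(r1, 2, "NEW_YORK")); // true
        System.out.println(matchesField(r2, 2, "NEW_YORK")); // false
    }
}
```

Compiling the pattern once per search and reusing it for every line would avoid recompiling it for each record.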
Nicholas Cheung
Ranch Hand

Joined: Nov 07, 2003
Posts: 4982
Thanks a lot, Max.
Your example is exactly what I want to know.
In fact, I feel Java RE is quite similar to PERL's RE syntax.
Is this really the approach Sun is taking?
BTW, as vasu said, your SCJD book is really great!
I used your book as my basis of SCJD assignment,
however, I bought your book with 70% off in a bookstore
that was performing moving clearance.
I hope you don't mind.
In fact, like Kathy and Bert, I hope you guys can write
more books on the cert. areas, so that we can prepare for
our exams more easily.
Nick
[ April 16, 2004: Message edited by: Max Habibi ]
Max Habibi
town drunk
( and author)
Sheriff

Joined: Jun 27, 2002
Posts: 4118
Originally posted by Nicholas Cheung:
Thanks a lot, Max.
Your example is exactly what I want to know.
In fact, I feel Java RE is quite similar to PERL's RE syntax.
Is this really the approach Sun is taking?

That's the way it seems to me too

BTW, as vasu said, your SCJD book is really great!
I used your book as my basis of SCJD assignment,
however, I bought your book with 70% off in a bookstore
that was performing moving clearance.
I hope you don't mind.

Are you kidding? I love the fact that people are finding the book useful

In fact, like Kathy and Bert, I hope you guys can write
more books on the cert. areas, so that we can prepare for
our exams more easily.
Nick

Funny you should say that: stay tuned
M