How to search for strings?

 
mark stone
Ranch Hand
Posts: 417
A text file contains strings; each string contains some letters (a-z).
I need to search for strings whose first four letters are a match.
How can I do this programmatically?
 
Dirk Schreckmann
Sheriff
Posts: 7023
What have you thought of and/or tried so far?
 
mark stone
Ranch Hand
Posts: 417
Originally posted by Dirk Schreckmann:
What have you thought of and/or tried so far?

I would construct arrays out of each string and then compare the arrays.
But I need to know how to read the strings from the text file.
 
Abdullah Javid
Greenhorn
Posts: 12
Your question is not entirely clear. As I understand it, you have a String object and want to check whether it starts with some string (a group of characters).
To do this, you can use the startsWith() method defined in the String class, which has the following signature:
public boolean startsWith(String prefix)

As indicated above, it would return either true or false.
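A minimal sketch of that check, assuming the words are already in memory and the prefix to match is the made-up value "abcd". The class and method names are hypothetical; only String.startsWith() comes from the Java API.

```java
class StartsWithDemo {
    // Returns true when the word begins with the given prefix.
    static boolean hasPrefix(String word, String prefix) {
        return word.startsWith(prefix);
    }

    public static void main(String[] args) {
        String prefix = "abcd"; // hypothetical four-letter prefix
        String[] words = { "abcdef", "abcx", "abcd", "zabcd" };
        for (String word : words) {
            System.out.println(word + " -> " + hasPrefix(word, prefix));
        }
    }
}
```

Note that a word shorter than the prefix never matches, and a prefix appearing later in the word (as in "zabcd") does not count.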
Sincerely,
Abdullah.
 
Frank Carver
Sheriff
Posts: 6920
Originally posted by mark stone:
I would construct arrays out of each string and then compare the arrays.

There's a much simpler way of doing just what you want. Looking it up is a fine exercise in using the published Java APIs. Check out http://java.sun.com/j2se/1.3/docs/api/ .
 
Dirk Schreckmann
Sheriff
Posts: 7023
Let's clarify the situation a bit.
Do your text files contain text or do you have files that actually contain String objects and were probably created using an ObjectOutputStream?
If you have text files that contain text, then how do you want the text to break up into Strings? Do you want a String to be created by text in the file that is surrounded by whitespace? Do you want a String to be a line of text in the file? Do you want a String to be the entire contents of the file?
 
mark stone
Ranch Hand
Posts: 417
(I have a text file that contains text.)
Do you want a String to be created by text in the file that is surrounded by whitespace ?
YES
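For readers following along, here is one possible sketch of reading whitespace-separated words from a character source. It assumes a current JDK (generics, try-with-resources) rather than the 1.3 API linked earlier in the thread, and the class and method names are hypothetical. The demo reads from an in-memory StringReader so that it runs without any file present.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class WordReader {
    // Reads every whitespace-separated word from the given character source.
    static List<String> readWords(Reader source) {
        List<String> words = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                // With no delimiter argument, StringTokenizer splits on whitespace.
                StringTokenizer tokens = new StringTokenizer(line);
                while (tokens.hasMoreTokens()) {
                    words.add(tokens.nextToken());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    public static void main(String[] args) {
        // Sample text standing in for the contents of a file.
        Reader sample = new StringReader("abcdef abcx\n  wxyz\tabcd\n");
        System.out.println(readWords(sample));
    }
}
```

To read an actual file, one could pass a java.io.FileReader for the file in place of the StringReader; the file name would be whatever your text file is called.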
 
Dirk Schreckmann
Sheriff
Posts: 7023
Well, why didn't you just say so?
Is I/O in Java new for you? If so, I'd recommend that you take a look at The I/O: Reading and Writing (but no 'rithmetic) Lesson of Sun's Java Tutorial.
Or are you already comfortable reading in characters from a file and you're just trying to figure out a way to break the incoming characters into words (Strings)?
Or is the I/O part not really the issue and are you just asking how to match a certain word based on the first four characters, as you mentioned?
[ August 30, 2002: Message edited by: Dirk Schreckmann ]
 
mark stone
Ranch Hand
Posts: 417
Dirk,
With due respect to your status as bartender for this forum:
When I posted this question (back in July), your first reaction was "What have you thought of..."
Since then, I see that you have started your regular expressions stuff. Indeed, the question I posted then was related to regular expressions (which are now supported in JDK 1.4)!
Hopefully, when questions like this come up now, they will be recognised as being related to regular expressions.
Thank you,
Mark
Originally posted by Dirk Schreckmann:
What have you thought of and/or tried so far?
 
John Dale
Ranch Hand
Posts: 399
Wow.
 
Dirk Schreckmann
Sheriff
Posts: 7023
So... Are you now asking how to harness the power of regular expressions in order to solve the problem? Did you already figure out a solution? If so, why not share what you learned with anybody who happens by this thread and tell us about your solution?
 
Thomas Paul
mister krabs
Ranch Hand
Posts: 13974
I would have done it without regular expressions.
 
Dirk Schreckmann
Sheriff
Posts: 7023
Thomas,
I'm curious as to what approach you'd take.
 
mark stone
Ranch Hand
Posts: 417
Dirk,
It's pretty clear what I meant.

thank you
Originally posted by Dirk Schreckmann:
So... Are you now asking how to harness the power of regular expressions in order to solve the problem? Did you already figure out a solution? If so, why not share what you learned with anybody who happens by this thread and tell us about your solution?
 
Thomas Paul
mister krabs
Ranch Hand
Posts: 13974
Originally posted by Dirk Schreckmann:
Thomas,
I'm curious as to what approach you'd take.

I would have gone with StringTokenizer.
 
Dirk Schreckmann
Sheriff
Posts: 7023
OK, StringTokenizer to break apart the input into words. Then to match the first four characters of each word, if we're dealing with only one character sequence, String's startsWith( String ) method would probably work well. But what if we have a set of ten different four character sequences that a word is allowed to begin with? I could imagine a straightforward regular expression to use, and I could imagine checking some data structure (such as a List or Map) that contained the set of character sequences allowed against the first four characters of each word, but what would you do? I think that the simple answer is to do whatever you'd prefer to do, then if a performance issue arises, profile and test, but I'm still curious as to how an experienced engineer like Thomas might approach the problem.
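As a rough illustration of the data-structure idea, here is a sketch that checks the first four characters of each word against a Set. The three prefixes are invented placeholders (the thread never names the real ones), and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class PrefixSetDemo {
    // Hypothetical set of allowed four-character prefixes.
    static final Set<String> ALLOWED =
            new HashSet<>(Arrays.asList("abcd", "java", "tree"));

    // True when the word has at least four characters and its
    // first four characters are one of the allowed prefixes.
    static boolean startsWithAllowedPrefix(String word) {
        return word.length() >= 4 && ALLOWED.contains(word.substring(0, 4));
    }

    public static void main(String[] args) {
        String[] words = { "javadoc", "trees", "abc", "xjava" };
        for (String word : words) {
            System.out.println(word + " -> " + startsWithAllowedPrefix(word));
        }
    }
}
```

A hash-based lookup keeps the per-word cost roughly constant no matter how many prefixes are allowed, which is one reason to consider it before reaching for a loop of startsWith() calls.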
[ August 30, 2002: Message edited by: Dirk Schreckmann ]
 
Ilja Preuss
author
Sheriff
Posts: 14112
Originally posted by mark stone:
Dirk,
It's pretty clear what I meant.

No, it's far from clear to me (and I think we can trust Dirk when he says that it isn't clear to him). For me, it is even far from clear whether regular expressions really are an obvious solution to your problem, as even your problem isn't perfectly clear.
If you want others to be more supportive, it could help to be more detailed when asking questions.
 
Thomas Paul
mister krabs
Ranch Hand
Posts: 13974
Given more complex requirements than originally described, I would have gone with regular expressions. With the single-match requirement, startsWith() works well and avoids having to rely on a particular version of Java or an outside package.
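For comparison, a sketch of the regular expression route using java.util.regex, which requires JDK 1.4 or later. The alternation of three prefixes is an invented example, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class PrefixRegexDemo {
    // Hypothetical alternation of allowed four-letter prefixes.
    static final Pattern PREFIX = Pattern.compile("(?:abcd|java|tree)");

    // lookingAt() anchors the match at the start of the input
    // but does not require the whole word to match.
    static boolean matchesPrefix(String word) {
        return PREFIX.matcher(word).lookingAt();
    }

    public static void main(String[] args) {
        String[] words = { "javadoc", "trees", "abc", "xjava" };
        for (String word : words) {
            System.out.println(word + " -> " + matchesPrefix(word));
        }
    }
}
```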
 
Marilyn de Queiroz
Sheriff
Posts: 9059
Originally posted by mark stone:
A text file contains strings; each string contains some letters (a-z).
I need to search for strings whose first four letters are a match.

If you just want to search for Strings whose first four letters are letters from a to z as your original question suggests, you could use String.regionMatches(), you could use String.compareTo(), you could use Character.isLetter()... I don't see that your original question obviously refers to regular expressions.
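Two of those options can be sketched as follows, under two different readings of the question: either two words must share their first four characters, or the first four characters must all be letters. The class and method names are hypothetical.

```java
class RegionMatchDemo {
    // True when both words have at least four characters and
    // their first four characters are identical.
    static boolean sameFirstFour(String a, String b) {
        return a.regionMatches(0, b, 0, 4);
    }

    // True when the word has at least four characters and the first
    // four are letters. Character.isLetter() accepts any Unicode
    // letter, not only a-z.
    static boolean firstFourAreLetters(String word) {
        if (word.length() < 4) {
            return false;
        }
        for (int i = 0; i < 4; i++) {
            if (!Character.isLetter(word.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sameFirstFour("abcdef", "abcdxyz"));
        System.out.println(firstFourAreLetters("ab1d"));
    }
}
```

regionMatches() returns false rather than throwing when either string is too short for the requested region, so no separate length check is needed in sameFirstFour().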

"What have you thought of and/or tried so far?" is a pretty standard question to try to get more information to help you solve the problem.
 
John Dale
Ranch Hand
Posts: 399
In the responses to the question and its clarification, I think I see as many different interpretations of the task as I see respondents, maybe more (not to mention my own). Little is clear except the need for clarification.
 
Guy Allard
Ranch Hand
Posts: 776
I'd use REs in any language or environment that supports them.
StringTokenizer should be used with a lot of caution - my personal opinion is that it is brain dead.
This code:
import java.util.StringTokenizer;

String s = "a,,c";
StringTokenizer t = new StringTokenizer(s, ",");
int i = 0;
// Print each token together with its running count.
while (t.hasMoreTokens()) {
    ++i;
    System.out.println("Next: " + i + " " + t.nextToken());
}
does not yield what one would want or expect.
Guy
 
Thomas Paul
mister krabs
Ranch Hand
Posts: 13974
Originally posted by Guy Allard:
does not yield what one would want or expect.

I would expect it to yield exactly two tokens, and that is what it does. If I wanted to know about the two delimiters next to each other, I would use the form of the constructor that takes a boolean as the final parameter. In the case as defined here, StringTokenizer works perfectly, since multiple delimiters (blank spaces) would be ignored.
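A small sketch of the difference, using the three-argument StringTokenizer constructor whose final boolean is returnDelims. The helper class and method names are hypothetical. String.split(), available from JDK 1.4, is included only for comparison because it keeps the empty field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerDelimsDemo {
    // Collects every token the tokenizer produces for the given settings.
    static List<String> tokens(String text, String delims, boolean returnDelims) {
        List<String> result = new ArrayList<>();
        StringTokenizer t = new StringTokenizer(text, delims, returnDelims);
        while (t.hasMoreTokens()) {
            result.add(t.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        // Delimiters skipped: two tokens, "a" and "c".
        System.out.println(tokens("a,,c", ",", false));
        // Delimiters returned: four tokens, "a", ",", "," and "c".
        System.out.println(tokens("a,,c", ",", true));
        // split() yields three fields, the middle one an empty string.
        System.out.println(Arrays.asList("a,,c".split(",")));
    }
}
```

With returnDelims set to true, each delimiter character comes back as its own one-character token, so two delimiters in a row reveal the empty field between them.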
[ August 31, 2002: Message edited by: Thomas Paul ]
 