Regex help

Gary Goldsmith
Ranch Hand

Joined: Mar 06, 2007
Posts: 30
I'm trying to print out the 1st word from the 1st line of a text file, the 2nd word from the 2nd line, and so on. If the nth line has fewer than n words, then the last word on that line should be printed instead.

My code so far is this:



To do what I want, I know I should use the split method of the String class, but I'm unsure how to write the regex for it.
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18108

Gary Goldsmith wrote: "To do what I want, I know I should use the split method found in the string class, but I'm unsure how to write the regex to do what I want to do."


The regex passed to the split() method simply defines the delimiter that is used to separate the words. For example...

If all the words are separated by exactly one space, then you can use " ".

If all the words are separated by one or more spaces, then you can use " +".

If all the words are separated by exactly one whitespace character of any kind (space, tab, and so on), then you can use "\\s". The backslash is doubled because it sits inside a Java string literal.

If all the words are separated by one or more whitespace characters of any kind, then you can use "\\s+".

Etc... well, you get the point.
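
As a rough illustration of the four delimiters above, here is a minimal sketch. The class name SplitDelimiters, the helper words(), and the sample line are made up for this example and are not from the original poster's code.

```java
import java.util.Arrays;

// Hypothetical demo class: compares the four delimiter regexes on one sample line.
class SplitDelimiters {
    // Splits a line on the given regex delimiter and returns the pieces.
    static String[] words(String line, String delimiterRegex) {
        return line.split(delimiterRegex);
    }

    public static void main(String[] args) {
        // Sample line: two spaces after "the", a tab after "quick", one space after "brown".
        String line = "the  quick\tbrown fox";

        // " "    : 4 pieces, one of them empty, and the tab is not treated as a delimiter
        System.out.println(Arrays.toString(words(line, " ")));
        // " +"   : 3 pieces, the tab is still not treated as a delimiter
        System.out.println(Arrays.toString(words(line, " +")));
        // "\\s"  : 5 pieces, one of them empty (between the two spaces)
        System.out.println(Arrays.toString(words(line, "\\s")));
        // "\\s+" : 4 pieces, exactly the four words
        System.out.println(Arrays.toString(words(line, "\\s+")));
    }
}
```

For text where the spacing is not strictly controlled, "\\s+" is usually the safest of the four.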

Henry


Ulf Dittmer
Marshal

Joined: Mar 22, 2005
Posts: 39547
You may not have to use regexps at all. Depending on how the "words" of a line are separated from each other, a StringTokenizer may do. But then, the regexp passed to String.split doesn't have to be complicated either: e.g., a regexp of " " will split at each space character.
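
A minimal sketch of the StringTokenizer route, assuming the words are separated by ordinary whitespace. The class name TokenizerWords and the method words() are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper: collects the words of a line without any regex.
class TokenizerWords {
    static List<String> words(String line) {
        List<String> result = new ArrayList<String>();
        // With no delimiter argument, StringTokenizer splits on space, tab,
        // newline, carriage return, and form feed, and skips runs of them.
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            result.add(tokens.nextToken());
        }
        return result;
    }
}
```

Unlike split(" "), this never produces empty strings when two delimiters sit next to each other.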


Gary Goldsmith
Ok I'm getting somewhere:



This works fine up to a point. When a line has, for example, only 3 words but num is greater than 3, I get the "File input error" message. How do I print out the last word of that line instead?
Ulf Dittmer
Well, as it is, the code expects that line [n] has at least [n+1] words. You can get the length of the returned array through "line.split(...).length".
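
A minimal sketch of checking the array length before indexing, assuming whitespace-separated words. WordCount, countWords(), and hasWord() are hypothetical names, not taken from the code under discussion.

```java
// Hypothetical helper: finds out how many words a line has before asking for one.
class WordCount {
    // Returns how many words the line holds.
    static int countWords(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            // "".split(...) returns an array holding one empty string,
            // so blank lines are handled separately.
            return 0;
        }
        return trimmed.split("\\s+").length;
    }

    // True when word number n (counting from 1) exists on the line.
    static boolean hasWord(String line, int n) {
        return n >= 1 && n <= countWords(line);
    }
}
```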
Gary Goldsmith
Sorry, but this has me stumped. I've tried the following:


I've tried other if statements but still get "File input error".
Ulf Dittmer
I don't really understand what the code is supposed to do. E.g., "num" seems to be the same as "i" since it is incremented for each line - is that by design?

In general, the code should proceed until there are no more lines in the file, so you might have code like:
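
A minimal sketch of such a loop, assuming the file is read through a BufferedReader. The class name ReadAllLines and the method readLines() are made up for illustration; for a real file, the Reader passed in would typically be a FileReader.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads lines until the source is exhausted.
class ReadAllLines {
    static List<String> readLines(Reader source) throws IOException {
        List<String> lines = new ArrayList<String>();
        BufferedReader reader = new BufferedReader(source);
        String line;
        // readLine() returns null once there are no more lines.
        while ((line = reader.readLine()) != null) {
            lines.add(line);
        }
        return lines;
    }
}
```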

[ October 13, 2007: Message edited by: Ulf Dittmer ]
Gary Goldsmith


numOfWords is the number of lines the code should go through, so only the first 20 lines.

"num" is the line number. The idea of the code is to print out the 1st word from the 1st line of a text file, the 2nd word from the 2nd line, and so on. I increment num on each loop, which gives me the next line, and num is then also used as the word position counted from the start of the line: if num is 3, then it's line 3 and word 3 of that line. The problem is that if line 6 has only 3 words, reading the 6th word gives an error, and I'm trying to get it to print the last word of the line instead.
Gary Goldsmith
I've worked it out now. The problem was that I was using the number of words on the line as the index of the last word, forgetting that indexes start at 0. If there are 5 words on a line, they are actually numbered 0, 1, 2, 3, 4, and I was asking for word number 5, which doesn't exist! Thank you for your help.
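
The off-by-one described here can be sketched roughly as follows. This is not the actual fix from the thread. NthOrLastWord and pick() are hypothetical names, and the sketch assumes the array is non-empty and n is at least 1.

```java
// Hypothetical helper: picks word n (counting from 1), or the last word
// when the array holds fewer than n words.
class NthOrLastWord {
    static String pick(String[] words, int n) {
        // Valid indexes run from 0 to words.length - 1, so the last word
        // sits at words.length - 1, not at words.length.
        int index = Math.min(n, words.length) - 1;
        return words[index];
    }
}
```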
Henry Wong

Your code is incredibly inefficient. You execute a split(), to get an array result, just so that you can find out how many fields there are. Then you run a split() again, to get another array result, for each index of the array.

Wouldn't it be better to call split() once, store the result array, and then iterate that result set?
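
A minimal sketch of that approach, assuming whitespace-separated words. The class name SplitOnce and the method describe() are invented for illustration.

```java
// Hypothetical helper: splits a line once, keeps the array, then walks it.
class SplitOnce {
    // Returns the words of the line labelled with their indexes, e.g. "0=red|1=green".
    static String describe(String line) {
        String[] words = line.trim().split("\\s+"); // the only split() call
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < words.length; i++) {
            if (i > 0) {
                out.append('|');
            }
            // Reuses the stored array instead of splitting again per index.
            out.append(i).append('=').append(words[i]);
        }
        return out.toString();
    }
}
```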

Henry
Jim Yingst
Wanderer
Sheriff

Joined: Jan 30, 2000
Posts: 18671
I think this here is a big problem:

The problem is that "File input error" is not informative about what sort of error has occurred. You've lost important information here: the name of the Exception, the error message, and the stack trace. All three of these can be printed easily with the following:
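
A minimal sketch of the idea, assuming a catch block that previously printed only a fixed message. ReportErrors and wordOrNull() are hypothetical names, and catching RuntimeException is done here only to keep the example short.

```java
// Hypothetical helper: reports the real exception instead of a fixed message.
class ReportErrors {
    // Returns word number n (counting from 1), or null after reporting what went wrong.
    static String wordOrNull(String line, int n) {
        try {
            return line.split("\\s+")[n - 1];
        } catch (RuntimeException e) {
            // Prints the exception class, its message, and the stack trace
            // to System.err, rather than a text such as "File input error".
            e.printStackTrace();
            return null;
        }
    }
}
```

Calling wordOrNull("a b c", 6) would report an ArrayIndexOutOfBoundsException along with the line of code that caused it.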

In the future, debugging will be much easier with this sort of info at hand.


"I'm not back." - Bill Harding, Twister
 