Regular expressions and the split() method

Sidharth Khattri
Ranch Hand

Joined: Sep 16, 2013
Posts: 121

I thought I could ask here instead of making a new thread. I'll make a new thread if I get no response :/
I still don't understand the concept.

I wrote this little program:
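The listing itself is not shown in this copy of the thread. What follows is only a minimal sketch that would be consistent with the outputs quoted below, assuming the program splits its first argument using its second argument as the regex. The `render` helper and the exact formatting are guesses, not the poster's code.

```java
// Hypothetical reconstruction: the original LogSplitter listing is not shown in the post.
class LogSplitter {
    // Wraps each token in '>' and '<', then puts the token count on the next line.
    static String render(String input, String regex) {
        String[] tokens = input.split(regex);
        StringBuilder sb = new StringBuilder();
        for (String token : tokens) {
            sb.append('>').append(token).append('<');
        }
        sb.append('\n').append(tokens.length);
        return sb.toString();
    }

    public static void main(String[] args) {
        // args[0] is the text to split, args[1] is the regex used as the delimiter.
        System.out.println(render(args[0], args[1]));
    }
}
```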


Here are a few of the outputs from these command-line invocations:
1) With java LogSplitter "a" "\w"
Output:

0

2) With java LogSplitter "a " "\w"
Output:
><> <
2

Now, why does the second invocation return an empty token between -1 and 0, along with the space following a in "a ",
while the first invocation doesn't return an empty token between -1 and 0?

3) Although with java LogSplitter "a" "\d"
Output is:
>a<
1
Why does it return the token >a< even when there's no digit in the string, when the first invocation returned 0 tokens?

4) With java LogSplitter "" "\w"
Output:
><
1
why does it return an empty string when there's nothing in the string?

5) With java LogSplitter " a" "\s"
Output:
><><>a<
3
What's up when using "\s"?

WHAT IS THE LOGIC BEHIND SPLIT?


OCPJP 6 - 96%
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18523

Sidharth Khattri wrote:I thought I could ask here instead of making a new thread. I'll make a new thread if I get no response :/
I still don't understand the concept.


Yeah, let's move this to a new topic instead of confusing the other topic. Also, you no longer have to wait for "no response".

Henry


Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
Sidharth Khattri


Thank you for moving this to a new thread. I never wanted to confuse anyone, though.
I would love to get a response.
Henry Wong

Sidharth Khattri wrote:
1) With java LogSplitter "a" "\w"
Output:

0


The regex "\w" matches a single word character, so when it is passed to split(), each word character acts as a delimiter. With the string "a", the letter "a" is the delimiter, yielding two components that are both zero-length strings: the text before the "a" and the text after it.

However, the version of split() that takes only the regex removes all trailing zero-length components. Here both components are zero-length and trailing, so nothing is left after the split, and the resulting array has length 0.
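To see the trailing-empty-string rule in isolation, here is a small sketch (the class name is made up for illustration). It compares the one-argument split() with the two-argument overload called with a negative limit, which keeps trailing empty strings.

```java
import java.util.Arrays;

// Illustrative class name; not part of the original post.
class TrailingEmptyDemo {
    public static void main(String[] args) {
        // One-argument split: trailing empty strings are discarded.
        String[] trimmed = "a".split("\\w");
        System.out.println(trimmed.length);          // 0

        // Negative limit: the empty strings on both sides of the "a" are kept.
        String[] kept = "a".split("\\w", -1);
        System.out.println(Arrays.toString(kept));   // [, ]
        System.out.println(kept.length);             // 2
    }
}
```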

Henry
Henry Wong

Sidharth Khattri wrote:
2) With java LogSplitter "a " "\w"
Output:
><> <
2

Now, why does the second invocation return an empty token between -1 and 0 along with space following a in "a "
and the first invocation doesn't return an empty invocation between -1 and 0?


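The regex "\w" matches a single word character, so each word character acts as a delimiter when it is passed to split(). With the string "a " (an "a" followed by a space), the letter "a" is the delimiter, yielding two components. The first is a zero-length string, and the second is a single space. The empty string is kept this time because it is not trailing: a non-empty component follows it.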
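A short sketch of this case (class name made up for illustration), printing the tokens in the same >token< style as the question:

```java
// Illustrative class name; not part of the original post.
class LeadingEmptyDemo {
    public static void main(String[] args) {
        // "a" matches \w at index 0, so the text before it is the empty string;
        // the text after it is the single space.
        String[] parts = "a ".split("\\w");
        System.out.println(">" + String.join("<>", parts) + "<");  // ><> <
        System.out.println(parts.length);                          // 2
    }
}
```

Only trailing empty strings are dropped, so the leading one survives here because the space comes after it.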

Henry
Henry Wong

Sidharth Khattri wrote:
3) Although with java LogSplitter "a" "\d"
Output is:
>a<
1
why does it return the token >a< even when there's no digit in the string? And it returned 0 in the first invocation?


The regex "\d" matches a single digit, so each numeric digit acts as a delimiter when it is passed to split(). With the string "a", there are *no* matches, hence no delimiters and nothing to split. The result is a one-element array containing the original string, with no splitting done.
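A small sketch of the no-match case (class name made up for illustration). It also runs invocation 4 from the question, which appears to follow the same rule: an empty input contains no word character, so the input comes back unchanged as a single empty token.

```java
// Illustrative class name; not part of the original post.
class NoMatchDemo {
    public static void main(String[] args) {
        // No digit in "a": nothing to split on, so the whole input is the only token.
        String[] noDigit = "a".split("\\d");
        System.out.println(noDigit.length + " >" + noDigit[0] + "<");  // 1 >a<

        // No word character in "": the empty input is returned as one empty token.
        String[] empty = "".split("\\w");
        System.out.println(empty.length + " >" + empty[0] + "<");      // 1 ><
    }
}
```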

Henry
Sidharth Khattri

Thanks Henry, I finally got it
 