Regular expressions and the split() method

 
Sidharth Khattri
Ranch Hand
I thought I could ask here instead of making a new thread. I'll make a new thread if I get no response :/
I still don't understand the concept.

I wrote this little program:


Here are a few of the outputs from different command-line invocations:
1) With java LogSplitter "a" "\w"
Output:

0

2) With java LogSplitter "a " "\w"
Output:
><> <
2

Now, why does the second invocation return an empty token between -1 and 0, along with the space following a in "a ",
when the first invocation doesn't return an empty token between -1 and 0?

3) Although with java LogSplitter "a" "\d"
Output is:
>a<
1
Why does it return the token >a< even though there's no digit in the string? And why did the first invocation return 0?

4) With java LogSplitter "" "\w"
Output:
><
1
Why does it return an empty string when there's nothing in the string?

5) With java LogSplitter " a" "\s"
Output:
><><>a<
3
What's up when using "\s"?

WHAT IS THE LOGIC BEHIND SPLIT?
 
Henry Wong
author

Sidharth Khattri wrote:I thought I could ask here instead of making a new thread. I'll make a new thread if I get no response :/
I still don't understand the concept.



Yeah, let's move this to a new topic instead of confusing the other topic. Also, you no longer have to wait for "no response".

Henry
 
Sidharth Khattri
Ranch Hand

Thank you for moving this to a new thread. I never wanted to confuse anyone though.
Would love to get a response
 
Henry Wong
author

Sidharth Khattri wrote:
1) With java LogSplitter "a" "\w"
Output:

0



The regex "\w" matches a single word character, so each word character acts as a delimiter when used with split(). With the string "a", the letter "a" is the delimiter, which yields two components, both of them zero-length strings.

However, the version of split() that takes only the regex (the delimiter) removes all trailing zero-length parts. Here both components are trailing empty strings, so nothing is left after the split and the count is 0.
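
To make the trailing-empty rule concrete, here is a minimal sketch. The class and method names are only illustrative (they are not from the original LogSplitter program); it compares split(regex) with split(regex, -1), where the negative limit keeps trailing empty strings.

```java
import java.util.Arrays;

// Illustrative names only; this is not the LogSplitter program from the question.
public class TrailingEmptyDemo {

    // One-argument split: behaves like a limit of 0, so trailing empty strings are dropped.
    static String[] splitDefault(String input, String regex) {
        return input.split(regex);
    }

    // Negative limit: the pattern is applied as often as possible and nothing is dropped.
    static String[] splitKeepAll(String input, String regex) {
        return input.split(regex, -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitDefault("a", "\\w"))); // prints []
        System.out.println(splitDefault("a", "\\w").length);           // prints 0
        System.out.println(splitKeepAll("a", "\\w").length);           // prints 2
    }
}
```

With the negative limit, the two zero-length components around the "a" delimiter become visible.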

Henry
 
Henry Wong
author

Sidharth Khattri wrote:
2) With java LogSplitter "a " "\w"
Output:
><> <
2

Now, why does the second invocation return an empty token between -1 and 0, along with the space following a in "a ",
when the first invocation doesn't return an empty token between -1 and 0?



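The regex "\w" matches a single word character, so each word character acts as a delimiter when used with split(). With the string "a " (an "a" followed by a space), the letter "a" is the delimiter, which yields two components. The first is a zero-length string, and the second is a single space. Only trailing zero-length parts are removed, so the leading empty string stays because a non-empty part follows it.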
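
A small sketch of this case follows. It assumes the original program printed each token between > and <, as the quoted output suggests; the class and the render helper are hypothetical names used only for illustration.

```java
// Illustrative names only; render() mimics the >token< output format seen in the question.
public class LeadingEmptyDemo {

    // Wraps each token in > and < so that empty tokens are visible as "><".
    static String render(String[] tokens) {
        StringBuilder sb = new StringBuilder();
        for (String token : tokens) {
            sb.append('>').append(token).append('<');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] tokens = "a ".split("\\w");
        System.out.println(render(tokens)); // prints ><> <
        System.out.println(tokens.length);  // prints 2
    }
}
```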

Henry
 
Henry Wong
author

Sidharth Khattri wrote:
3) Although with java LogSplitter "a" "\d"
Output is:
>a<
1
Why does it return the token >a< even though there's no digit in the string? And why did the first invocation return 0?



The regex "\d" matches a single digit, so each numeric digit acts as a delimiter when used with split(). With the string "a", there are *no* matches, hence no delimiters and nothing to split. The result is a one-element array holding the original string, with no splitting done.
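
The no-match rule can be checked with a short sketch like the one below (the class and method names are illustrative). The same rule would also seem to explain the earlier empty-string case: with "" there is nothing for "\w" to match, so the one-element result is the original empty string.

```java
import java.util.Arrays;

// Illustrative names only; shows what split() returns when the regex never matches.
public class NoMatchDemo {

    // Thin wrapper around the one-argument String.split(regex).
    static String[] split(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        // No digit in "a", so the array holds just the original string.
        System.out.println(Arrays.toString(split("a", "\\d"))); // prints [a]

        // No word character in "", so the array holds just the empty string.
        System.out.println(split("", "\\w").length);            // prints 1
    }
}
```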

Henry
 
Sidharth Khattri
Ranch Hand
Thanks Henry, I finally got it