Question regex

Abdul Mohsin
Ranch Hand

Joined: Apr 26, 2007
Posts: 111

Hi,

String str = "apples";
String s[] = str.split("");
for (String i:s)
System.out.println(">" + i + "<");

output:
>< // line 1 not clear
>a<
>p<
>p<
>l<
>e<
>s<


Everything in the output is logical except line 1; I am not able to understand why it appears. Please explain.

Regards,

Abdul Mohsin


Raghavan Muthu
Ranch Hand

Joined: Apr 20, 2006
Posts: 3344

Hi Abdul Mohsin,

That has to do with the limit that split uses, which defaults to zero when you do not supply that parameter. There are actually two versions of the split method:

  • One that takes a {String regex} argument
  • One that takes {String regex, int limit} arguments.

    If you invoke the first one, it in turn invokes the second one, passing 0 for the limit.

    The result varies depending on the limit you supply. Look at the following modified program. I used the 1.4 version for the demonstration.


    When run, the output is as follows.


    splitting with default limit -> 0
    =================================
    >< //extra line because of the default limit "0"
    >a<
    >p<
    >p<
    >l<
    >e<
    >s<

    splitting with limit -> 1
    =================================
    >apples<

    splitting with limit -> -1
    =================================
    >< //extra line 1
    >a<
    >p<
    >p<
    >l<
    >e<
    >s<
    ><//extra line 2



    Look at the 1st and 3rd outputs. The output differs between the default (which in turn supplies 0) and -1: only -1 keeps the trailing empty string. I think it is because of how the method is implemented internally.
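The modified program itself is not preserved in this thread. A minimal sketch of what such a demonstration might look like is given here; the class name SplitLimitDemo and the helpers splitApples and show are made up for illustration. One caveat, as far as I know: the leading >< in the output above comes from the Java versions of that time. Since Java 8 a zero-width match at the very start of the input no longer produces a leading empty string, so a current JDK should print one line fewer for limits 0 and -1.

```java
public class SplitLimitDemo {

    // Splits "apples" on the empty pattern with the given limit.
    static String[] splitApples(int limit) {
        return "apples".split("", limit);
    }

    // Prints each element between > and < so that empty strings are visible.
    static void show(int limit) {
        System.out.println("splitting with limit -> " + limit);
        for (String part : splitApples(limit)) {
            System.out.println(">" + part + "<");
        }
        System.out.println();
    }

    public static void main(String[] args) {
        show(0);   // same as "apples".split("")
        show(1);   // pattern applied zero times, so the whole string comes back
        show(-1);  // the trailing empty string is kept
    }
}
```

Whatever the JDK version, the limit only decides how many times the pattern is applied and whether trailing empty strings survive; it does not explain a leading empty string.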

    The information I took from the Javadoc is below.


    The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array.
    If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter.

    If n is non-positive then the pattern will be applied as many times as possible and the array can have any length.

    If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.


    You can also look at the example given in the Javadoc for the split(String, int) method.


    Look at the last row of that example. With a limit of 0, an empty string still appears in the middle of the result; only the trailing empty strings are discarded.
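The example referred to here is the "boo:and:foo" table in the String.split(String, int) Javadoc. A small sketch that reproduces those rows is given below; the class name BooAndFooDemo and the helper split are only illustrative names.

```java
import java.util.Arrays;

public class BooAndFooDemo {

    static final String INPUT = "boo:and:foo";

    // Thin wrapper so that each call reads like one row of the Javadoc table.
    static String[] split(String regex, int limit) {
        return INPUT.split(regex, limit);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split(":", 2)));   // [boo, and:foo]
        System.out.println(Arrays.toString(split(":", 5)));   // [boo, and, foo]
        System.out.println(Arrays.toString(split(":", -2)));  // [boo, and, foo]
        System.out.println(Arrays.toString(split("o", 5)));   // [b, , :and:f, , ]
        System.out.println(Arrays.toString(split("o", -2)));  // [b, , :and:f, , ]
        System.out.println(Arrays.toString(split("o", 0)));   // [b, , :and:f]
    }
}
```

In the last row the empty string between the two o's of "boo" stays, because it is not at the end of the array.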

    You could dig further into the regex and Pattern APIs; you may find some additional information. If you do, please share it here.

    Any ranchers, please help us out here...
    [ May 23, 2007: Message edited by: Raghavan Muthu ]

    dhwani mathur
    Ranch Hand

    Joined: May 08, 2007
    Posts: 621
    That's perfectly right!
    See the explanation below.


    public String[] split(String regex, int limit)
    : Splits this string around matches of the given regular expression. An invocation of this method of the form str.split(regex, n) yields the same result as the expression Pattern.compile(regex).split(str, n)

    public String[] split(String regex)
    : Splits this string around matches of the given regular expression. This method works the same as if you invoked the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are not included in the resulting array.
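Both statements quoted from the Javadoc can be checked directly. The sketch below does that; the class name SplitEquivalenceDemo and its two helper methods are hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitEquivalenceDemo {

    // True when str.split(regex, limit) and Pattern.compile(regex).split(str, limit) agree.
    static boolean sameAsPattern(String str, String regex, int limit) {
        String[] viaString = str.split(regex, limit);
        String[] viaPattern = Pattern.compile(regex).split(str, limit);
        return Arrays.equals(viaString, viaPattern);
    }

    // True when the one-argument split behaves like a limit of zero.
    static boolean oneArgIsLimitZero(String str, String regex) {
        return Arrays.equals(str.split(regex), str.split(regex, 0));
    }

    public static void main(String[] args) {
        System.out.println(sameAsPattern("aaab", "a", -1));         // true
        System.out.println(sameAsPattern("boo:and:foo", "o", 0));   // true
        System.out.println(oneArgIsLimitZero("boo:and:foo", "o"));  // true
    }
}
```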
    anil kumar
    Ranch Hand

    Joined: Feb 23, 2007
    Posts: 447
    Hi

    ----------------------------------------------------------------
    String str = "apples";
    String s[] = str.split("");
    for (String i:s)
    System.out.println(">" + i + "<");
    -------------------------------------------------------------------
    -------------
    |a|p|p|l|e|s||
    -------------
    0 1 2 3 4 5 6 7

    Here, before a, there is an empty string. Now see the matches:
    start end
    0 to 0 -----> empty string (before a)
    1 to 1 -----> a, then the next empty string
    Here our pattern is the empty string.

    The split method returns the part that remains between matches,
    which here means a.

    And one more thing.

    Look carefully at the diagram above: after s there is one more empty string.

    But that trailing empty string is removed by split, because this split method calls the two-argument split method with 0 as the second argument.
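To see where the empty pattern actually matches, the positions can be listed with java.util.regex.Matcher. This is only a sketch; the class name EmptyMatchPositions and the helper matchStarts are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmptyMatchPositions {

    // Collects the start index of every match of regex in input.
    static List<Integer> matchStarts(String regex, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            starts.add(m.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // The empty pattern matches before every character and once more at the end.
        System.out.println(matchStarts("", "apples")); // [0, 1, 2, 3, 4, 5, 6]
    }
}
```

The match at index 0 is the one behind the leading >< in this thread, and the match at index 6 is the one behind the trailing empty string that a limit of 0 discards. As far as I know, Java 8 and later skip the zero-width match at index 0 when splitting, so the leading >< no longer appears there.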


    Thanks
    Anil Kumar

    [ May 23, 2007: Message edited by: anil kumar ]

    dhwani mathur
    Ranch Hand

    Joined: May 08, 2007
    Posts: 621
    I have tried the program.


    public class Str {

        String str = "apples";
        String s[] = str.split("");

        public void Test() {
            // Note: this prints the index i, not the element s[i].
            for (int i = 0; i < s.length; i++)
                System.out.println(">" + i + "<");
        }

        public static void main(String[] args) {
            Str s = new Str();
            s.Test();
        }
    }


    So the values of i are as follows, which shows that the array has seven elements:
    output:

    >0<
    >1<
    >2<
    >3<
    >4<
    >5<
    >6<
    Abdul Mohsin
    Ranch Hand

    Joined: Apr 26, 2007
    Posts: 111

    Hi Anil ,

    Thanks for the reply, but my doubt is still not cleared.

    Hi,

    String str = "apples";
    String s[] = str.split("");
    for (String i:s)
    System.out.println(">" + i + "<");

    output:
    >< // line 1 not clear
    >a<
    >p<
    >p<
    >l<
    >e<
    >s<


    I know there is a blank string before the "a" of "apples". So when we split on "" (a blank string), it should not be included in the output, because there is only one blank string ("") before "a", not two.
    Then why does a blank string appear in the output?

    Regards,

    Abdul Mohsin
    anil kumar
    Ranch Hand

    Joined: Feb 23, 2007
    Posts: 447
    Hi

    Abdul Mohsin


    Let me make one thing clear here. A blank string means " "; observe that there is a space between the quotes. I said empty string, "", with no space.

    If your doubt is still not cleared, try running the program below.

    String str = " apples";//line 1
    String s[] = str.split("");
    for (String i:s)
    System.out.println(">" + i + "<");

    Note that at line 1 I have put a space (" ") at the start of the string.

    Thanks

    Anil Kumar

    [ May 23, 2007: Message edited by: anil kumar ]
    Abdul Mohsin
    Ranch Hand

    Joined: Apr 26, 2007
    Posts: 111

    OK, let me post again with "empty string".
    Now my question is:


    String str = "apples";
    String s[] = str.split("");
    for (String i:s)
    System.out.println(">" + i + "<");

    output:
    >< // line 1 not clear
    >a<
    >p<
    >p<
    >l<
    >e<
    >s<



    I know there is an empty string before the "a" of "apples". So when we split on "" (the empty string), it should not be included in the output, because there is only one empty string ("") before "a", not two.
    Then why does an empty string appear in the output?

    Regards,

    Abdul Mohsin
    Abdul Mohsin
    Ranch Hand

    Joined: Apr 26, 2007
    Posts: 111

    Hi,

    Next doubt..

    Guess the output of :

    code 1


    code 2



    Please explain; I am losing confidence.

    Regards,

    Abdul Mohsin
    anil kumar
    Ranch Hand

    Joined: Feb 23, 2007
    Posts: 447
    Hi

    Wait, man, don't be in a hurry.

    ------------------------------------------------------------------------
    output:
    >< // line 1 not clear
    >a<
    >p<
    >p<
    >l<
    >e<
    >s<


    Everything in the output is logical except line 1; I am not able to understand why it appears. Please explain.
    ---------------------------------------------------------------------------


    The above was posted by you.

    Your split call is this:

    String str = "apples";
    String s[] = str.split("");

    For example, consider every element of the string as sitting in its own box:
    the empty string in the first box,
    a in the 2nd box,
    and so on.

    split finds the match in the first box and returns the remaining part of that box, excluding the matched part.

    Follow the same steps for the other boxes and you can work out the rest.
    That is why you are getting
    >a<
    >p<
    >p<


    Thanks
    Anil Kumar

    [ May 23, 2007: Message edited by: anil kumar ]
    Abdul Mohsin
    Ranch Hand

    Joined: Apr 26, 2007
    Posts: 111

    That's perfect, but then why am I getting >< before >a<?

    Regards,

    Abdul Mohsin
    anil kumar
    Ranch Hand

    Joined: Feb 23, 2007
    Posts: 447
    See here: what does the first box contain?

    Nothing.
    split returns the remaining part of that box, which is empty.
    If you are not able to understand, see the link above.



    Thanks

    Anil Kumar
    [ May 23, 2007: Message edited by: anil kumar ]
    Abdul Mohsin
    Ranch Hand

    Joined: Apr 26, 2007
    Posts: 111

    Thanks Anil for your help.

    Please go through my second doubt:

    Guess the output of :

    code 1

    code:


    String str = "aaa";
    String s[] = str.split("a");
    for (String i:s)
    System.out.println(">" + i + "<");



    code 2

    code:


    String str = "aaab";
    String s[] = str.split("a");
    for (String i:s)
    System.out.println(">" + i + "<");


    anil kumar
    Ranch Hand

    Joined: Feb 23, 2007
    Posts: 447
    Hi

    See my post on this page, just before the one by dhwani mathur.

    Remember that trailing empty strings will be removed, because we are calling the split method with one argument. That method calls the two-argument split method with 0 as the second argument. Have a look at the Javadoc.

    You will definitely understand.


    Thanks

    Anil Kumar
    Abdul Mohsin
    Ranch Hand

    Joined: Apr 26, 2007
    Posts: 111

    Hi,

    OK, let me surprise you.

    String str="aaa";
    String s[] = str.split("a");
    System.out.println("length:"+s.length);
    for (String i:s)
    System.out.println(">" + i + "<");

    Its output is:
    length:0

    **************

    When I just add 'b' to the string:

    String str="aaab";
    String s[] = str.split("a");
    System.out.println("length:"+s.length);
    for (String i:s)
    System.out.println(">" + i + "<");

    its output is:
    length:4
    ><
    ><
    ><
    >b<

    Please explain.


    Regards,
    Abdul Mohsin
    anil kumar
    Ranch Hand

    Joined: Feb 23, 2007
    Posts: 447
    Hey
    -------------------------------
    String str="aaa";
    String s[] = str.split("a");
    ---------------------------------
    |--------
    |a|a|a| |
    |--------
    Here the pattern matches 3 times,
    so before the limit is applied we get this:
    <>
    <>
    <>
    <>
    Now the limit comes in. The one-argument split calls the two-argument split, whose second argument bounds the number of times the pattern is applied; here it is 0, which means no bound, with trailing empty strings discarded.
    So it finds the last empty string and removes it, then it sees one more trailing empty string and removes that one too, and so on, until all the trailing empty strings are removed. Nothing is left, so the array has length 0.
    ------------------------------------
    String str="aaab";
    String s[] = str.split("a");
    --------------------------------------
    <>
    <>
    <>
    <b>


    Now the limit of 0 is applied again.
    This time the last element is <b>, which is not empty, so there are no trailing empty strings to remove. The three empty strings before <b> are not trailing, so they stay.

    So the output is
    <>
    <>
    <>
    <b>

    Thanks

    Anil Kumar

    [ May 23, 2007: Message edited by: anil kumar ]
    [ May 24, 2007: Message edited by: anil kumar ]
    Abdul Mohsin
    Ranch Hand

    Joined: Apr 26, 2007
    Posts: 111


    String str="aaa";
    String s[] = str.split("a");
    ---------------------------------
    |--------
    |a|a|a| |
    |--------
    here the matcher applies 3 times
    so we will get like this
    <>
    <>
    <>
    <>



    Hi,

    Thanks, but for code 1 this is not the output.
    The output is nothing, because the length of the array is 0.

    Regards,

    Abdul Mohsin
    anil kumar
    Ranch Hand

    Joined: Feb 23, 2007
    Posts: 447
    Read my post carefully.
    For the first one the output is nothing.

    Have you got that one or not?

    Thanks

    Anil Kumar
    [ May 24, 2007: Message edited by: anil kumar ]
    Abdul Mohsin
    Ranch Hand

    Joined: Apr 26, 2007
    Posts: 111



    Thanks a lot, but it is still not clear to me.

    Please give me some good links for regex.

    Regards,

    Abdul Mohsin
    anil kumar
    Ranch Hand

    Joined: Feb 23, 2007
    Posts: 447
    Hey

    Try writing some programs and, before executing them, first think about what the output will be.
    Then check by running the program.
    You will understand.

    Thanks
    Anil Kumar
    Abdul Mohsin
    Ranch Hand

    Joined: Apr 26, 2007
    Posts: 111

    Hi Ranchers,

    Please help me by providing some good links for regex.


    Regards,

    Abdul Mohsin
    Ulf Dittmer
    Marshal

    Joined: Mar 22, 2005
    Posts: 41800
        
    Try the "I'm feeling lucky" option on Google for "regular expressions". That's a good site.
    Abdul Mohsin
    Ranch Hand

    Joined: Apr 26, 2007
    Posts: 111




    Thanks Anil.
    Yesterday I was not able to understand your reply, but now I understand it.

    Today I went through the Javadoc and now everything is clear.



    public String[] split(String regex)

    Splits this string around matches of the given regular expression.

    This method works as if by invoking the two-argument split method with the given expression and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.


    Thanks again

    Abdul Mohsin
    Chandra Bhatt
    Ranch Hand

    Joined: Feb 28, 2007
    Posts: 1707
    Hi Abdul,

    In the first case, when the split delimiter was "a" on the String "aaa", you got an array of length 0.
    In my words, it is just a trick being played on us: I mean the truncating of the blank strings
    that sit around the a's.



    There are four blank strings. But because the string contains nothing apart from the delimiter
    "a", all of them are trailing and get truncated, which is what confused us. We have one option
    to get the complete result: pass a negative limit.

    public class SplitTest {

        public static void main(String... args) {
            String s = "aaa";
            String[] sarr = s.split("a", -1);

            for (String s1 : sarr) {
                System.out.println(">" + s1 + "<");
            }
        }
    }


    You will get
    ><
    ><
    ><
    ><

    These four are the blank strings I just told you about.


    In the second case, when you passed "a" as the delimiter and "aaab" as the String to be split, we get
    the result we expected: three blank strings and one "b", which is the fourth token we get after
    the split.


    Thanks,


    cmbhatt
    Abdul Mohsin
    Ranch Hand

    Joined: Apr 26, 2007
    Posts: 111

    Thanks Chandra for the great explanation.
    Now I understand it completely.
    Gautam Arora
    Greenhorn

    Joined: Jan 16, 2007
    Posts: 4
    Hello,
    Thanks to everyone on this thread. The regex questions in K&B had left me a little bummed and lowered my confidence in this topic. Now I guess I'm back on track with some great info on split() and regex.

    Some queries w.r.t limit in the split(regex,limit):
    1. The difference between 0 and -1 is pretty clear, it will be the same output minus the last empty string in case of 0. Am I correct?

    2. I'm still not clear what exactly is meant by limit and its usage of non-negative and negative numbers. Any examples with explanations?

    Regards,
    Gautam Arora


    Chandra Bhatt
    Ranch Hand

    Joined: Feb 28, 2007
    Posts: 1707
    Hi Gautam,


    Straight from the Javadoc:


    The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.

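    The input "boo:and:foo", for example, yields the following results with these parameters:

    Regex Limit Result
    : 2 { "boo", "and:foo" }
    : 5 { "boo", "and", "foo" }
    : -2 { "boo", "and", "foo" }
    o 5 { "b", "", ":and:f", "", "" }
    o -2 { "b", "", ":and:f", "", "" }
    o 0 { "b", "", ":and:f" }
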
    Parameters:
    input - The character sequence to be split
    limit - The result threshold, as described above
    Returns:
    The array of strings computed by splitting the input around matches of this pattern




    Thanks,
    [ May 24, 2007: Message edited by: Chandra Bhatt ]
    Gautam Arora
    Greenhorn

    Joined: Jan 16, 2007
    Posts: 4
    Thanks Chandra.
    I completely understand the limit concept, but I have a few doubts about the examples.

    Input "boo:and:foo"

    Regex Limit Result
    o 5 { "b", "", ":and:f", "", "" }

    But after "b", why do we get just a single empty string "" and not 2 empty strings?

    As o is applied as the regex pattern, should we not get 2 empty strings:
    1. "" from start index = 1 to end index = 1
    2. "" from start index = 2 to end index = 2

    Just like in the end ":and:f" is followed by 2 empty strings.

    Also, if the above case would be true then the array length can only be 5, so our output would be
    { "b", "", "", ":and:f", "" }
    where the last empty string contains both the last 2 empty strings as per javadoc.
    [ May 24, 2007: Message edited by: Gautam Arora ]
    Chandra Bhatt
    Ranch Hand

    Joined: Feb 28, 2007
    Posts: 1707

    Input "boo:and:foo"

    Regex Limit Result
    o 5 { "b", "", ":and:f", "", "" } (as given)


    ->One blank string you get after b is from between o and o.
    ->Two blank strings you get after f are between o and o and after last o.

    Note: If you use a limit of 4 you won't get those last blank strings;
    instead you get "o", the remaining portion. That is because the pattern has
    already been applied three times (limit - 1), so the fourth entry takes all the remaining input.
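The effect of lowering the limit from 5 to 4 can be checked with a short sketch; the class name LimitFourDemo and the helper splitOnO are made up for this example.

```java
import java.util.Arrays;

public class LimitFourDemo {

    // Splits the Javadoc sample input on "o" with the given limit.
    static String[] splitOnO(int limit) {
        return "boo:and:foo".split("o", limit);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnO(5))); // [b, , :and:f, , ]
        System.out.println(Arrays.toString(splitOnO(4))); // [b, , :and:f, o]
    }
}
```

With a limit of 4 the pattern is applied at most three times, so the last entry holds everything after the third match, which is the final "o".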


    Thanks,
    [ May 24, 2007: Message edited by: Chandra Bhatt ]
     