| Author |
Dot metacharacter Question
|
sharma ishu
Ranch Hand
Joined: Sep 10, 2012
Posts: 70
|
|
class C6 {
    public static void main(String[] a) {
        String s = "abc de.f1 adf34 cat.dog";
        System.out.println(s + "\n");
        // a[0] is the regex taken from the command line
        String[] t = s.split(a[0]);
        for (String x : t)
            System.out.println("<" + x + ">");
        //System.out.println(t[0]);
    }
}
/*
C:\code\e5> javac C6.java
1. C:\code\e5>java C6 .
abc de.f1 adf34 cat.dog
2. C:\code\e5>java C6 \.
abc de.f1 adf34 cat.dog
<abc de>
<f1 adf34 cat>
<dog>
3. C:\code\e5>java C6 \\.
abc de.f1 adf34 cat.dog
<abc de.f1 adf34 cat.dog>
*/
Kindly explain why these three invocations behave this way, especially the 1st one.
|
 |
Himai Minh
Ranch Hand
Joined: Jul 29, 2012
Posts: 287
|
|
In the command prompt, the arguments are:
1. . (a dot) means the regex metacharacter
2. \. (a backslash and a dot) means a literal dot
3. \\. (a double backslash and a dot) means a backslash followed by a dot.
I think in the command prompt a backslash is not an escape character; a backslash is just a backslash. But in Java source code, a backslash is an escape character.
Case 1: the string is split on the metacharacter, which matches any character. If you are given "abcde.", the string is split at every character. When we split "ab", the result is a "" between "a" and "b".
Case 2: the string is split on a literal dot.
Case 3: the string is split on \\. , which does not exist in the given string. If you input ab\.c, I am sure the output is <ab> and <c>.
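Here is a small sketch that reproduces the three cases without the command line. It assumes the shell hands each argument to Java unchanged, as the Windows command prompt does for these. The class name SplitCases and the helper split are made up for illustration; each backslash in an argument has to be doubled when written as a Java string literal.

```java
import java.util.Arrays;

public class SplitCases {
    // Hypothetical helper: splits the text with the given regex.
    static String[] split(String text, String regex) {
        return text.split(regex);
    }

    public static void main(String[] args) {
        String s = "abc de.f1 adf34 cat.dog";
        // Argument .    is the Java literal "."
        System.out.println(Arrays.toString(split(s, ".")));      // []
        // Argument \.   is the Java literal "\\."
        System.out.println(Arrays.toString(split(s, "\\.")));    // [abc de, f1 adf34 cat, dog]
        // Argument \\.  is the Java literal "\\\\."
        System.out.println(Arrays.toString(split(s, "\\\\.")));  // [abc de.f1 adf34 cat.dog]
        // Same regex on the input ab\.c, which does contain a backslash
        System.out.println(Arrays.toString(split("ab\\.c", "\\\\."))); // [ab, c]
    }
}
```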
Let me know if I made a mistake.
|
 |
sharma ishu
Ranch Hand
Joined: Sep 10, 2012
Posts: 70
|
|
Himai Minh wrote: In the command prompt, the arguments are:
1. . (a dot) means the regex metacharacter
2. \. (a backslash and a dot) means a literal dot
3. \\. (a double backslash and a dot) means a backslash followed by a dot.
I think in the command prompt a backslash is not an escape character; a backslash is just a backslash. But in Java source code, a backslash is an escape character.
Case 1: the string is split on the metacharacter, which matches any character. If you are given "abcde.", the string is split at every character. When we split "ab", the result is a "" between "a" and "b".
Case 2: the string is split on a literal dot.
Case 3: the string is split on \\. , which does not exist in the given string. If you input ab\.c, I am sure the output is <ab> and <c>.
Let me know if I made a mistake.
I think the 2nd and 3rd are fine. But in the 1st case, why doesn't it print empty strings between < and >?
|
 |
Henry Wong
author
Sheriff
Joined: Sep 28, 2004
Posts: 16681
|
|
Himai Minh wrote: 3. \\. (a double backslash and a dot) means a backslash followed by a dot.
Almost. In the third case, the dot isn't escaped -- only the backslash is. So, it is trying to split on a backslash that is followed by any character.
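A quick way to check this is to compile the same regex with java.util.regex directly. This is only a sketch; the class name BackslashAnyChar and the sample inputs are made up for illustration.

```java
import java.util.regex.Pattern;

public class BackslashAnyChar {
    // The command-line argument \\. is written "\\\\." as a Java literal.
    static final Pattern P = Pattern.compile("\\\\.");

    // True if the whole input matches "backslash, then any one character".
    static boolean matches(String input) {
        return P.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("\\."));  // true: backslash followed by a dot
        System.out.println(matches("\\x"));  // true: backslash followed by another character
        System.out.println(matches("."));    // false: there is no backslash
        // Splitting ab\xc on this regex removes the backslash and the x
        System.out.println(String.join("|", P.split("ab\\xc"))); // ab|c
    }
}
```

Note that by default the regex dot does not match line terminators, so "any character" here means any character other than those.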
Henry
|
Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
|
 |
Henry Wong
author
Sheriff
Joined: Sep 28, 2004
Posts: 16681
|
|
ishusharma sharma wrote: I think the 2nd and 3rd are fine. But in the 1st case, why doesn't it print empty strings between < and >?
This is documented behavior. To understand why, take a look at the Javadoc for the java.util.regex.Pattern class, specifically for the split() method, which String.split() delegates to.
split
public String[] split( CharSequence input)
Splits the given input sequence around matches of this pattern.
This method works as if by invoking the two-argument split method with the given input sequence and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.
And since splitting on a regex dot (i.e. any character) yields nothing but zero-length strings, every one of them counts as a trailing empty string and is removed, so the split() call returns an empty array as the result.
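To see the removed empty strings, you can call the two-argument split with a negative limit, which keeps trailing empty strings. A minimal sketch follows; the class name TrailingEmpties and the helper splitOnDot are made up for illustration.

```java
import java.util.regex.Pattern;

public class TrailingEmpties {
    // Hypothetical helper: splits the input on the regex dot with the given limit.
    static String[] splitOnDot(String input, int limit) {
        return Pattern.compile(".").split(input, limit);
    }

    public static void main(String[] args) {
        String s = "abc";
        // Limit 0 is what the one-argument split uses: trailing empty strings are dropped.
        System.out.println(splitOnDot(s, 0).length);   // 0
        // A negative limit keeps them: one empty string before, between, and after each character.
        System.out.println(splitOnDot(s, -1).length);  // 4
        for (String x : splitOnDot(s, -1))
            System.out.println("<" + x + ">");         // prints <> four times
    }
}
```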
Henry
|
 |
 |