Dot metacharacter Question

sharma ishu
Ranch Hand

Joined: Sep 10, 2012
Posts: 70
class C6 {
    public static void main(String[] a) {
        String s = "abc de.f1 adf34 cat.dog";
        System.out.println(s + "\n");
        // a[0] is the regex, taken from the first command-line argument
        String[] t = s.split(a[0]);
        for (String x : t)
            System.out.println("<" + x + ">");
        //System.out.println(t[0]);
    }
}
/*

C:\code\e5> javac C6.java


1. C:\code\e5>java C6 .
abc de.f1 adf34 cat.dog

2. C:\code\e5>java C6 \.
abc de.f1 adf34 cat.dog

<abc de>
<f1 adf34 cat>
<dog>


3. C:\code\e5>java C6 \\.
abc de.f1 adf34 cat.dog

<abc de.f1 adf34 cat.dog>


*/
Kindly explain why these three invocations behave this way, especially the first one.
Himai Minh
Ranch Hand

Joined: Jul 29, 2012
Posts: 758
At the command prompt, the arguments mean the following:
1. . (a dot) is the metacharacter
2. \. (a backslash and a dot) means a literal dot
3. \\. (a double backslash and a dot) means a backslash followed by a dot.

I think that at the command prompt a backslash is not an escape character; a backslash is just a backslash. In Java source code, however, a backslash in a string literal is an escape character.
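
One way to check what the program actually receives is to print the argument character by character before using it as a regex. This is only a minimal sketch: the class name ShowArg and the helper describe are made up for illustration, and what the shell passes through depends on the shell (this thread uses the Windows command prompt).

```java
class ShowArg {
    // Hypothetical helper: lists each character of the argument separately,
    // so any backslash passed through by the shell is visible.
    static String describe(String arg) {
        StringBuilder sb = new StringBuilder();
        for (char c : arg.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append('[').append(c).append(']');
        }
        return arg.length() + " chars: " + sb;
    }

    public static void main(String[] args) {
        // Print every command-line argument exactly as the JVM received it.
        for (String arg : args) {
            System.out.println(describe(arg));
        }
    }
}
```

If the shell does not treat the backslash specially, running it with \. as the argument reports two characters, a backslash and a dot.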

Case 1: the string is split on the metacharacter, which matches any character. If you are given "abcde.", the string is split at every character. When we split "ab", the result is a "" between "a" and "b".

Case 2: the string is split on a literal dot.

Case 3: the string is split on \\., which does not exist in the given string. If you input ab\.c, I am sure the output is <ab> and <c>.

Let me know if I made a mistake.
sharma ishu
Ranch Hand

Joined: Sep 10, 2012
Posts: 70
I think the 2nd and 3rd cases are fine. But in the 1st case, why doesn't it print empty strings between < and >?
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18834
    

Himai Minh wrote:3. \\. (a double backslash and a dot) means a backslash followed by a dot.



Almost. In the third case, the dot isn't escaped -- only the backslash is. So, it is trying to split on a backslash that is followed by any character.
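
A quick way to see the difference is to split a few strings on that pattern. This is just a sketch; the class name and the sample strings ab\.c and ab\xc are made up for illustration. Note that the regex \\. typed at the command prompt has to be written as "\\\\." in a Java string literal, because the compiler also treats the backslash as an escape character.

```java
import java.util.Arrays;

class BackslashAnyCharDemo {
    // The regex \\. as it arrives from the command line:
    // an escaped backslash followed by an unescaped dot (any character).
    static final String REGEX = "\\\\.";

    static String[] split(String input) {
        return input.split(REGEX);
    }

    public static void main(String[] args) {
        // No backslash in the input, so nothing matches and the string comes back whole.
        System.out.println(Arrays.toString(split("abc de.f1 adf34 cat.dog")));
        // Backslash followed by a dot: split into ab and c.
        System.out.println(Arrays.toString(split("ab\\.c")));
        // Backslash followed by x: also split into ab and c, since the dot is not escaped.
        System.out.println(Arrays.toString(split("ab\\xc")));
    }
}
```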

Henry


Books: Java Threads, 3rd Edition, Jini in a Nutshell, and Java Gems (contributor)
Henry Wong
author
Sheriff

Joined: Sep 28, 2004
Posts: 18834
    

ishusharma sharma wrote:I think the 2nd and 3rd cases are fine. But in the 1st case, why doesn't it print empty strings between < and >?



This follows from how split() is specified ... to understand why, take a look at the javadoc for the java.util.regex.Pattern class, specifically for the split() method, which String.split() is defined to behave like.

split
public String[] split(CharSequence input)

Splits the given input sequence around matches of this pattern.

This method works as if by invoking the two-argument split method with the given input sequence and a limit argument of zero. Trailing empty strings are therefore not included in the resulting array.


And since splitting on a regex dot (i.e. any character) yields nothing but zero-length strings, every one of them counts as a trailing empty string and is dropped, so the split() method call returns an empty array as the result.
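
To see the effect of the limit argument, compare a limit of zero with a negative limit, which keeps the trailing empty strings. This is a minimal sketch; the class name, the helper pieces and the sample string a.b. are made up for illustration.

```java
import java.util.Arrays;

class TrailingEmptyDemo {
    // Hypothetical helper: number of elements split() returns for a given limit.
    static int pieces(String input, String regex, int limit) {
        return input.split(regex, limit).length;
    }

    public static void main(String[] args) {
        String s = "abc de.f1 adf34 cat.dog";   // 23 characters

        // Limit 0 (what the one-argument split uses): every piece is empty,
        // so all of them are discarded as trailing empty strings.
        System.out.println(pieces(s, ".", 0));    // prints 0

        // Negative limit: nothing is discarded, one empty string per gap.
        System.out.println(pieces(s, ".", -1));   // prints 24

        // The same rule with a literal dot: the empty string after the last dot is dropped...
        System.out.println(Arrays.toString("a.b.".split("\\.")));       // prints [a, b]
        // ...unless a negative limit is given.
        System.out.println(Arrays.toString("a.b.".split("\\.", -1)));   // prints [a, b, ]
    }
}
```

With a negative limit, the loop in the original program would print <> 24 times for the first invocation.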

Henry
 