Hi, I am writing a program for a client which involves tokenizing strings based on a delimiter, usually "*" but it could be anything. An example string would be GH*ASD*FD**D2*f**123*45;
The problem I am facing is that I need to extract every element between the "*"s, and where there are two "*" in a row as above (FD**D2 and f**123) I need to be able to tell that there is an empty element. I thought I could do it by replacing the ** with *!* or something to that effect. But using String.replaceAll I am getting a PatternSyntaxException.
My code is:
String checkit = "GS*sf**a";
String elementDelimiter = "" + checkit.charAt(2);
checkit = checkit.replaceAll(elementDelimiter + elementDelimiter, elementDelimiter + "!" + elementDelimiter);
System.out.println(checkit);
I am getting the following error:
java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0
**
^
    at java.util.regex.Pattern.error(Pattern.java:1542)
    at java.util.regex.Pattern.sequence(Pattern.java:1659)
    at java.util.regex.Pattern.expr(Pattern.java:1559)
    at java.util.regex.Pattern.compile(Pattern.java:1293)
    at java.util.regex.Pattern.<init>(Pattern.java:1049)
    at java.util.regex.Pattern.compile(Pattern.java:793)
    at java.lang.String.replaceAll(String.java:2038)
    at com.h2h.core.ComplianceCheckerBean.main(ComplianceCheckerBean.java:23)
PLEASE HELP!!
Thanks Karthik
Campbell Ritchie
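The first argument to replaceAll() is a regular expression, and in a regular expression * doesn't mean a literal *; it is a quantifier meaning any number of repetitions (including 0) of whatever precedes it. Your pattern starts with *, so there is nothing for it to repeat, hence the "Dangling meta character" error.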
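To see the quantifier behaviour in isolation, here is a small sketch. The class name StarQuantifierDemo and the helper matches() are made up for illustration; only Pattern.matches() comes from the JDK.

```java
import java.util.regex.Pattern;

public class StarQuantifierDemo {
    // Illustrative helper (not from the original post):
    // true only when the WHOLE input matches the regex.
    public static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // "ab*" means: one 'a' followed by zero or more 'b' characters.
        System.out.println(matches("ab*", "a"));     // true
        System.out.println(matches("ab*", "abbb"));  // true
        // The '*' in the pattern is not a literal asterisk,
        // so the asterisk in the input is left unmatched.
        System.out.println(matches("ab*", "ab*"));   // false
        // The Java literal "a\\*" is the regex a\* : 'a' then a literal asterisk.
        System.out.println(matches("a\\*", "a*"));   // true
    }
}
```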
The * character, along with + ? . and quite a lot of others, is called a meta-character (- is one too, but only inside a character class such as [a-z]), so to match it literally you have to escape it by passing \* instead. If you want \* to appear in a Java String literal you have to escape the \ too, so you would write "\\*".
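Since your delimiter "could be anything", hand-escaping one character at a time gets awkward. One possible approach, sketched below, is to let Pattern.quote() escape the search text and Matcher.quoteReplacement() protect the replacement (which treats $ and \ specially). The class name EscapeDelimiterDemo and the helper markEmpty() are invented for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EscapeDelimiterDemo {
    // Hypothetical helper: puts "!" between each doubled delimiter.
    public static String markEmpty(String input, String delimiter) {
        // Pattern.quote makes the regex engine treat the text literally.
        String doubled = Pattern.quote(delimiter + delimiter);
        // quoteReplacement guards against '$' or '\' in the delimiter.
        String replacement = Matcher.quoteReplacement(delimiter + "!" + delimiter);
        return input.replaceAll(doubled, replacement);
    }

    public static void main(String[] args) {
        String checkit = "GS*sf**a";
        String elementDelimiter = "" + checkit.charAt(2);
        System.out.println(markEmpty(checkit, elementDelimiter)); // GS*sf*!*a
    }
}
```

String.replace(CharSequence, CharSequence) also does a literal, non-regex replacement and would work here too. Bear in mind that either way three delimiters in a row ("a***b") come out as "a*!**b", because matches don't overlap, so one empty element is still left unmarked.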
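There's no need to pre-process the string anyway. Instead of StringTokenizer, use split() to tokenize it, for example checkit.split("\\*", -1). A negative limit tells split() to keep trailing empty strings instead of discarding them; see the API doc for the split() method for a full explanation of the -1 parameter.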
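A minimal sketch of the split() approach, using your example string without the trailing semicolon (I'm assuming that was just punctuation). The class name SplitDemo and the helper tokenize() are made up; Pattern.quote() is there only so that any delimiter can be passed in safely.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitDemo {
    // Hypothetical helper: splits on a literal delimiter, keeping empty elements.
    public static String[] tokenize(String input, String delimiter) {
        // The -1 limit keeps trailing empty strings as well as inner ones.
        return input.split(Pattern.quote(delimiter), -1);
    }

    public static void main(String[] args) {
        String[] parts = tokenize("GH*ASD*FD**D2*f**123*45", "*");
        // Empty elements show up as empty strings in the array.
        System.out.println(Arrays.toString(parts)); // [GH, ASD, FD, , D2, f, , 123, 45]
        System.out.println(parts.length);           // 9

        // Without the -1, trailing empty elements are silently dropped.
        System.out.println(tokenize("a*b**", "*").length); // 4
        System.out.println("a*b**".split("\\*").length);   // 2
    }
}
```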