GeeCON Prague 2014*

PatternSyntaxException: Unclosed group

Arun Giridhar
Ranch Hand

Joined: Mar 10, 2012
Posts: 147

Hello,

I'm trying to split a String based on my pattern. The string is "\uDC00\uD800", and after the split I'd like to get a two-element String array: String[0] ---> \uDC00 and String[1] ---> \uD800

Code


Exception :
Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed group near index 19
^(\)[a-zA-Z0-9]*$1$


Is this the correct way to do it?

2)
Result
00CDu\008Du\
I don't understand what the API doc is trying to say.

Thank You


hate Professionalism
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    

Arun Giridhar wrote:
Exception :
Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed group near index 19
^(\)[a-zA-Z0-9]*$1$


The '(' is not escaped, so it is the start of a group. You never close that group with a matching, unescaped ')'.

The "\\)" in your string literal becomes a literal ')' character in the regex, because of the escaping backslash.
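To make the point concrete, here is a minimal sketch that compiles the pattern from the exception message and reports what the regex parser says. The class name `UnclosedGroupDemo` and the helper `compileError` are made up for illustration. The "fixed" pattern rests on an assumption about intent, namely that the group was meant to capture one literal backslash. Note also that `$1` is replacement-string syntax; inside a pattern, `$` is the end anchor and `1` is an ordinary character.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class UnclosedGroupDemo {

    /** Returns null if the regex compiles, otherwise the parser's error description. */
    static String compileError(String regex) {
        try {
            Pattern.compile(regex);
            return null;
        } catch (PatternSyntaxException e) {
            return e.getDescription();
        }
    }

    public static void main(String[] args) {
        // The Java literal "\\)" reaches the regex engine as an escaped ')',
        // so the '(' before it never gets a closing parenthesis.
        String broken = "^(\\)[a-zA-Z0-9]*$1$";
        System.out.println(broken + " -> " + compileError(broken));

        // Assumption: the group was meant to capture one literal backslash.
        // That takes four backslashes in Java source, two in the regex itself.
        String fixed = "^(\\\\)[a-zA-Z0-9]*$";
        System.out.println(fixed + " -> " + compileError(fixed));

        // The text being matched here is six ordinary characters, not one char.
        System.out.println(Pattern.matches(fixed, "\\uDC00"));
    }
}
```

The second pattern only matches text that really contains a backslash, which is a different thing from a String built from Unicode escapes.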

2)
Result
00CDu\008Du\


Try this.


Mansukhdeep Thind
Ranch Hand

Joined: Jul 27, 2010
Posts: 1157

Why does the statement

print 2 "??" on the console Jeff?

Moreover, I think simply using ("\\\\") should tokenize the literal. Try it out, Arun. Does the output conform to your expectation?


~ Mansukh
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    

Mansukhdeep Thind wrote:Why does the statement

print 2 "??" on the console Jeff?


Because your console's character set + font combination doesn't support those characters (or that 2-byte character, if that's what it is).
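There may be a second factor besides the console font. In "\uDC00\uD800" the low surrogate comes before the high surrogate, so neither char is part of a valid pair, and an unpaired surrogate cannot be encoded in any byte charset. The sketch below (the class `LoneSurrogateDemo` and its helper are invented for illustration) round-trips the string through UTF-8 with `String.getBytes`, which substitutes the charset's replacement byte for input it cannot encode. It approximates what a byte-oriented console stream does and is not a claim about any particular console.

```java
import java.nio.charset.StandardCharsets;

class LoneSurrogateDemo {

    /** Encodes the text to UTF-8 bytes and decodes it again. */
    static String roundTripUtf8(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = "\uDC00\uD800"; // low surrogate first, then high surrogate

        System.out.println(Character.isLowSurrogate(s.charAt(0)));   // true
        System.out.println(Character.isHighSurrogate(s.charAt(1)));  // true

        // Each unpaired surrogate is replaced by '?' during encoding.
        System.out.println(roundTripUtf8(s));
    }
}
```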

Moreover, I think simply using ("\\\\") should tokenize the literal.


What is that even supposed to mean? It seems you don't understand the problem. He wants a Unicode escape sequence, but he's getting a literal instead. Putting a literal backslash into the regex won't help.
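A small sketch of why the backslash regex does nothing for the two-char String and only has an effect on text that really contains backslashes. The class `BackslashSplitDemo` and the method `splitOnBackslash` are hypothetical names used only for this illustration.

```java
import java.util.Arrays;

class BackslashSplitDemo {

    /** Splits on a literal backslash (four backslashes in Java source). */
    static String[] splitOnBackslash(String text) {
        return text.split("\\\\");
    }

    public static void main(String[] args) {
        // Escapes resolved by the compiler: 2 chars, no backslash anywhere.
        String twoChars = "\uDC00\uD800";
        // Escaped backslashes: 12 chars of ordinary text.
        String twelveChars = "\\uDC00\\uD800";

        // Nothing to split on, so the whole string comes back as one element.
        System.out.println(splitOnBackslash(twoChars).length);

        // Here the split works, but the first element is an empty string.
        System.out.println(Arrays.toString(splitOnBackslash(twelveChars)));
    }
}
```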
Mansukhdeep Thind
Ranch Hand

Joined: Jul 27, 2010
Posts: 1157

Ohh. OK. My mistake. He wants the output to be tokenized as \uDC00 \uD800 and so on. Correct?
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    

Mansukhdeep Thind wrote:Ohh. OK. My mistake. He wants the output to be tokenized as \uDC00 \uD800 and so on. Correct?


He has two characters (or perhaps two pieces of one character?), \uDC00 and \uD800. I don't know what he's trying to do with the split() call, but for #2, he's trying to reverse the order of those two characters, so that he ends up with \uD800 followed by \uDC00.
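Assuming #2 refers to `StringBuilder.reverse()` (the original code is not shown, so this is a guess), the API documentation for that method uses this exact string as its example: reversing "\uDC00\uD800" yields "\uD800\uDC00", which is a valid surrogate pair. The sketch below, with the invented class name `ReverseSurrogatesDemo`, shows that case and also what happens when the backslashes are escaped so that the input is plain text.

```java
class ReverseSurrogatesDemo {

    static String reverse(String text) {
        return new StringBuilder(text).reverse().toString();
    }

    public static void main(String[] args) {
        String reversed = reverse("\uDC00\uD800");

        // The two unpaired surrogates now form one supplementary code point.
        System.out.println(reversed.equals("\uD800\uDC00"));
        System.out.println(Integer.toHexString(reversed.codePointAt(0)));
        System.out.println(reversed.codePointCount(0, reversed.length()));

        // With escaped backslashes the input is 12 ordinary characters,
        // and reverse() simply flips them one by one.
        System.out.println(reverse("\\uD800\\uDC00"));
    }
}
```

The last line prints the same text as the "Result" quoted above, which suggests the reversed input held escaped backslashes, not two chars.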
Arun Giridhar
Ranch Hand

Joined: Mar 10, 2012
Posts: 147

Jeff Verdegan wrote:He has two characters (or perhaps two pieces of one character?), \uDC00 and \uD800. I don't know what he's trying to do with the split() call

You caught me red-handed. I'm trying to split it so I get two Unicode characters. Am I missing anything?

Mansukhdeep Thind
Ranch Hand

Joined: Jul 27, 2010
Posts: 1157

Frankly, I am unable to get a firm grip on what you are trying to achieve.
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7892
    

Arun Giridhar wrote:You caught me red-handed. I'm trying to split it so I get two Unicode characters. Am I missing anything?

Yes. Java Strings are already Unicode (as are chars), so the simplest way to split any Java String into Unicode characters is with String.toCharArray().
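One caveat worth illustrating: `toCharArray()` returns UTF-16 code units, so a character outside the Basic Multilingual Plane comes back as two chars (a surrogate pair). If one element per code point is wanted, `String.codePoints()` (Java 8 and later) is one way to get it. This is a minimal sketch; `CodeUnitsVsCodePoints` and `splitByCodePoint` are hypothetical names.

```java
class CodeUnitsVsCodePoints {

    /** Splits a string into one String per Unicode code point. */
    static String[] splitByCodePoint(String text) {
        return text.codePoints()
                .mapToObj(cp -> new String(Character.toChars(cp)))
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String s = "A\uD800\uDC00"; // 'A' followed by U+10000 as a surrogate pair

        System.out.println(s.toCharArray().length);      // 3 code units
        System.out.println(splitByCodePoint(s).length);  // 2 code points
    }
}
```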

However, I don't think that's what you're asking. My guess is that you want to do something if you find "\uDC00\uD800" in your String.

What we still don't know yet is: what you want to do. Do you just want to swap those two characters around?

Winston

Isn't it funny how there's always time and money enough to do it WRONG?
Articles by Winston can be found here
Mansukhdeep Thind
Ranch Hand

Joined: Jul 27, 2010
Posts: 1157

What we still don't know yet is: what you want to do. Do you just want to swap those two characters around?
Well, he even got you confused.

Arun Giridhar
Ranch Hand

Joined: Mar 10, 2012
Posts: 147

Winston Gutkowski wrote:
The simplest way to split any Java String into Unicode characters is with String.toCharArray().

Winston


Yes, you're right, but I'm trying to achieve it using regex.
Arun Giridhar
Ranch Hand

Joined: Mar 10, 2012
Posts: 147

Jeff Verdegan wrote:
Because your console's character set + font combination doesn't support those characters (or that 2-byte character, if that's what it is).


How do I get my console to support that character set + font combination?
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7892
    

Arun Giridhar wrote:Yes, you're right, but I'm trying to achieve it using regex.

OK, so what's all this "\uDC00\uD800" nonsense then?

Have you tried String.split("")? I have to admit, I never have, but it would be my first port of call.

Winston
Arun Giridhar
Ranch Hand

Joined: Mar 10, 2012
Posts: 147

Winston Gutkowski wrote:
OK, so what's all this "\uDC00\uD800" nonsense then?

Winston

Hmmm... Quite rude!
Winston Gutkowski wrote:
Have you tried String.split("")? I have to admit, I never have, but it would be my first port of call.

Winston


Yes, I thought of doing the same thing in a different way using regex.
Jeff Verdegan
Bartender

Joined: Jan 03, 2004
Posts: 6109
    

Arun Giridhar wrote:
Winston Gutkowski wrote:
OK, so what's all this "\uDC00\uD800" nonsense then?

Winston

Hmmm... Quite rude!
Winston Gutkowski wrote:
Have you tried String.split("")? I have to admit, I never have, but it would be my first port of call.

Winston


Yes, I thought of doing the same thing in a different way using regex.


Winston's suggestion DOES use regex. It just happens to be a trivially simple regex. Why do you want to make it more complicated?

Make sure you understand this: When you have

in your code, the String object has 2 characters, not 8 or 10 or 12, and there are no "\u" in it.

If I write

it's exactly as if I had written
.
That transformation happens at the beginning of compilation, before anything else. You can even use unicode escapes for operators and semicolons and keywords.
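The point about compile-time translation can be checked with a short sketch. The class `UnicodeEscapeDemo` and its methods are invented for this illustration; the string contents are the ones discussed in this thread.

```java
class UnicodeEscapeDemo {

    /** Escapes are resolved by the compiler, so this String holds 2 chars. */
    static int twoCharLength() {
        return "\uDC00\uD800".length();
    }

    /** With escaped backslashes the String holds 12 chars of plain text. */
    static int escapedBackslashLength() {
        return "\\uDC00\\uD800".length();
    }

    static int sum() {
        // The escape below is the '+' operator as far as the compiler is concerned.
        return 1 \u002B 1;
    }

    public static void main(String[] args) {
        System.out.println(twoCharLength());            // 2
        System.out.println(escapedBackslashLength());   // 12
        System.out.println("\u0041".equals("A"));       // true
        System.out.println(sum());                      // 2
    }
}
```

A regex never sees a backslash or a 'u' in the first String, which is why a pattern built around those characters has nothing to match.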
Arun Giridhar
Ranch Hand

Joined: Mar 10, 2012
Posts: 147

Winston Gutkowski wrote:
Arun Giridhar wrote:Yes, you're right, but I'm trying to achieve it using regex.

Have you tried String.split("")?

Winston


I didn't like this approach because the first String it returns is an empty string. So I thought of finding another way to do it using regex.
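For what it's worth, the leading empty string depends on the Java version: since Java 8, a zero-width match at the start of the input no longer produces an empty leading element, so `split("")` behaves as hoped there. On older versions, a negative lookahead that refuses to match at the start of the input is one possible workaround. The sketch below uses the hypothetical names `SplitIntoCharsDemo` and `splitChars` and a plain ASCII input.

```java
import java.util.Arrays;

class SplitIntoCharsDemo {

    /** Splits at every position except the very start of the input. */
    static String[] splitChars(String text) {
        return text.split("(?!^)");
    }

    public static void main(String[] args) {
        // No empty first element, regardless of Java version.
        System.out.println(Arrays.toString(splitChars("abc")));

        // On Java 8 and later this gives the same result.
        System.out.println(Arrays.toString("abc".split("")));
    }
}
```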
Arun Giridhar
Ranch Hand

Joined: Mar 10, 2012
Posts: 147

Jeff Verdegan wrote:
That transformation happens at the beginning of compilation, before anything else. You can even use unicode escapes for operators and semicolons and keywords.


Yes, you're right, it happens at compile time. I just checked, and I need to change my regex. Thank you for pointing that out.
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 7892
    

Arun Giridhar wrote:I didn't like this approach because the first String it returns is an empty string. So I thought of finding another way to do it using regex.

Why?

Arun,

You have a problem; you've been given several possible solutions and guidance (sorry about the "nonsense" - that was rude - I'm still wondering why you felt those particular characters were important though), but you don't seem willing to try anything you've been told because there's something "wrong" with it.

Well, at the risk of sounding rude again, you don't know what you do want. And until you do, I doubt there's much we can do to help.

Winston
 