String Tokenizer Question

Pt Skaar
Greenhorn

Joined: Dec 09, 2011
Posts: 3
Hello,

I have a program that is taking some LDAP data which looks something like this:

label=data,label=data...etc.

However, I've been asked to modify the code so that it will allow for this possibility:

label=data, data1, (unknown number),label=data...etc.

What's happening is that an unknown element exception is being thrown on data1. Is there an easy way around this? Especially since the possibility of multiple entries exists in the first segment.

Thanks,
Paul
Joanne Neal
Rancher

Joined: Aug 05, 2005
Posts: 3742
    
Based on your thread title, my first piece of advice would be to use String.split. It's a lot more flexible than StringTokenizer.
Secondly, you'll need to post some code before anyone can tell you what's wrong with it.
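
To illustrate the flexibility point, here is a minimal sketch. It assumes the input is a plain comma-separated list of label=data pairs; the class and method names are made up for this example. Because String.split takes a regular expression, optional whitespace around the commas can be absorbed by the delimiter itself, which StringTokenizer cannot do.

```java
import java.util.Arrays;
import java.util.List;

class SplitPairsSketch {
    // Hypothetical helper: split "label=data,label=data" into its pairs.
    // The regex delimiter is a comma with any surrounding whitespace.
    static List<String> splitPairs(String line) {
        return Arrays.asList(line.split("\\s*,\\s*"));
    }

    public static void main(String[] args) {
        for (String pair : splitPairs("label1=data1 , label2=data2,label3=data3")) {
            // The limit of 2 keeps any further '=' inside the data part.
            String[] kv = pair.split("=", 2);
            System.out.println(kv[0] + " -> " + kv[1]);
        }
    }
}
```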


Joanne
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 8398
    

Pt Skaar wrote: However, I've been asked to modify the code so that it will allow for this possibility:
label=data, data1, (unknown number),label=data...etc.

What's happening is that an unknown element exception is being thrown on data1. Is there an easy way around this? Especially since the possibility of multiple entries exists in the first segment.

What, so you can have:
label=data1, data2...,label=data
but not:
label=data,label=data1, data2...
?

It seems to me like you're having to clear up somebody else's badly thought-out fudge. Why couldn't the format simply be:
label=data1[,data2...];label=data[,data2...];... (semicolons for separating labels, commas for separating data elements)
it'd be a lot easier, and you could then have multiple data items anywhere you like.

Either way, the answer to your question is: Yes.

If the answer to my first question is 'yes', I'd probably try something like splitting the line initially on the first two "=" it finds (i.e., line.split("=", 3)).
That will give you three portions:
1. The first label name.
2. The first label data, plus the 2nd label name, separated by commas.
3. The second label data, plus any other labels in the line, separated by commas.

It should be a fairly simple matter then to cobble those back together into a set of "label=data" elements.
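
A rough sketch of that regrouping follows, generalised to split on every "=" rather than only the first two. It assumes labels contain neither commas nor "=", and that data contains no "="; the class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class EqualsSplitSketch {
    // Hypothetical helper: rebuild "label=data" elements from a line in which
    // the data may itself contain commas.
    static List<String> regroup(String line) {
        // Limit -1 keeps a trailing empty portion, e.g. for "a=1,b=".
        String[] parts = line.split("=", -1);
        List<String> pairs = new ArrayList<>();
        String label = parts[0];
        for (int i = 1; i < parts.length; i++) {
            String part = parts[i];
            if (i == parts.length - 1) {
                // The last portion is pure data for the current label.
                pairs.add(label + "=" + part);
            } else {
                // A middle portion is "data,...,nextLabel": cut at the last comma.
                int cut = part.lastIndexOf(',');
                if (cut < 0) {
                    throw new IllegalArgumentException("No label found in: " + part);
                }
                pairs.add(label + "=" + part.substring(0, cut));
                label = part.substring(cut + 1);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        String line = "label1=data1, data2,label2=data3";
        // The three portions described above.
        System.out.println(Arrays.toString(line.split("=", 3)));
        System.out.println(regroup(line));
    }
}
```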

But, as I said, the only reason you need to do this at all is because somebody else didn't think through the problem.

Winston
Pt Skaar
Greenhorn

Joined: Dec 09, 2011
Posts: 3
I misled you guys a bit so I'll try this. The names are changed to protect the innocent.

The LDAP data is actually being returned like this:

CN=John Smith\\, CLF\\, LUTCF,OU=Agency Sales East Region,OU=People,DC=corp

The existing code, which created tokens by splitting on the comma, produced a first token like this:

CN=John Smith\\

I added code to replace all the '\\,' values with ',' and that gets me LDAP data like this:

CN=John Smith, CLF, LUTCF,OU=Agency Sales East Region,OU=People,DC=corp

Which leads me to my error. If I'm on the wrong track with the initial change then please redirect me.

As for the process not being thought out: the situation causing the issue wasn't previously allowed, as far as I know. Now it is, hence the problem.
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 8398
    

Pt Skaar wrote: The LDAP data is actually being returned like this:
CN=John Smith\\, CLF\\, LUTCF,OU=Agency Sales East Region,OU=People,DC=corp
...As for the not thought out process well, the situation causing the issue wasn't previously allowed as far as I know...and now it is hence the problem.

Right, so now you need to explain to us what it is you need to do with the above.
What is CLF? Is it an empty label, or part of the data for 'CN'?
And the same question for LUTCF.
And what is the significance of the '\\'? (I notice it doesn't follow LUTCF) Is it a data delimiter (indeed, is "\\, " the delimiter; I notice that the commas are followed by a space)?

It's difficult to suggest a solution until we know what you want.

Winston
Pt Skaar
Greenhorn

Joined: Dec 09, 2011
Posts: 3
I need to treat the CLF, LUTCF, etc. as part of the CN data; otherwise I run the risk of getting a security exception. I'd be looking for something like this:

CN=John Smith, CLF, LUTCF
OU=Agency Sales East Region
OU=People
DC=corp
Paul Clapham
Bartender

Joined: Oct 14, 2005
Posts: 18985
    

Pt Skaar wrote: I added code to replace all the '\\,' values with ',' and that gets me LDAP data like this:

CN=John Smith, CLF, LUTCF,OU=Agency Sales East Region,OU=People,DC=corp


I would suggest replacing "\\," by something other than a comma. A + sign, for example. That would result in this:

CN=John Smith+ CLF+ LUTCF,OU=Agency Sales East Region,OU=People,DC=corp


Then you could split that on commas just like before. You'd have to have a second piece of code which split the CN data on the + sign, but I think that would be easier than trying to paste the "CN=" back onto the bits which had no name.

You would have to be careful to choose a suitable sub-delimiter, though. If the + sign could appear in a CN value then that wouldn't work. In that case you might have to look for something more obscure like the paragraph mark ¶ or the upside-down question mark ¿ or whatever.
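
A minimal sketch of the placeholder idea, under two assumptions: the raw data contains a backslash followed by a comma (written "\\," as a Java string literal), and the paragraph mark never appears in it. Instead of splitting the CN data a second time, this variant turns the placeholder back into a comma once the split is done, so the CN value stays in one piece. The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

class PlaceholderSplitSketch {
    // Hypothetical placeholder (paragraph mark); must never occur in real data.
    private static final String PLACEHOLDER = "\u00B6";

    static List<String> splitDn(String dn) {
        // String.replace treats its arguments as literal text, not as regex.
        // "\\," in Java source is the two characters backslash and comma.
        String masked = dn.replace("\\,", PLACEHOLDER);
        List<String> parts = new ArrayList<>();
        for (String token : masked.split(",")) {
            // Restore the commas that were part of the data.
            parts.add(token.replace(PLACEHOLDER, ","));
        }
        return parts;
    }

    public static void main(String[] args) {
        String dn = "CN=John Smith\\, CLF\\, LUTCF,OU=Agency Sales East Region,OU=People,DC=corp";
        for (String part : splitDn(dn)) {
            System.out.println(part);
        }
    }
}
```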
Joanne Neal
Rancher

Joined: Aug 05, 2005
Posts: 3742
    
I missed that this was an LDAP string. Commas are not the only special character that you will need to handle - LDAP has a number of them (including +). I think you would be better off looking at the classes available in JNDI (or any other LDAP handling API) and using those to handle your strings. They will probably make things a lot easier in the long run.
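
As one possible starting point, here is a sketch using javax.naming.ldap.LdapName from the JDK, which parses distinguished names and handles the escaping itself. It assumes the raw string uses a single backslash before each escaped comma (written "\\," in Java source); the class name and the components helper are invented for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;
import javax.naming.ldap.Rdn;

class LdapNameSketch {
    // Hypothetical helper: return "type=value" for each component, left to right.
    static List<String> components(String dn) {
        try {
            LdapName name = new LdapName(dn);
            List<String> out = new ArrayList<>();
            for (Rdn rdn : name.getRdns()) {
                // getValue() returns the unescaped value, so "\," becomes ",".
                out.add(rdn.getType() + "=" + rdn.getValue());
            }
            // LdapName indexes from the right-most component, so reverse
            // to get the order in which the string was written.
            Collections.reverse(out);
            return out;
        } catch (InvalidNameException e) {
            throw new IllegalArgumentException("Not a valid DN: " + dn, e);
        }
    }

    public static void main(String[] args) {
        String dn = "CN=John Smith\\, CLF\\, LUTCF,OU=Agency Sales East Region,OU=People,DC=corp";
        for (String part : components(dn)) {
            System.out.println(part);
        }
    }
}
```

Since the parser follows the LDAP escaping rules, other special characters such as + should not need separate handling here.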
 
 