
Is using a regex pattern ideal for this situation?

 
manish ahuja
Ranch Hand
Posts: 312
Hi All,

I have a situation where I receive data in the format below.
The data arrives as a String array:

/Book/Subject/Fiction
/Book/Subject/NonFiction
/Periodical/Type/Magazine
/Periodical/Type/Newspaper

and many more like these. The examples above have a 3-level hierarchical structure
(e.g. /Book/Subject/Fiction).
In some cases I may get a 4- or 5-level structure.

What I need to do is create the XML representation of this data,

i.e. create the first token as the top-level XML element,
all subsequent tokens as child XML elements nested beneath it,
and the last token as the value of the innermost XML element.

So in the above case, what we need to produce is

<book>
<subject>Fiction</subject>
</book>

<periodical>
<type>Magazine</type>
</periodical>

I am planning to implement this with a regex pattern. I don't have much
bandwidth on this.

Do post your thoughts on whether I am heading in the right direction.
I would appreciate it if someone could post a few code snippets that can help me achieve this.



Thanks in advance,

Manish
 
Henry Wong
author
Marshal
Posts: 21117
Originally posted by manish ahuja:
Do post your thoughts on whether I am heading in the right direction.


Sure... why not? Heck, if the path line is stored in a string, you can do it with a single replaceFirst() method call.

Henry
 
manish ahuja
Ranch Hand
Posts: 312
Hi Henry,

Here is what I am trying to do,
since I am getting the data as a list of item values in a String array.

My code is something like this:

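import java.util.regex.Pattern;

Pattern pat = Pattern.compile("/");
// The leading "/" makes split() return an empty string as the first token.
String[] strs = pat.split("/Book/subject/Fiction");

for (int i = 0; i < strs.length; i++) {
    if (strs[i].isEmpty()) continue; // skip the empty token before the first "/"
    System.out.println("Next token: " + strs[i]);
}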

The only catch I have currently is generating the dynamic XML element tag names based on the values of these tokens,
i.e. first token = top-level XML element
subsequent tokens = child XML elements
last token = value in the last child XML element

You mentioned replaceFirst. I tried looking into it.
Can you elaborate a bit on that?

Thanks,
Manish
 
Henry Wong
author
Marshal
Posts: 21117
Originally posted by manish ahuja:
The only catch I have currently is generating the dynamic XML element tag names based on the values of these tokens.


What's the catch? You already parsed the tokens into members of the string array -- building an XML string based on the tokens is relatively easy.


As for replaceFirst(), I was thinking about a fixed number of tokens -- your question had only three. In that regard, what you are doing should work better, but as an FYI, here is my solution (for only 3 tokens)...
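
The code from this post is not reproduced in the thread as it appears here. As a rough illustration only, and not necessarily what Henry posted, a single replaceFirst() call for exactly three tokens might look like the sketch below. The class name, method name, and regex are assumptions made for the example. Note that a regex replacement cannot change case, so the tag names keep the capitalization of the input, and the value is not XML-escaped.

```java
class ThreeTokenPath {
    // Hypothetical helper: handles exactly three tokens, e.g. "/Book/Subject/Fiction".
    // Each ([^/]+) group captures one token; the replacement nests them as elements.
    static String toXml(String path) {
        return path.replaceFirst(
                "^/([^/]+)/([^/]+)/([^/]+)$",
                "<$1><$2>$3</$2></$1>");
    }

    public static void main(String[] args) {
        // prints <Book><Subject>Fiction</Subject></Book>
        System.out.println(toXml("/Book/Subject/Fiction"));
    }
}
```

If the input does not have exactly three tokens, the pattern does not match and replaceFirst() returns the string unchanged, which is why this approach does not extend to a variable number of levels.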



Henry
 
manish ahuja
Ranch Hand
Posts: 312
Hi Henry,

That was a neat way to achieve it. But in my case it fails, as I don't have a fixed number of tokens.

Can you give us some more hints?


Regards,

Manish
 
Ilja Preuss
author
Sheriff
Posts: 14112
For a variable number of levels, regular expressions don't work very well - if at all.

I would probably parse each line using String.split, and then build the XML tree using DOM (or preferably Dom4J).
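
A minimal sketch of the split-then-DOM idea, using only the JDK's built-in DOM and Transformer classes rather than Dom4J. The class and method names are made up for illustration, and the sketch assumes every path starts with "/" and has at least two tokens. Element names are lowercased to match the output shown in the first post.

```java
import java.io.StringWriter;
import java.util.Locale;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class PathDomBuilder {
    // Hypothetical helper: builds one small DOM document per path and serializes it.
    static String toXml(String path) {
        // tokens[0] is the empty string before the leading "/".
        String[] tokens = path.split("/");
        if (tokens.length < 3 || !tokens[0].isEmpty()) {
            throw new IllegalArgumentException("Expected /element/.../value but got: " + path);
        }
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Node parent = doc;
            // Every token except the last becomes a nested element.
            for (int i = 1; i < tokens.length - 1; i++) {
                Element element = doc.createElement(tokens[i].toLowerCase(Locale.ROOT));
                parent.appendChild(element);
                parent = element;
            }
            // The last token becomes the text of the innermost element.
            parent.appendChild(doc.createTextNode(tokens[tokens.length - 1]));

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("Could not build XML for: " + path, e);
        }
    }

    public static void main(String[] args) {
        String[] paths = {"/Book/Subject/Fiction", "/Periodical/Type/Magazine"};
        for (String path : paths) {
            System.out.println(toXml(path));
        }
    }
}
```

One advantage of going through DOM is that the serializer escapes characters such as & and < in the value. Merging several paths under a shared parent element would need extra lookup logic that this sketch leaves out.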
 
Henry Wong
author
Marshal
Posts: 21117
Originally posted by manish ahuja:

That was a neat way to achieve it. But in my case it fails, as I don't have a fixed number of tokens.

Can you give us some more hints?


What more hints do you need? You acknowledged that you got all the tokens via the split() method. You just have to build the XML string -- which can be accomplished with a loop, writing to a string buffer.
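
For readers following along, the loop might be sketched roughly as below. This is an illustration under stated assumptions, not code from the thread: the class and method names are invented, the path is assumed to start with "/" and contain at least two tokens, and StringBuilder is used in place of StringBuffer since no synchronization is needed. Tag names are lowercased to match the desired output in the first post, and the value is not XML-escaped.

```java
import java.util.Locale;

class PathXmlWriter {
    // Hypothetical helper: works for any number of levels (3, 4, 5, ...).
    static String toXml(String path) {
        // tokens[0] is the empty string before the leading "/".
        String[] tokens = path.split("/");
        if (tokens.length < 3 || !tokens[0].isEmpty()) {
            throw new IllegalArgumentException("Expected /element/.../value but got: " + path);
        }
        int last = tokens.length - 1;
        StringBuilder xml = new StringBuilder();
        // Opening tags, outermost first.
        for (int i = 1; i < last; i++) {
            xml.append('<').append(tokens[i].toLowerCase(Locale.ROOT)).append('>');
        }
        // The final token is the value of the innermost element.
        xml.append(tokens[last]);
        // Closing tags, innermost first.
        for (int i = last - 1; i >= 1; i--) {
            xml.append("</").append(tokens[i].toLowerCase(Locale.ROOT)).append('>');
        }
        return xml.toString();
    }

    public static void main(String[] args) {
        // prints <book><subject>Fiction</subject></book>
        System.out.println(toXml("/Book/Subject/Fiction"));
    }
}
```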

Henry
 
manish ahuja
Ranch Hand
Posts: 312
Hi Henry,

Yeah, that's what we are doing.
From your earlier post we thought you were hinting at some advanced regex pattern functions.

Sorry about the additional posts.


Thanks,
 
Henry Wong
author
Marshal
Posts: 21117
Originally posted by manish ahuja:
Yeah, that's what we are doing.
From your earlier post we thought you were hinting at some advanced regex pattern functions.


Well, I guess you could use a combination of find(), appendReplacement(), and appendTail() to do the regex search and XML generation in a single pass, instead of having separate passes for split() and the generation of the XML, but...

Considering that you have only 3 to 5 tokens in the path, it shouldn't make any difference. What you are doing should work well.
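
As an illustration of the single-pass idea, here is one possible sketch. The class name, method name, and pattern are assumptions for the example, not code from the thread. It assumes the path starts with "/" and has no trailing slash, uses StringBuffer because the StringBuilder overloads of appendReplacement() and appendTail() only exist from Java 9 onward, and does not XML-escape the value.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SinglePassPathToXml {
    // One match per "/token" segment; group 1 is the token without the slash.
    private static final Pattern TOKEN = Pattern.compile("/([^/]+)");

    // Hypothetical helper: converts the path while scanning it once.
    static String toXml(String path) {
        Matcher matcher = TOKEN.matcher(path);
        StringBuffer out = new StringBuffer();
        Deque<String> openTags = new ArrayDeque<>(); // used as a stack
        while (matcher.find()) {
            String token = matcher.group(1);
            String replacement;
            if (matcher.end() < path.length()) {
                // Not the last token: emit an opening tag and remember it.
                String tag = token.toLowerCase(Locale.ROOT);
                openTags.push(tag);
                replacement = "<" + tag + ">";
            } else {
                // Last token: emit the value, then close tags innermost first.
                StringBuilder tail = new StringBuilder(token);
                while (!openTags.isEmpty()) {
                    tail.append("</").append(openTags.pop()).append('>');
                }
                replacement = tail.toString();
            }
            // quoteReplacement() keeps "$" and "\" in tokens from being read as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints <book><subject>Fiction</subject></book>
        System.out.println(toXml("/Book/Subject/Fiction"));
    }
}
```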

Henry
 