merge XML documents using schema to determine sequencing

 
Greenhorn
Posts: 27
Hi,

I am trying to merge 2 XML documents.

I have written some crude code which does this, but it does not use the schema and when I validate the new XML against the schema, I find that some of my new nodes have the correct parent but are out of sequence.

I would like to use the schema to guide the merge; to determine where nodes should be inserted.

The code I am writing to do this is very cumbersome and fragile. Is there an established way of achieving this? XSLT?

I have considered using JAXB to generate a model and then perform the merge on the model (using reflection to set properties) and then marshalling to the XML. But this doesn't seem very nice either.

Any advice much appreciated!

Kato
 
Ranch Hand
Posts: 734
If you mean that you study the two schemas yourself and come up with a definite plan for merging one document into the other, then writing an XSLT stylesheet as a template reflecting that plan can surely do the job quite effectively. If you mean letting the machine parse the two schemas and come up with a definite plan for the merge by itself, that alone would be a formidable task.
 
kato Kwong
Greenhorn
Posts: 27
Thanks g tsuji.

In fact, both XML files share 1 schema, so perhaps this makes things a little easier.

But what I am doing (which seems very bad) is

1) iterating through each node in XML doc1
2) look up the required location of the node in XML doc2 using XPath and the schema
3) find the location in XML doc2 and insert node

And there seem to be a lot of loops and tests and it is very messy.
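
The three steps above can be sketched roughly as follows, as a minimal illustration rather than the code actually used here. It assumes a hypothetical root whose children must appear in the order a, b, c; the `ORDER` list stands in for whatever sequence the shared schema declares, and the class and method names are made up for the example. Elements missing from `ORDER` are not handled in any meaningful way.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class OrderedInsert {
    // Hypothetical child order, as an xs:sequence in the shared schema might declare it.
    static final List<String> ORDER = Arrays.asList("a", "b", "c");

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Insert a copy of 'incoming' under 'parent', before the first existing child
    // whose name comes later in ORDER; append it if there is no such child.
    static void insertInOrder(Element parent, Element incoming) {
        int rank = ORDER.indexOf(incoming.getNodeName());
        Node copy = parent.getOwnerDocument().importNode(incoming, true);
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE
                    && ORDER.indexOf(n.getNodeName()) > rank) {
                parent.insertBefore(copy, n);
                return;
            }
        }
        parent.appendChild(copy);
    }

    // Merge every child element of doc1's root into doc2's root and
    // return the names of doc2's children, concatenated in their final order.
    static String mergedOrder(String xml1, String xml2) throws Exception {
        Element from = parse(xml1).getDocumentElement();
        Element into = parse(xml2).getDocumentElement();
        for (Node n = from.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                insertInOrder(into, (Element) n);
            }
        }
        StringBuilder names = new StringBuilder();
        for (Node n = into.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                names.append(n.getNodeName());
            }
        }
        return names.toString();
    }
}
```

Even in this flat case the ordering rule has to be repeated for every parent type in the schema, which is where the loops and tests start to pile up.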

So do you think XSLT is the best tool to use here? I want to guarantee that the merged document will always validate against the schema.

Cheers,
Kato
 
g tsuji
Ranch Hand
Posts: 734
>So do you think XSLT is the best tool to use here?
I always distrust superlative claims. But I think it is quite an effective tool for the job. The validity of the final document can only be guaranteed by the logic built into the XSL document, so it depends on the perspicacity of the stylesheet's author. Other than that, there is no guarantee. The reason is that it is an overwhelmingly complicated task to code.

I can cook up a demo, and you'll see that in the details it is not as trivial as one might think.

[1] Suppose the common schema looks like this.

[2] The two xml documents look like this, for instance.


Watch carefully the possible missing tags.
[3] The xsl document can look like this using the lowest common denominator of xslt 1.0.

[3.1] I make more provisions in the XSL than strictly necessary, which is why it looks longer than the minimum. The elements a, b and c can be complicated complexTypes and it will perform the same. There can be elements other than a, b and c inside the root in container.xml, and they will be preserved (that's why there is an identity transformation at the start of it). In any case, it shows that sequencing a, b and c is already not so naive, because any of them can be absent. I leave a couple of repetitive xsl:if blocks there to highlight the implementation of the sequence with minOccurs="0" and maxOccurs="unbounded". You can try to put the logic into a couple of named templates.
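
As a much-reduced illustration of the same idea (not the stylesheet from this post), the sketch below drives an XSLT 1.0 transform from Java using only the JDK's built-in processor. It assumes a hypothetical root containing only a, b and c elements, and it sidesteps document() by wrapping both inputs under a made-up `merge` element before transforming; the class name, method name and wrapper element are all assumptions for the example. Because `xsl:copy-of` on an empty node-set emits nothing, absent elements need no explicit xsl:if here.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class XsltMerge {
    // XSLT 1.0: rebuild <root>, emitting all a, then all b, then all c.
    // Within each group the first wrapped document's elements come first,
    // because both roots live in one tree and are selected in document order.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/merge'>"
      + "<root>"
      + "<xsl:copy-of select='root[1]/@*'/>"
      + "<xsl:copy-of select='root/a'/>"
      + "<xsl:copy-of select='root/b'/>"
      + "<xsl:copy-of select='root/c'/>"
      + "</root>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String merge(String containerXml, String incomingXml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        DocumentBuilder db = dbf.newDocumentBuilder();

        // Wrap both document roots under a single hypothetical <merge> element.
        Document wrapper = db.newDocument();
        Element merge = wrapper.createElementNS(null, "merge");
        wrapper.appendChild(merge);
        for (String xml : new String[] {containerXml, incomingXml}) {
            Document d = db.parse(new InputSource(new StringReader(xml)));
            merge.appendChild(wrapper.importNode(d.getDocumentElement(), true));
        }

        Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(wrapper), new StreamResult(out));
        return out.toString();
    }
}
```

Calling `XsltMerge.merge(containerXml, incomingXml)` returns the merged document as a string. A real schema with nested complex types, or with other elements interleaved in the root, needs considerably more logic than these three `xsl:copy-of` instructions.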

[3.2] Imagine a more complicated situation, and asking the XSL document to make sure the resulting output validates as well: it is a very complicated task.
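
Since the stylesheet itself cannot prove validity, one pragmatic safeguard is to validate the merged result after the transform. The sketch below uses the standard `javax.xml.validation` API; the inline schema is a hypothetical stand-in (a root holding a sequence of optional, repeatable a, b and c string elements), not the schema from this thread, and the class and method names are made up.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class ValidateMerged {
    // Hypothetical schema: root = sequence of a*, b*, c* (each minOccurs=0, maxOccurs=unbounded).
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='root'><xs:complexType><xs:sequence>"
      + "<xs:element name='a' type='xs:string' minOccurs='0' maxOccurs='unbounded'/>"
      + "<xs:element name='b' type='xs:string' minOccurs='0' maxOccurs='unbounded'/>"
      + "<xs:element name='c' type='xs:string' minOccurs='0' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns true if the document validates against XSD, false on a validation error.
    static boolean isValid(String xml) throws Exception {
        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
        try {
            // With no ErrorHandler set, the validator throws on the first error.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }
}
```

This only detects an out-of-sequence merge after the fact; it does not make the merge logic itself correct.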

[3.3] Although already fairly elaborate, the XSL assumes that containing.xml contains at least one a, b or c element. It can be elaborated further to accommodate the case where there are none of them. I leave that to you as an exercise.

[4] Late edit note: Upon re-reading what I posted, I found a loophole in certain xsl:if blocks, where a double-counting condition is required. I have re-edited that part. This note records that edit to avoid any confusion.
 
kato Kwong
Greenhorn
Posts: 27
Thanks very much g tsuji.

That is a most comprehensive response! No need to distrust the superlative this time.

I will give this a go and see what comes of it.
 