Xerces-2 Parsing to DOM and default attributes question

 
Mark Williams
Ranch Hand
Posts: 66
I am trying to use the Xerces implementation of JAXP to load a Hibernate mapping file into a DOM and then write the XML back out using Xalan. It works pretty well, but in the final XML every attribute that has a default value defined in the DTD appears, even when the source file did not specify it. The attributes are also written out in alphabetical order. I don't want either of these behaviors. I tried setting a few different parser features but couldn't figure out how to stop it.

Is there any way I can stop Xerces from adding the default attributes and also leave them in the order they were read from the original XML?

Thanks.
 
Paul Clapham
Marshal
Posts: 28193
At least for the order of attributes, the XML Recommendation specifically says that is not significant. So there is no point in asking for any particular order.

As for the attributes with default values, if the output document doesn't have the same DTD included in it then you would want those attributes to appear, wouldn't you? And if it does have the DTD included, then those attributes may be redundant but they aren't incorrect and they don't change the meaning of the document in any way.

So it looks to me like you have two non-problems here. Unless there's some operational reason that those things are causing trouble?
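
To illustrate the point about attribute order, here is a minimal sketch using only the standard JAXP and DOM Level 3 APIs. The class name AttributeOrderDemo and the sample documents are made up for this example. It parses two documents that differ only in attribute order and compares them with Node.isEqualNode, which compares attribute maps without regard to position.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the original thread.
public class AttributeOrderDemo {

    // Parses an XML string into a DOM Document with the default JAXP parser.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    // Compares the root elements of two documents using DOM Level 3 isEqualNode.
    static boolean sameContent(String xmlA, String xmlB) {
        return parse(xmlA).getDocumentElement()
                .isEqualNode(parse(xmlB).getDocumentElement());
    }

    public static void main(String[] args) {
        // Same attributes, different order in the source text.
        System.out.println(sameContent("<a x=\"1\" y=\"2\"/>", "<a y=\"2\" x=\"1\"/>"));
    }
}
```

This only shows that the two forms carry the same information as far as XML tooling is concerned; it says nothing about how a text diff will treat them.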
 
Mark Williams
Ranch Hand
Posts: 66
The only issue I have is that I am trying to make an automated change to several thousand Hibernate mappings. Where I work, we let our customers have the source code and they often make their own changes. All of the unnecessary changes introduced by the transformation would make it harder for customers to take an upgrade and bring their modifications forward.

Other than that, I agree that it's a non-issue. Thanks for the response.
 
Jimmy Clark
Ranch Hand
Posts: 2187
Xerces is a low-level XML parser. It does not produce any XML-based data. Xerces does not "add" any attributes and it does not alphabetize them either.

Your concern lies in whatever is creating the XML-based data, e.g. DOM implementation, Xalan implementation.
 
Mark Williams
Ranch Hand
Posts: 66

Jimmy Clark wrote:Xerces is a low-level XML parser. It does not produce any XML-based data. Xerces does not "add" any attributes and it does not alphabetize them either.

Your concern lies in whatever is creating the XML-based data, e.g. DOM implementation, Xalan implementation.



So, I think I am confused about which implementation is responsible for which portion of JAXP. Where is the line that marks the end of Xerces's responsibility and the beginning of Xalan's responsibility?

I assumed Xalan was parsing the XML and building the DOM. Then I thought Xalan was transforming the DOM into XML. But I guess going from XML to DOM is a transformation and that would seem to also fall in Xalan's court. I am confused...
 
Mark Williams
Ranch Hand
Posts: 66
Well it looks like the Xalan Design link from the Xalan website clears up some of the confusion. I'll have to look at it in more detail after getting some rest. I don't understand the process completely at this point but I think the information I need is all there.
 
Jimmy Clark
Ranch Hand
Posts: 2187
Xerces and Xalan are different applications.

Xerces is a "parser." In practice, an application is created which receives information about an XML-based document from Xerces.


Mark Williams wrote:Where is the line that marks the end of Xerces's responsibility and the beginning of Xalan's responsibility?



Xerces is simply an XML-based parser. It reads an XML-based document, makes sure it follows all the rules and passes information about the document to an application.

Programmers create applications that receive this information from Xerces. Xalan is an example of one of these applications. An XSLT Engine is another example.

Your application is not interacting with Xerces directly. If it was, you would know a bit more about how it works. Your application is using Xalan which is DOM-based. It is reading an XML document, creating a DOM model of the document and then creating another XML document.
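
The read, build, write sequence described above can be sketched with plain JAXP calls. This is an illustration under assumptions, not the original poster's code: the class name RoundTrip and the sample element are invented. The DocumentBuilder step is where the parser builds the DOM, and the Transformer step (an identity transform with no stylesheet) is where the DOM is serialized back to text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; names are illustrative only.
public class RoundTrip {

    // Step 1: the parser behind DocumentBuilderFactory turns text into a DOM tree.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    // Step 2: an identity Transformer (no stylesheet) writes the DOM tree back out as text.
    static String serialize(Document doc) {
        try {
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("serialize failed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<class name=\"Order\" table=\"ORDERS\"/>");
        // Shows which DOM implementation the parser produced.
        System.out.println(doc.getClass().getName());
        System.out.println(serialize(doc));
    }
}
```

Inspecting the Document between the two steps is a simple way to tell whether a given change was introduced while parsing or while serializing.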




 
Mark Williams
Ranch Hand
Posts: 66
Jimmy, I follow what you are saying but it doesn't appear that my application's behavior matches what you describe.

When I get a new instance of DocumentBuilderFactory, an instance of org.apache.xerces.jaxp.DocumentBuilderFactoryImpl is returned.
The document object is an instance of org.apache.xerces.dom.DeferredDocumentImpl.

When I inspect the DOM in the Eclipse debugger immediately after calling DocumentBuilder.parse(), all of the default attributes already appear in the DOM representation of the source.

Does that make any sense?
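
For anyone who wants to repeat this check without a debugger, a small sketch along these lines prints the concrete classes that JAXP hands back. The class name WhichParser is invented for the example, and the names printed depend on which parser is on the classpath (the JDK's bundled parser reports different package names than a standalone Xerces jar).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the original thread.
public class WhichParser {

    // Name of the concrete DocumentBuilderFactory that JAXP selected.
    static String factoryClassName() {
        return DocumentBuilderFactory.newInstance().getClass().getName();
    }

    // Name of the concrete Document class produced by parsing the given XML.
    static String documentClassName(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getClass().getName();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("factory:  " + factoryClassName());
        System.out.println("document: " + documentClassName("<a/>"));
    }
}
```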

 
Jimmy Clark
Ranch Hand
Posts: 2187
I see. See if you can modify the application to not use the DTD. Without the DTD, there is no way to know what the default attributes would be.

If this works, then you could incorporate a validation step against the DTD prior to the operation.
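
One way to try this, sketched under assumptions: Xerces documents a parser feature, http://apache.org/xml/features/nonvalidating/load-external-dtd, that tells a non-validating parser whether to read the external DTD. The class name SkipExternalDtd, the temporary files, and the sample DTD below are invented for the demonstration, and other parsers may not recognize this feature.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class; file names and DTD content are made up.
public class SkipExternalDtd {

    // Xerces-specific feature; assumed to be supported by the parser in use.
    static final String LOAD_EXTERNAL_DTD =
            "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Parses the file without validation, optionally skipping the external DTD.
    static Document parse(File xmlFile, boolean loadExternalDtd) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false);
            factory.setFeature(LOAD_EXTERNAL_DTD, loadExternalDtd);
            return factory.newDocumentBuilder().parse(xmlFile);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    // Writes a DTD that declares a defaulted attribute, plus a document that omits it.
    static File writeSample() {
        try {
            Path dir = Files.createTempDirectory("dtd-demo");
            String dtd = "<!ELEMENT item EMPTY>\n"
                    + "<!ATTLIST item name CDATA #REQUIRED lazy CDATA \"true\">\n";
            String xml = "<!DOCTYPE item SYSTEM \"sample.dtd\">\n<item name=\"a\"/>\n";
            Files.write(dir.resolve("sample.dtd"), dtd.getBytes(StandardCharsets.UTF_8));
            Path xmlPath = dir.resolve("sample.xml");
            Files.write(xmlPath, xml.getBytes(StandardCharsets.UTF_8));
            return xmlPath.toFile();
        } catch (Exception e) {
            throw new IllegalStateException("could not write sample files", e);
        }
    }

    // True when the defaulted "lazy" attribute shows up on the root element.
    static boolean hasDefaultedAttribute(File xmlFile, boolean loadExternalDtd) {
        return parse(xmlFile, loadExternalDtd).getDocumentElement().hasAttribute("lazy");
    }

    public static void main(String[] args) {
        File sample = writeSample();
        System.out.println("DTD loaded:  " + hasDefaultedAttribute(sample, true));
        System.out.println("DTD skipped: " + hasDefaultedAttribute(sample, false));
    }
}
```

Note that defaults declared in an internal DTD subset would still be applied, since the internal subset is part of the document itself. Skipping the DTD also means entities declared there are not available.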
 
Mark Williams
Ranch Hand
Posts: 66
Jimmy, turning off the load of the external DTD did the trick. Thanks.

As for the order of the attributes, I believe I am going to have to take another approach.

From what I read, attributes are stored in org.apache.xerces.dom.NamedNodeMapImpl. I can't prove it, but judging by the logic used to place attribute nodes into NamedNodeMapImpl's Vector-based internal storage, it appears to me that the attributes will always be stored in alphabetical order.
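
A quick way to test that hypothesis empirically is to parse an element whose attributes are deliberately not in alphabetical order and list them in the order the DOM reports. The class name AttributeMapOrder and the sample element are invented for this sketch, and the order printed is implementation-dependent, so it is not stated here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the original thread.
public class AttributeMapOrder {

    // Returns the root element's attribute names in the order the DOM exposes them.
    static List<String> attributeNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            NamedNodeMap attrs = doc.getDocumentElement().getAttributes();
            List<String> names = new ArrayList<>();
            for (int i = 0; i < attrs.getLength(); i++) {
                names.add(attrs.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // Source order is table, name, lazy; compare with what gets printed.
        System.out.println(attributeNames("<class table=\"T\" name=\"N\" lazy=\"false\"/>"));
    }
}
```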
 
Paul Clapham
Marshal
Posts: 28193
You could certainly produce a forked version of Xalan which preserved the order of the attributes, but that seems like overkill to me. Are you really proposing to do that?

Actually when I say "certainly" that's an exaggeration. Xalan is getting the attributes from the DOM, so you would have to persuade the DOM to preserve the order of the attributes. Which I don't believe it does. You might have to write your own DOM implementation in that case, which might be harder unless you could find an open-source implementation to start from.
 
Mark Williams
Ranch Hand
Posts: 66
Whoa, don't get carried away! I definitely didn't propose a Xerces fork. I was thinking more along the lines of using a SAX handler to process the elements I am interested in while passing the ones I don't care about straight through to the output.
 
Paul Clapham
Marshal
Posts: 28193
Okay... but SAX doesn't preserve the order of the attributes either. The startElement method passes you something which amounts to a Map of the attributes attached to the element.
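
For reference, here is a minimal sketch of what startElement receives. The org.xml.sax.Attributes interface can be read by index or by name, and its documentation states that the order of attributes in the list is unspecified and varies between implementations. The class name SaxAttributes and the sample input are invented for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of the original thread.
public class SaxAttributes {

    // Collects "element@attribute=value" strings in the order the parser reports them.
    static List<String> attributes(String xml) {
        List<String> seen = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // The reported order is whatever the parser chooses; SAX does not guarantee it.
                for (int i = 0; i < atts.getLength(); i++) {
                    seen.add(qName + "@" + atts.getQName(i) + "=" + atts.getValue(i));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(attributes("<class table=\"T\" name=\"N\"/>"));
    }
}
```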
 
Mark Williams
Ranch Hand
Posts: 66

Paul Clapham wrote:Okay... but SAX doesn't preserve the order of the attributes either. The startElement method passes you something which amounts to a Map of the attributes attached to the element.



Yikes, glad I didn't spend any time reworking the approach to use SAX! I guess I'll be hacking something together to suit my needs. Thanks for the advice.
 
Paul Clapham
Marshal
Posts: 28193
Yeah, that's the thing. The order of attributes is unimportant so XML software treats it as unimportant. In Java that means some kind of map from names to values. If it's important to you then that's a non-XML requirement and so a non-XML solution would be necessary. I assume that's what you have in mind now?
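
As one purely illustrative example of a non-XML approach, the file can be edited as text so that everything outside the targeted attribute value stays byte-for-byte identical. The thread does not say what change was actually needed, so the class TextAttributeEdit, the method setAttribute, and the element and attribute names below are all hypothetical. The sketch assumes double-quoted attribute values, no ">" inside attribute values of the targeted start tag, and a replacement value that is already XML-escaped; it also does not distinguish real tags from look-alike text inside comments or CDATA sections.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; a text-level edit, not an XML-aware one.
public class TextAttributeEdit {

    // Replaces the value of one attribute on every start tag of the given element,
    // leaving all other characters of the input unchanged.
    static String setAttribute(String xml, String element, String attr, String newValue) {
        Pattern startTagAttr = Pattern.compile(
                "(<" + Pattern.quote(element) + "\\s(?:[^>]*?\\s)?"
                        + Pattern.quote(attr) + "\\s*=\\s*\")[^\"]*(\")");
        // Group 1 is everything up to the opening quote, group 2 is the closing quote.
        return startTagAttr.matcher(xml)
                .replaceAll("$1" + Matcher.quoteReplacement(newValue) + "$2");
    }

    public static void main(String[] args) {
        String mapping = "<class name=\"A\" lazy=\"false\">\n  <id name=\"id\"/>\n</class>";
        System.out.println(setAttribute(mapping, "class", "lazy", "true"));
    }
}
```

Running the result through a parser afterwards is a cheap check that the edited file is still well-formed.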