XML document parsing.

Jason Pepper

Joined: Oct 15, 2005
Posts: 9
Here is what I am trying to do: I need to find out which tags are supported by parsing an XSD document. Unfortunately the XSD is not monolithic; it is split across several includes. I tried dom4j, JDOM and the Apache Xerces SAXParser. In each case the parser recognizes the includes but does not actually load the included files inline (as I would expect), so I cannot get the full set of tags.

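// Xerces parser class: org.apache.xerces.parsers.SAXParser
SAXParser parser = new SAXParser();
// ContentHandler is an interface, so register a concrete handler (DocHandler, e.g. extending DefaultHandler)
parser.setContentHandler(new DocHandler());
parser.parse("my.xsd");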

my.xsd is actually

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:include schemaLocation="../../my_other1.xsd"/>
<xs:include schemaLocation="../../my_other2.xsd"/>

It looks like the two xs:include elements do not get expanded inline.

Is that the expected behaviour? And how does schema validation work, given that at some point something must be expanding these includes?
Paul Clapham

Joined: Oct 14, 2005
Posts: 19973

The parser does not apply any meaning to the element names when it parses an XML file. And that's exactly what your code does there: it tells the parser to parse an XML file. So it sees the element name "xs:include" and treats it just like any other element name, nothing more. So yes, that's the expected behaviour.
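
To illustrate, here is a minimal sketch using the JAXP SAX parser bundled with the JDK rather than the Xerces class from the question. The class name PlainParseDemo and the inline schema string are made up for the example. It collects every element name the parser reports; the xs:include shows up as an ordinary element and the file it points to is never opened.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: parses a schema document as plain XML.
public class PlainParseDemo {

    /** Returns the qualified names of all elements, in document order. */
    public static List<String> elementNames(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        final List<String> names = new ArrayList<>();
        factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        // The parser only reports the name; it attaches no schema meaning to it.
                        names.add(qName);
                    }
                });
        return names;
    }

    public static void main(String[] args) throws Exception {
        // The included file does not exist, and the parse still succeeds.
        String xsd = "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
                + "<xs:include schemaLocation='my_other1.xsd'/>"
                + "</xs:schema>";
        System.out.println(elementNames(xsd));
    }
}
```

The list contains only xs:schema and xs:include, which matches what you observed with dom4j, JDOM and Xerces.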

Now, when you parse some other XML file and tell the parser that your file is the schema that the other XML file is supposed to conform to, the parser understands that it's dealing with a schema. In that case, when it sees the element name "xs:include", it understands that it must locate the referenced schema files and pull their declarations into the schema it is building.
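
As a rough sketch of that second case, the example below uses the standard javax.xml.validation API, which is one way to ask for schema-aware processing (it is not necessarily what any particular tool does internally). The class name IncludeResolutionDemo, the temp-directory layout and the element name "item" are assumptions made up for the demo. The main schema contains nothing but an xs:include, so an instance document can only validate if the include was actually resolved.

```java
import java.io.File;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

// Hypothetical demo class: shows xs:include being resolved when a schema is compiled.
public class IncludeResolutionDemo {
    private static final String XS = "http://www.w3.org/2001/XMLSchema";

    /** Writes a two-file demo schema to a temp directory and returns the main file. */
    public static File writeDemoSchemas() throws Exception {
        Path dir = Files.createTempDirectory("xsd-demo");
        // The included file declares the only element ("item" is a made-up name).
        String included = "<xs:schema xmlns:xs='" + XS + "'>"
                + "<xs:element name='item' type='xs:string'/>"
                + "</xs:schema>";
        // The main file declares nothing itself; it only includes the other one.
        String main = "<xs:schema xmlns:xs='" + XS + "'>"
                + "<xs:include schemaLocation='my_other1.xsd'/>"
                + "</xs:schema>";
        Files.write(dir.resolve("my_other1.xsd"), included.getBytes(StandardCharsets.UTF_8));
        Path mainFile = dir.resolve("my.xsd");
        Files.write(mainFile, main.getBytes(StandardCharsets.UTF_8));
        return mainFile.toFile();
    }

    /** Returns true if instanceXml validates against the schema in schemaFile. */
    public static boolean isValid(File schemaFile, String instanceXml) {
        try {
            SchemaFactory factory =
                    SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            // Compiling the schema is the step that follows xs:include;
            // the relative schemaLocation is resolved against the main file's location.
            Schema schema = factory.newSchema(schemaFile);
            schema.newValidator().validate(
                    new StreamSource(new StringReader(instanceXml)));
            return true;
        } catch (Exception e) {
            // Validation errors (or an unreadable schema) end up here.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        File schema = writeDemoSchemas();
        System.out.println(isValid(schema, "<item>x</item>"));
        System.out.println(isValid(schema, "<unknown/>"));
    }
}
```

The first document should be reported as valid even though "item" is declared only in the included file, while the second should be rejected because no declaration for "unknown" exists in either file.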