posted 17 years ago
Hi,
I have a web application that needs to accept an XSD and a corresponding XML as input via form parameters.
I need to parse the XSD, identify the simple and complex types and other necessary information, and represent this in some sort of structural format (preferably a tree hierarchy).
This structural information will then be used to parse the corresponding XML file, record by record, and write it to a CSV file.
The XSD will use only a limited set of XSD constructs (namespaces and imports can be ignored for the time being).
What is the best way that I can go forward with this?
Options considered:
XSD-Java binding tools (XMLBeans, JAXB).
I can get a type hierarchy using these tools.
countries
  -country
    -id
    -name
    -states
      -state
        -id
        -name
This would be the ideal scenario: I could create a new instance hierarchy based on the above type hierarchy (acting as an intermediate in-memory representation), populate the simple types (or attributes) with their respective values as I parse through the XML, and write it to the file. However, the system would have to generate a set of class files for each XSD uploaded. That many generated class files is not acceptable, so this option cannot be considered.
XSOM
Using XSOM, I can create a simple tree structure with two user-defined types, 'ComplexType' and 'SimpleType':
ComplexType
  -Name
  -Set of simple types
  -Set of complex types
SimpleType
  -Name
  -Value
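To make the two user-defined types concrete, here is a minimal Java sketch of such a tree. Everything in it is an assumption for illustration: the class and method names (TypeTreeSketch, addSimple, addComplex, leafPaths, countriesExample) are hypothetical, lists are used instead of sets to keep declaration order, and the tree is built by hand rather than from XSOM.

```java
import java.util.ArrayList;
import java.util.List;

public class TypeTreeSketch {

    /** Leaf node: a named simple type (or attribute) that holds one value. */
    public static final class SimpleType {
        final String name;
        String value; // to be filled in while reading the XML

        SimpleType(String name) { this.name = name; }
    }

    /** Inner node: a named complex type with simple and complex children. */
    public static final class ComplexType {
        final String name;
        // Lists rather than sets, so the column order stays stable (an assumption).
        final List<SimpleType> simpleTypes = new ArrayList<>();
        final List<ComplexType> complexTypes = new ArrayList<>();

        ComplexType(String name) { this.name = name; }

        ComplexType addSimple(String childName) {
            simpleTypes.add(new SimpleType(childName));
            return this;
        }

        ComplexType addComplex(ComplexType child) {
            complexTypes.add(child);
            return this;
        }

        /** Returns the path of every simple type below this node, depth first. */
        public List<String> leafPaths() {
            List<String> out = new ArrayList<>();
            collect(name, out);
            return out;
        }

        private void collect(String prefix, List<String> out) {
            for (SimpleType s : simpleTypes) {
                out.add(prefix + "/" + s.name);
            }
            for (ComplexType c : complexTypes) {
                c.collect(prefix + "/" + c.name, out);
            }
        }
    }

    /** Hand-built version of the countries hierarchy used as the example. */
    public static ComplexType countriesExample() {
        ComplexType state = new ComplexType("state").addSimple("id").addSimple("name");
        ComplexType states = new ComplexType("states").addComplex(state);
        ComplexType country = new ComplexType("country")
                .addSimple("id").addSimple("name").addComplex(states);
        return new ComplexType("countries").addComplex(country);
    }
}
```

The leaf paths are one candidate for CSV column names, since each simple type ends up with a unique path from the root.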
The problem here is I am not sure how I can go about parsing the XML and having an intermediate representation which could be committed to the CSV file. DOM is not an option at all due to memory constraints.
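One direction that avoids DOM is a pull parser such as StAX (javax.xml.stream), which keeps only the current record in memory. The sketch below is a rough illustration under several assumptions, not a worked-out solution: the class name StreamingCsvSketch and its convert method are hypothetical, the record element name and column names are passed in by hand instead of being derived from the XSD, only direct child elements of a record become columns (nested structures such as states are skipped), and attributes and namespaces are ignored.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StreamingCsvSketch {

    /** Streams the XML and writes one CSV line per record element. */
    public static void convert(Reader xml, Writer csv, String recordElement,
                               List<String> columns)
            throws XMLStreamException, IOException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        XMLStreamReader reader = factory.createXMLStreamReader(xml);

        csv.write(String.join(",", columns) + "\n"); // header line

        Map<String, String> row = null;   // non-null only while inside a record
        int depth = 0;                    // element depth below the record element
        String currentColumn = null;      // column whose text is being collected
        StringBuilder text = new StringBuilder();

        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    if (row == null) {
                        if (reader.getLocalName().equals(recordElement)) {
                            row = new HashMap<>();
                            depth = 0;
                        }
                    } else {
                        depth++;
                        // Only direct children of the record are treated as columns.
                        if (depth == 1 && columns.contains(reader.getLocalName())) {
                            currentColumn = reader.getLocalName();
                            text.setLength(0);
                        }
                    }
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.CDATA:
                    if (currentColumn != null && depth == 1) {
                        text.append(reader.getText());
                    }
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    if (row == null) {
                        break;
                    }
                    if (depth == 0) {
                        writeRow(csv, row, columns); // record finished, flush it
                        row = null;
                    } else {
                        if (depth == 1 && currentColumn != null) {
                            row.put(currentColumn, text.toString().trim());
                            currentColumn = null;
                        }
                        depth--;
                    }
                    break;
                default:
                    break;
            }
        }
        reader.close();
    }

    private static void writeRow(Writer csv, Map<String, String> row,
                                 List<String> columns) throws IOException {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < columns.size(); i++) {
            if (i > 0) line.append(',');
            line.append(escape(row.getOrDefault(columns.get(i), "")));
        }
        csv.write(line.append('\n').toString());
    }

    /** Quotes a field if it contains a comma, quote, or line break. */
    private static String escape(String field) {
        if (field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }
}
```

Memory use here is bounded by the size of one record rather than the whole document. How nested repeating elements (for example several state entries per country) should map to CSV rows is left open in this sketch.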
Example:
XML
CSV
Awaiting your feedback and comments on this. I have been burning my head for the past few days trying to figure out the most efficient and scalable solution to this.
Many thanks,
J