I have a requirement to process hundreds of XML documents. What needs to be implemented is as follows:
1. Create an XML request (I am planning to use JDOM here; would you suggest anything else?).
2. Send the request XML as an HTTP POST to a URL (making the HTTP connection in a thread).
3. Receive the response XML and process it.
My question is about step 3 above: how to extract data from specific nodes of the response XML.
Question 1: Should I unmarshal the response to an object or use some other technique?
Question 2: If I have to unmarshal, should I use Castor or some newer technology?
Keep in mind that I have to process hundreds of XML documents, and performance and
scalability are my major considerations. Any thoughts on best practices?
Regards,
GR
Ivan Krizsan
Bartender
Joined: Oct 04, 2006
Posts: 2193
Hi!
If you only need to do one pass over an XML response to get the data, then you can use an event-based XML parser, like StAX.
Such a parser is typically both faster and more memory-efficient than parsers that build an object tree in memory, because it never holds the whole document at once.
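To make the one-pass idea concrete, here is a minimal sketch using the StAX cursor API that ships with the JDK (`javax.xml.stream`). The class name `StaxExtractSketch`, the method `textOf`, and the `<order>`/`<status>` element names are made up for illustration; the real response format is not known from this thread. It also assumes the elements of interest contain only text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class StaxExtractSketch {

    // Created once and reused. Whether a factory may be shared across
    // threads depends on the StAX implementation, so check yours.
    private static final XMLInputFactory FACTORY = newFactory();

    private static XMLInputFactory newFactory() {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Responses come from a remote system, so DTD processing is switched off.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        return factory;
    }

    /** Collects the text of every element with the given local name in a single pass. */
    static List<String> textOf(String xml, String elementName) {
        List<String> values = new ArrayList<>();
        try {
            XMLStreamReader reader = FACTORY.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && elementName.equals(reader.getLocalName())) {
                        // Reads up to the matching end tag; only valid for text-only elements.
                        values.add(reader.getElementText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML response", e);
        }
        return values;
    }

    public static void main(String[] args) {
        // Hypothetical response payload, for illustration only.
        String response = "<response><order><id>17</id><status>SHIPPED</status></order>"
                + "<order><id>18</id><status>PENDING</status></order></response>";
        System.out.println(textOf(response, "status")); // prints [SHIPPED, PENDING]
    }
}
```

Nothing is kept in memory apart from the values you ask for, which is where the memory advantage over a tree-building parser comes from.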
However, I do suggest creating sample programs and trying them out on real data, or on data that is very similar to real data.
I have heard that there may even be performance differences between event-based parsers, depending on the size of the data to parse.
You might even end up using one parser for smaller messages and another for large messages.
Finally: Design your program so that you can easily switch between different parsing technologies. This way you can always correct any mistakes later.
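One way to keep the parsing technology swappable is to hide it behind a small interface, so the rest of the program never sees StAX or DOM types. The sketch below is only an illustration: `ResponseParser`, `StaxResponseParser`, `DomResponseParser` and the sample element names are hypothetical, and both implementations use only the JDK's built-in XML APIs.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical seam: callers depend on this interface only. */
interface ResponseParser {
    List<String> extract(String xml, String elementName);
}

/** Event-based implementation: one forward pass, no tree in memory. */
final class StaxResponseParser implements ResponseParser {
    private final XMLInputFactory factory = XMLInputFactory.newInstance();

    @Override
    public List<String> extract(String xml, String elementName) {
        List<String> values = new ArrayList<>();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && elementName.equals(reader.getLocalName())) {
                        values.add(reader.getElementText());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse XML response", e);
        }
        return values;
    }
}

/** Tree-based implementation: builds the whole document, then queries it. */
final class DomResponseParser implements ResponseParser {
    private final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

    @Override
    public List<String> extract(String xml, String elementName) {
        List<String> values = new ArrayList<>();
        try {
            NodeList nodes = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getElementsByTagName(elementName);
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse XML response", e);
        }
        return values;
    }
}

final class ParserSwitchDemo {
    public static void main(String[] args) {
        // Hypothetical payload; pass "dom" as the first argument to switch implementations.
        String xml = "<response><status>OK</status><status>RETRY</status></response>";
        ResponseParser parser = args.length > 0 && args[0].equals("dom")
                ? new DomResponseParser()
                : new StaxResponseParser();
        System.out.println(parser.extract(xml, "status"));
    }
}
```

With this shape, moving from one parser to another (or picking one per message size) is a one-line change where the parser is constructed.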
Best wishes!
Sure, I will look into StAX. What about creating the XML request: should I use JDOM or simply a StringBuilder? Which has better performance?
Ivan Krizsan
Hi!
I don't dare tell you that technique A will give you the best performance.
The answer is the classical "it depends".
The best advice I can give is to perform some kind of benchmark using the kind of data, or at least very similar, that you intend to use in a production environment.
That way you will know for sure what is the best decision, given your particular situation.
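As a starting point, a crude timing harness could look like the sketch below. Several things in it are assumptions: JDOM is not part of the JDK, so the JDK's `XMLStreamWriter` stands in as the "library" candidate (a JDOM-based builder would be plugged in the same way), and the `<request>` layout, class name and method names are invented. Hand-rolled timing like this is easily distorted by JIT warm-up and garbage collection, so treat the numbers as a rough hint and consider a harness such as JMH for anything you base a decision on.

```java
import java.io.StringWriter;
import java.util.function.IntFunction;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class RequestBuildBenchmark {

    private static final XMLOutputFactory OUT = XMLOutputFactory.newInstance();

    // Written to on every iteration so the JIT cannot discard the work being timed.
    static volatile long sink;

    /** Hand-written escaping; with a StringBuilder this is your responsibility. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static String withStringBuilder(int id, String name) {
        return new StringBuilder(64)
                .append("<request><id>").append(id).append("</id><name>")
                .append(escape(name)).append("</name></request>")
                .toString();
    }

    static String withStreamWriter(int id, String name) {
        StringWriter out = new StringWriter(64);
        try {
            XMLStreamWriter writer = OUT.createXMLStreamWriter(out);
            writer.writeStartElement("request");
            writer.writeStartElement("id");
            writer.writeCharacters(Integer.toString(id));
            writer.writeEndElement();
            writer.writeStartElement("name");
            writer.writeCharacters(name); // escaping is handled by the writer
            writer.writeEndElement();
            writer.writeEndElement();
            writer.flush();
            writer.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not build XML request", e);
        }
        return out.toString();
    }

    /** Returns the elapsed nanoseconds for building the given number of requests. */
    static long timeNanos(IntFunction<String> builder, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += builder.apply(i).length();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int iterations = 200_000;
        // Warm-up runs, so the JIT has compiled both paths before measuring.
        timeNanos(i -> withStringBuilder(i, "A & B"), iterations);
        timeNanos(i -> withStreamWriter(i, "A & B"), iterations);

        long sb = timeNanos(i -> withStringBuilder(i, "A & B"), iterations);
        long sw = timeNanos(i -> withStreamWriter(i, "A & B"), iterations);
        System.out.printf("StringBuilder:   %d ms%n", sb / 1_000_000);
        System.out.printf("XMLStreamWriter: %d ms%n", sw / 1_000_000);
    }
}
```

Replace the hypothetical `<request>` layout with requests shaped like your production ones before reading anything into the results.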
Best wishes!
I agree. Here's the link: http://ej-technologies/jprofiler - if it weren't for JProfiler, we would need to
run our stuff on 16 servers instead of 3.