Performance of parsing XML?

Posts: 19
I am writing a simple channel for my company's portal. Basically, the channel gets job advertisements from a company's internal server via an HTTP call. Since the channel will get very high traffic, I want it to be as thin as possible.
When the user clicks the "search" button, the channel will pull in 5-10 job feeds (as separate XML streams). A single feed can contain anywhere from zero to 100 jobs. The feeds have to be assembled for transformation. I see two options:
1. Concatenate the feeds using StringBuffer().
2. Parse the feeds and generate a new DOM.
Which is better considering the performance? Any other solutions?
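For what it's worth, raw string concatenation of complete XML documents produces an invalid result (multiple XML prologs and multiple root elements), so option 2 is usually the safer shape. A minimal sketch of the merge, assuming each feed uses a hypothetical shared root element named "jobs":

```java
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import java.io.StringReader;
import java.util.List;

public class FeedMerger {
    // Parse each feed separately and copy its children under one new
    // root, rather than concatenating raw strings (which would yield
    // a document with multiple prologs and root elements).
    public static Document merge(List<String> feeds) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document merged = builder.newDocument();
        Element root = merged.createElement("jobs"); // assumed root name
        merged.appendChild(root);
        for (String feed : feeds) {
            Document d = builder.parse(new InputSource(new StringReader(feed)));
            NodeList children = d.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                // importNode makes a deep copy owned by the merged document
                root.appendChild(merged.importNode(children.item(i), true));
            }
        }
        return merged;
    }
}
```

The merged DOM can then be fed straight to the XSLT transformer without an intermediate serialization step.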
Ranch Hand
Posts: 171
Hi Greg
I have done something similar in the past. The biggest performance bottleneck is likely to be the speed of your connection when fetching the data from the various feeds, rather than the time taken to parse the returned information.
A lot will depend upon the data that you get back from the feeds. If all of the feeds use a common DTD/Schema then it shouldn't be too difficult. JAXP now allows you to create and cache transformers, which means you will only need to parse the XSLT once.
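To illustrate the caching idea: JAXP's Templates object holds a compiled stylesheet and is thread-safe, so it can be built once and reused across requests. A minimal sketch (the class name and keying scheme here are just illustrative):

```java
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;
import java.io.StringReader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class TransformerCache {
    private static final Map<String, Templates> CACHE = new ConcurrentHashMap<>();

    // Compile the stylesheet once per key; Templates is thread-safe,
    // so subsequent calls skip the expensive XSLT parse entirely.
    public static Transformer get(String key, String xslt)
            throws TransformerException {
        Templates compiled = CACHE.computeIfAbsent(key, k -> {
            try {
                return TransformerFactory.newInstance()
                        .newTemplates(new StreamSource(new StringReader(xslt)));
            } catch (TransformerConfigurationException e) {
                throw new IllegalStateException(e);
            }
        });
        return compiled.newTransformer(); // cheap per-request object
    }
}
```

Each call to newTransformer() is cheap; only the initial newTemplates() call pays the cost of parsing the stylesheet.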
The best option I can suggest is that you try both methods and see what the performance is like. Either way, I would definitely cache the parsed output so that you don't need to go to the source of the feeds for each incoming client request. We added a feature that would cache the incoming feed and/or the generated content for a fixed period of time, e.g. 30 minutes or 24 hours, depending upon the volatility of the source information.
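A time-based cache like the one described can be sketched in a few lines. This is a hypothetical illustration (the class and method names are made up), keyed by feed URL with a fixed time-to-live:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// A simple TTL cache: a feed is re-fetched only after its cached entry
// is older than ttlMillis, so most requests never touch the network.
public class FeedCache {
    private static class Entry {
        final String content;
        final long fetchedAt;
        Entry(String content, long fetchedAt) {
            this.content = content;
            this.fetchedAt = fetchedAt;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public FeedCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // fetcher performs the actual HTTP call when the entry is stale/missing
    public String get(String url, Supplier<String> fetcher) {
        long now = System.currentTimeMillis();
        Entry e = entries.get(url);
        if (e == null || now - e.fetchedAt > ttlMillis) {
            e = new Entry(fetcher.get(), now); // fetch once and cache
            entries.put(url, e);
        }
        return e.content;
    }
}
```

The same pattern works for caching the generated (post-transformation) content instead of the raw feed; which one to cache depends on how many distinct transformations you run per feed.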