
XML vs Database

ashish bhardwaj
Greenhorn

Joined: Jan 27, 2008
Posts: 3
I am looking for views on this:

We have about 10 TB (terabytes) of data, stored across multiple disks. The metadata (data describing the data, such as filename, location, author, description, etc.) may run to gigabytes, say 5 GB. For a web-based application, should the metadata be stored in XML files or in a database such as Oracle or MySQL?

Since the data is going to grow in the future, scalability is required. Which approach will give better performance?

A typical use case: a user wants to find data matching particular criteria, e.g. all files generated between a specified start date and end date, then extract the required data and analyse it to produce statistics, generate plots, etc. The results are generated at runtime, so the user should get good performance.

Since the XML file will be large, DOM is not an option. But is a SAX parser scalable, and will it give good performance?
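For reference, a minimal sketch of what the SAX approach could look like for the date-range query described above. The XML layout (a `file` element with `name` and `created` attributes), the class name, and the sample data are all assumptions made up for illustration, not the actual metadata format. Memory use stays flat because nothing is held except the matches, but every query still reads the whole document.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical layout: <files><file name="..." created="yyyy-MM-dd"/>...</files>
class SaxDateRangeScan {

    // Streams the document once and collects the names of files created in [start, end].
    static List<String> namesBetween(InputStream xml, LocalDate start, LocalDate end) {
        List<String> matches = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if (!"file".equals(qName)) {
                    return; // ignore the root element and anything else
                }
                LocalDate created = LocalDate.parse(attrs.getValue("created"));
                if (!created.isBefore(start) && !created.isAfter(end)) {
                    matches.add(attrs.getValue("name"));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(xml, handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse metadata", e);
        }
        return matches;
    }

    // Tiny in-memory stand-in for the real multi-gigabyte metadata file.
    static InputStream sample() {
        String xml = "<files>"
                + "<file name=\"a.dat\" created=\"2007-12-30\"/>"
                + "<file name=\"b.dat\" created=\"2008-01-05\"/>"
                + "<file name=\"c.dat\" created=\"2008-01-27\"/>"
                + "<file name=\"d.dat\" created=\"2008-02-02\"/>"
                + "</files>";
        return new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        LocalDate start = LocalDate.parse("2008-01-01");
        LocalDate end = LocalDate.parse("2008-01-31");
        System.out.println(namesBetween(sample(), start, end));
    }
}
```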


Thanks
Ashish
Jeanne Boyarsky
author & internet detective
Marshal

Joined: May 26, 2003
Posts: 30752
    

Ashish,
Databases are designed for search. There are performance optimizations, such as indexes. While XML allows search, it involves reading the whole file. This is going to be slower than an index.
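A rough sketch of the difference, not a benchmark. A `TreeMap` stands in here for a database index on the creation date (an assumption for illustration only; a real RDBMS index lives on disk and is managed by the database). The class, field names, and sample data are hypothetical. The scan touches every record, while the range lookup only visits the entries inside the requested dates.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

class IndexVersusScan {

    // Hypothetical metadata record: just a file name and its creation date.
    static final class FileMeta {
        final String name;
        final LocalDate created;

        FileMeta(String name, String created) {
            this.name = name;
            this.created = LocalDate.parse(created);
        }
    }

    // Full scan: every record is inspected, much like reading a whole XML file.
    static List<String> scan(List<FileMeta> all, LocalDate start, LocalDate end) {
        List<String> out = new ArrayList<>();
        for (FileMeta f : all) {
            if (!f.created.isBefore(start) && !f.created.isAfter(end)) {
                out.add(f.name);
            }
        }
        return out;
    }

    // Sorted structure keyed on the date, built once and reused for every query.
    static TreeMap<LocalDate, List<String>> buildIndex(List<FileMeta> all) {
        TreeMap<LocalDate, List<String>> index = new TreeMap<>();
        for (FileMeta f : all) {
            index.computeIfAbsent(f.created, d -> new ArrayList<>()).add(f.name);
        }
        return index;
    }

    // Range lookup: only the keys inside [start, end] are visited.
    static List<String> lookup(TreeMap<LocalDate, List<String>> index, LocalDate start, LocalDate end) {
        List<String> out = new ArrayList<>();
        for (List<String> names : index.subMap(start, true, end, true).values()) {
            out.addAll(names);
        }
        return out;
    }

    static List<FileMeta> sample() {
        return Arrays.asList(
                new FileMeta("a.dat", "2007-12-30"),
                new FileMeta("b.dat", "2008-01-05"),
                new FileMeta("c.dat", "2008-01-27"),
                new FileMeta("d.dat", "2008-02-02"));
    }

    public static void main(String[] args) {
        LocalDate start = LocalDate.parse("2008-01-01");
        LocalDate end = LocalDate.parse("2008-01-31");
        List<FileMeta> all = sample();
        System.out.println("scan:   " + scan(all, start, end));
        System.out.println("lookup: " + lookup(buildIndex(all), start, end));
    }
}
```

Both calls return the same files; the point is how much data each one has to touch to get there.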


Paul Sturrock
Bartender

Joined: Apr 14, 2004
Posts: 10336

Originally posted by Jeanne Boyarsky:
Ashish,
Databases are designed for search. There are performance optimizations, such as indexes. While XML allows search, it involves reading the whole file. This is going to be slower than an index.


A counter argument would be that the file system plus something like Lucene would give a far quicker (and richer) search capability than an RDBMS can provide.

Replicating an XML document structure in database entities is a lot of maintenance. I'd avoid it if at all possible.

Does your data require any referential integrity or other constraints? If no, then I'd go for the file system every time.


Jeanne Boyarsky
author & internet detective
Marshal

Joined: May 26, 2003
Posts: 30752
    

Paul,
I interpreted the question differently than you. If it is a matter of leaving the data in XML, it should definitely stay on the file system.

I thought Ashish just had data and had the choice of putting it in XML or in tables.
 
 