ashish bhardwaj

Greenhorn
since Jan 27, 2008

Recent posts by ashish bhardwaj

I am looking for views on this:

We have about 10 TB (terabytes) of data, stored across multiple disks. The metadata (data describing the data, such as filename, location, author, description etc.) could run to gigabytes, say 5 GB. For a web-based application, should the metadata be stored in XML files or in a database such as Oracle or MySQL?

Since the data is going to grow in future, scalability is required. Which approach will give better performance?

A typical use case: a user wants to find data matching particular criteria, e.g. all files generated between a specified start date and end date, then extract the required data and analyse it to give statistics, generate plots etc. The results are generated at runtime, so the user should get good performance.

As the XML file will be large, we can't use DOM. But is a SAX parser scalable, and does it give good performance?
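To make the date-range lookup concrete, here is a minimal sketch of the database option. It uses Python's built-in sqlite3 module purely as a stand-in for Oracle or MySQL. The table name file_metadata, its columns, the index name, and the sample rows are all hypothetical, chosen only for illustration.

```python
import sqlite3


def build_catalog(rows):
    """Create an in-memory metadata table (hypothetical schema) and load rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE file_metadata (
            filename    TEXT NOT NULL,
            location    TEXT NOT NULL,
            author      TEXT,
            description TEXT,
            created_on  TEXT NOT NULL  -- ISO 8601 date, e.g. 2008-01-15
        )
        """
    )
    # An index on the date column lets a range query avoid scanning every row.
    conn.execute("CREATE INDEX idx_created_on ON file_metadata (created_on)")
    conn.executemany("INSERT INTO file_metadata VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def files_between(conn, start, end):
    """Return (filename, location) for files created in [start, end], inclusive."""
    cur = conn.execute(
        "SELECT filename, location FROM file_metadata "
        "WHERE created_on BETWEEN ? AND ? ORDER BY created_on",
        (start, end),
    )
    return cur.fetchall()


# Made-up sample rows, for illustration only.
sample_rows = [
    ("run_001.dat", "/disk1/run_001.dat", "ashish", "test run", "2007-12-30"),
    ("run_002.dat", "/disk2/run_002.dat", "ashish", "test run", "2008-01-15"),
    ("run_003.dat", "/disk3/run_003.dat", "ashish", "test run", "2008-02-02"),
]
conn = build_catalog(sample_rows)
print(files_between(conn, "2008-01-01", "2008-01-31"))
```

Dates are stored as ISO 8601 strings here so that plain string comparison matches chronological order; a production database would more likely use a native date type.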
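For comparison, a streaming SAX scan over the same kind of metadata might look like the sketch below, written with Python's standard xml.sax module. The element name file and the attributes name and created are assumptions about the XML layout, not a known format.

```python
import xml.sax


class DateRangeHandler(xml.sax.ContentHandler):
    """Collect the names of <file> elements whose 'created' date is in range."""

    def __init__(self, start, end):
        super().__init__()
        self.start = start
        self.end = end
        self.matches = []

    def startElement(self, name, attrs):
        # Only <file> elements carry metadata in this hypothetical layout.
        if name != "file":
            return
        created = attrs.get("created", "")
        # ISO 8601 date strings compare correctly as plain strings.
        if self.start <= created <= self.end:
            self.matches.append(attrs.get("name"))


def scan_metadata(xml_bytes, start, end):
    """Stream through the XML once and return the matching file names."""
    handler = DateRangeHandler(start, end)
    xml.sax.parseString(xml_bytes, handler)
    return handler.matches


# Made-up sample document, for illustration only.
sample_xml = b"""<catalog>
  <file name="run_001.dat" created="2007-12-30"/>
  <file name="run_002.dat" created="2008-01-15"/>
  <file name="run_003.dat" created="2008-02-02"/>
</catalog>"""
print(scan_metadata(sample_xml, "2008-01-01", "2008-01-31"))
```

SAX keeps memory use roughly constant regardless of file size, but every query still has to read the whole document from start to finish, whereas an indexed database column can answer a range query without a full scan.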


Thanks
Ashish
Hi Paul,
A typical use case: a user wants to find data matching particular criteria, e.g. all files generated between a specified start date and end date, then extract the required data and analyse it to give statistics, generate plots etc.

Will the database approach give good performance?
As the XML file will be large, we can't use DOM. But is a SAX parser scalable, and does it give good performance?



Thanks
Ashish