The moose likes OO, Patterns, UML and Refactoring and the fly likes Refactoring Very Large DB's Big Moose Saloon
  Search | Java FAQ | Recent Topics
Register / Login


Win a copy of Practical Unit Testing with TestNG and Mockito this week in the Testing forum!
JavaRanch » Java Forums » Engineering » OO, Patterns, UML and Refactoring
Reply Bookmark "Refactoring Very Large DB Watch "Refactoring Very Large DB New topic
Author

Refactoring Very Large DB's

Don Stadler
Ranch Hand

Joined: Feb 10, 2004
Posts: 451
Gentlemen,

I worked on a project which had a single table expected to go to 14 terabytes at a rate of 2 terabytes a year. Are there any design principals which would be useful in avoiding creeping death in this case?
[ July 25, 2006: Message edited by: Ilja Preuss ]
Ken Latta
Greenhorn

Joined: Nov 13, 2004
Posts: 2
Don, is that table just a container for large objects? Or are there just oodles of records being added?
Don Stadler
Ranch Hand

Joined: Feb 10, 2004
Posts: 451
Oodles of records, Ken. Several hundred thousand a day & all have to be kept by contract. We partitioned the table based on whether the record was active or not.
Scott Ambler
author
Ranch Hand

Joined: Dec 12, 2003
Posts: 608
Yes, you probably want to partition this table, otherwise you'll run into performance challenges. Is this just a log? Can you organize into "active right now", "maybe active at some point", and then a bunch of "we'll likely never need these again" tables?

- Scott


<a href="http://www-306.ibm.com/software/rational/bios/ambler.html" target="_blank" rel="nofollow">Scott W. Ambler</a><br />Practice Leader Agile Development, IBM Rational<br /> <br />Now available: <a href="http://www.ambysoft.com/books/refactoringDatabases.html" target="_blank" rel="nofollow">Refactoring Databases: Evolutionary Database Design</a>
Don Stadler
Ranch Hand

Joined: Feb 10, 2004
Posts: 451
Nope it wasn't a log, Scott, but a working table. I wanted to archive the old records but the requirements were developed at a much higher level (they should have shot the architect who put that on). So we partitioned into active and inactive records and twelve streams to boot (with twelve thread parallel processing) to do the batch job. I noticed that they weren't using bind variables and threw those in which raised performance about 30%. This was an Oracle DB - not sure whether that's a Db pattern or an Oracle-specific pattern.

I asked the question to see whether we missed any tricks?
 
 
subject: Refactoring Very Large DB's
 
Threads others viewed
General Database Question
Image insertion into database through JPA
Large Map / HashMap
XML vs Database
XML vs Database
IntelliJ Java IDE