Win a copy of Rust Web Development this week in the Other Languages forum!
  • Post Reply Bookmark Topic Watch Topic
  • New Topic
programming forums Java Mobile Certification Databases Caching Books Engineering Micro Controllers OS Languages Paradigms IDEs Build Tools Frameworks Application Servers Open Source This Site Careers Other Pie Elite all forums
this forum made possible by our volunteer staff, including ...
  • Tim Cooke
  • Campbell Ritchie
  • Ron McLeod
  • Liutauras Vilda
  • Jeanne Boyarsky
  • Junilu Lacar
  • Rob Spoor
  • Paul Clapham
Saloon Keepers:
  • Tim Holloway
  • Tim Moores
  • Jesse Silverman
  • Stephan van Hulst
  • Carey Brown
  • Al Hobbs
  • Piet Souris
  • Frits Walraven

Partitioning or new schema

Posts: 15
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
I have a schema inside my database with big tables (the schema is composed by 20 tables and each of them contains 500 million rows equivalent about  40GB ). the tables are quite simple, they contain 3 fields: id,value,datetime.
The queries (normally are based on the search in a certain time range one or more parameters values) now are very slow and some improvement is needed, so for the structure of the tables I was thinking about horizontal partitioning by range using the datetime field as element or split the schema in different pieces each of them including data for a certain period of time.
The second solution will create some problems in the software using the database because according to the range time selected by the user, the query must be addressed to a specific DB, containing that period, so if the performances are similar I will prefer the fist one.
What do you think about the performances of the two solutions, is there any other (third) applicable solution?    
Saloon Keeper
Posts: 24812
Android Eclipse IDE Tomcat Server Redhat Java Linux
  • Mark post as helpful
  • send pies
    Number of slices to send:
    Optional 'thank-you' note:
  • Quote
  • Report post to moderator
Usually you'd partition based on the ID, but if you search more often on time, perhaps that might work. But before you do that, have you tried indexing on the datetime field?

Another reason for partitioning large data based on its timestamp would be if you were doing the majority of your queries over a given time range. In that case, partitioning might make it easier to roll the data offline, or for DBMS's that support it, archival mode (where the old stuff might be run out to tape and restored on-demand).

Certainly it's worth considering whether or not to manually partition by storing data for a separate month or year in its own table, too.
You showed up just in time for the waffles! And this tiny ad:
Building a Better World in your Backyard by Paul Wheaton and Shawn Klassen-Koop
    Bookmark Topic Watch Topic
  • New Topic