GeeCON Prague 2014



SQL databases

colin shuker
Ranch Hand

Joined: Apr 11, 2005
Posts: 744
Hi, I'm interested to know about the performance of SQL tables.

I have a table with 2 BIGINT columns, and a query of the form "select col2 from table where col1=someValue".
There will be 1 row satisfying the query, or no rows at all.

I currently have 6,800,000 rows of data spread evenly over 10 tables containing 680,000 rows each.

The time for 1 query is 0.1 seconds.

My question is: would it be faster or slower to use one large table, or several smaller tables?

I could test it out, but handling such a large quantity of data is time-consuming.

Thanks for any input.
Martin Vajsar
Sheriff

Joined: Aug 22, 2010
Posts: 3610

Most probably the time will be nearly the same.

Your query is almost certainly getting the row via an index, so the total size of the data is largely irrelevant: index lookup time is practically constant. With really small tables your index might (just might) have a lower height, saving perhaps one logical I/O operation, but that would probably not offset the additional overhead of managing 10 tables and partitioning the data across them.

That is assuming you know which table contains the value you're looking up. If it could be any of the ten tables, you'd of course have to probe every one, and the resulting ten queries would probably take about ten times as long as a single query against one big table.

What could speed things up a bit is indexing both of the columns. In some databases the optimizer will figure out that it can answer the query directly from the index and will not have to read the table at all, saving one logical I/O for every query. And some databases (e.g. Oracle) allow you to create an index-organized table, so that you can get rid of the unneeded underlying table altogether, saving space and update time.
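A minimal sketch of the covering-index idea in MySQL syntax (the table name `mytable` is hypothetical; the column names follow the example above):

```sql
-- Composite index covering both columns: the lookup on col1 and the
-- fetch of col2 can both be satisfied from the index alone, so the
-- table rows are never touched.
CREATE INDEX idx_col1_col2 ON mytable (col1, col2);

-- Oracle's index-organized variant goes one step further and stores
-- the whole table inside the primary-key index, eliminating the
-- separate heap segment entirely:
-- CREATE TABLE mytable (col1 NUMBER PRIMARY KEY, col2 NUMBER)
--   ORGANIZATION INDEX;
```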
colin shuker
Ranch Hand

Joined: Apr 11, 2005
Posts: 744
Thanks,

I basically grouped the BIGINTs from the first column into ranges, more or less dividing the range -2^63 to 2^63-1 into 10 parts,
so I only have to query one table.

I'm not sure what you mean about indexing; do you think I could make it any faster than it is now?
Paul Sturrock
Bartender

Joined: Apr 14, 2004
Posts: 10336

Are you implementing your own table-partitioning solution? Are you aware that many databases will do this out of the box for you?
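For illustration, MySQL's built-in range partitioning can replace a hand-rolled multi-table scheme; a sketch with hypothetical names (the boundary values are placeholders):

```sql
-- MySQL routes each row to the correct partition automatically, and a
-- query on col1 is pruned to a single partition, so the application no
-- longer needs to pick a table itself.
CREATE TABLE mytable (
    col1 BIGINT NOT NULL,
    col2 BIGINT NOT NULL,
    PRIMARY KEY (col1)
)
PARTITION BY RANGE (col1) (
    PARTITION p0 VALUES LESS THAN (-1000000000),
    PARTITION p1 VALUES LESS THAN (0),
    PARTITION p2 VALUES LESS THAN (1000000000),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);
```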


I'm not sure what you mean about indexing, do you think I could make it any faster than it is now?

There is one sure way to know: run an explain plan on your query and the database will tell you. There is no 100% guaranteed way of knowing up front whether your table structure will produce fast queries; it very much depends on the type of queries (and other processing) you intend to run and on the data itself. You need to examine the query logic you would like to run against real data and tune accordingly; anything else could make it slower.
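In MySQL the plan is checked like this (a sketch; `mytable` and the lookup value are hypothetical):

```sql
-- "type: ref" or "type: const" together with a non-NULL "key" column
-- means an index lookup; "type: ALL" means a full table scan.
EXPLAIN SELECT col2 FROM mytable WHERE col1 = 123456789;
```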


Martin Vajsar
Sheriff

Joined: Aug 22, 2010
Posts: 3610

colin shuker wrote:I'm not sure what you mean about indexing; do you think I could make it any faster than it is now?

I could help you with this only if you are on Oracle. I don't know other databases well enough to offer more specific help.

If you use an Oracle database, post the DDL statements you used to create the table(s) and any indexes on them, and the exact text of the query you use. If I see room for improvement, I'll let you know.
Sunny Bhandari
Ranch Hand

Joined: Dec 06, 2010
Posts: 448

I am not sure if you have used bind variables for your queries.
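Bind variables keep the statement text constant so the database can reuse the parsed plan instead of re-parsing a new literal every time. In MySQL's own syntax they look like this (a sketch; `mytable` and the value are hypothetical):

```sql
-- The '?' placeholder is bound at execution time, so the same prepared
-- statement is reused for every lookup value.
PREPARE lookup FROM 'SELECT col2 FROM mytable WHERE col1 = ?';
SET @v = 123456789;
EXECUTE lookup USING @v;
DEALLOCATE PREPARE lookup;
```

From JDBC the same thing is achieved with a PreparedStatement and setLong rather than concatenating the value into the SQL string.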


colin shuker
Ranch Hand

Joined: Apr 11, 2005
Posts: 744
Thanks for the replies,

I'm using MySQL, and I'm getting a time of 0.1 seconds for each SELECT query, which is pretty decent: I was previously using VARCHAR in my table, and the query was taking 3.5 seconds.
Martin Vajsar
Sheriff

Joined: Aug 22, 2010
Posts: 3610

colin shuker wrote:I'm using MySQL, and I'm getting a time of 0.1 seconds for each SELECT query, which is pretty decent: I was previously using VARCHAR in my table, and the query was taking 3.5 seconds.

ALWAYS use correct data types. If you use incorrect types, an implicit conversion will probably take place somewhere. Then all of a sudden your code might stop working when the locale changes, because e.g. numbers or dates will be formatted differently and won't match the data in the database any longer. Moreover, with wrong data types the database might create a suboptimal plan (e.g. when a number is used for dates) or might be unable to sort the data properly (e.g. when a VARCHAR is used for numbers).

In this particular case, I'd guess that using VARCHAR prevented your application from using an index (or you didn't have an index in place). The table was full-scanned (the entire contents of the table were read) for every execution of your query. Your current solution very probably uses an index, vastly reducing the amount of work the database has to do to answer your query. Hence the speedup.
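The VARCHAR-versus-BIGINT effect shows up directly in the plan; a sketch with hypothetical table names (exact behaviour varies by MySQL version):

```sql
-- col1 declared VARCHAR, compared against a numeric literal: MySQL
-- must cast each stored value to a number before comparing, so an
-- index on col1 cannot be used and the table is scanned.
EXPLAIN SELECT col2 FROM old_table WHERE col1 = 123456789;

-- col1 declared BIGINT: the literal is converted once and the index
-- lookup proceeds normally.
EXPLAIN SELECT col2 FROM new_table WHERE col1 = 123456789;
```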
colin shuker
Ranch Hand

Joined: Apr 11, 2005
Posts: 744
I've increased the number of records to 10,000,000 and spread them over 64 tables.

The time for a query is now 0.02 seconds (that's what phpMyAdmin says, anyway); I guess this includes the time to find the table named in the SELECT query.

My application does seem a little snappier now anyway, so it looks good.
 