Hi fellas,
I have been having trouble finding quality information on the topic at hand, and I hope this
thread will help everyone who reads and responds to it.
I have a database which contains a fairly large set of financial data and initially I structured the database with organization principles in mind. However, I am slowly realizing that the current structure is inefficient for the task at hand and I'd like to restructure the database with optimal
data extraction as the primary goal, since the data extraction process is currently the bottleneck.
FYI, I'm using MySQL &
JDBC.
Here are three general decisions one might consider when designing a database.
1. Single table with x columns VS. n tables with x/n columns. Comments:
-Extracting data from a single table seems much more efficient than extracting data from multiple tables since mult. tables require
join operations. It has been my experience that joining tables which have a considerable row count is extremely computationally intensive on MySQL.
-However, 800 columns in a single table seems a bit much and at some point I'd think you'd want to disperse the column count into multiple tables.
Questions: Is it more practical and efficient to place all data into one table with x columns or to place all data into multiple tables with x/n columns?
2. Single table with x rows VS. n identical (columns) tables with x/n rows. Comments:
-Similar to above, if you wish to extract all rows, placing as many rows as possible (row limit) in a single table seems more efficient since it would require fewer queries (fewer connection requests and other preparations).
-Another limit seems to be how much memory can you allocated to the extracted data. Regardless, perhaps it's best not to strain MySQL so much.
-For example: Is it adequate to query a table with 1 million rows in a single table? Probably not.
Question: Is it best to store as many rows into one table as possible if the plan is to query data from all rows, or is it best to disperse the rows into multiple tables and query each of these tables separately?
3. Altering MySQL configurations and optimizing JDBC code. Comments:
-Although I have spent little time researching how to modify MySQL configurations to perform certain operations more efficiently, it seems doing so wouldn't produce a drastic speed performance. Am I wrong here and/or are there any good reference anyone would recommend?
-Similarly should and could you optimize you JDBC commands to more efficiently perform extraction operations on a mass set of data? Obviously, we can use prepared statements and things of that nature to speed the process up, but are there certain JDBC commands you can give to MySQL to perform a query in a more efficient manner? Are there references anyone could point me to?
*****
It would be great if there are exact answers to my questions but I doubt this is the case. If anyone could give me some type of guidance however, I'd be much obliged. For example, someone might say, "Generally, in my experience it's best to do this...". Also, any references to related topics would be extremely helpful.
Best Regards,
-Tyler