File APIs for Java Developers
Manipulate DOC, XLS, PPT, PDF and many others from your application.
http://aspose.com/file-tools
The moose likes JDBC and the fly likes SQL -SELECT on multiple tables overhead Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of Java 8 in Action this week in the Java 8 forum!
JavaRanch » Java Forums » Databases » JDBC
Bookmark "SQL -SELECT on multiple tables overhead" Watch "SQL -SELECT on multiple tables overhead" New topic
Author

SQL -SELECT on multiple tables overhead

Jonas Vikis
Greenhorn

Joined: Feb 17, 2006
Posts: 4
Hi everyone,

I got a little problem with some joins I�m trying to do in MySQL 5.0.
The setup is as follows:
3 tables - parent_table(pk, data), child_table1(fk_parent, pk, data), child_table2(fk_parent, pk, data).

What do I want to accomplish? Well in my java-code I have a TransferObject
that reflects this structure - Parent, Child1, Child2. Parent has besides it data variable the two child objects.

The easy way is to do a LEFT JOIN i.e:
SELECT p.*, c1.pk, c1.data, c2.pk, c2.data
FROM parent_table p
LEFT JOIN child_table1 c1
ON p.pk = c1.fk_parent
LEFT JOIN child_table2 c2
ON p.pk = c2.fk_parent

In a sense this works, but the resultset overhead is huge because of the duplicate rows that is generated when there are rows in the child tables pointing to a single pk in the parent_table.

Is there any way around it that I�m missing? My options for the moment is to do multiple queries in the java-code and write the joinlogic in java instead.


Any sugestions are welcome - (except 'switch to Hibernate!' )
Cheers!

/Jonas
Jeff Albertson
Ranch Hand

Joined: Sep 16, 2005
Posts: 1780
Shouldn't your domain object Parent has a collection of Child1's and another collection of Child2's ?

[ February 17, 2006: Message edited by: Jeff Albertson ]

There is no emoticon for what I am feeling!
Bruno Boehr
Greenhorn

Joined: Feb 15, 2006
Posts: 17
Well, the thing is, for every entry in the parent table you get a Cartesian product of all c1 and c2 records associated with it. Here is an example. Suppose we have a parent record p1 that has dependent entries:

in child1:

10
20

in child2

a
b
c

Now, given your tuple format (p c1 c2), how would you expect the result set rows to appear? Do you think it will be:

p1 10 a
p1 20 b
p1 - c

No, this is not what the semantics of your query is. The semantics is to produce:

p1 10 a
p1 10 b
p1 10 c
p1 20 a
p1 20 b
p1 20 c

I understand what you are trying to achieve: initialize all parent objects with their children in one sql query. Well, given your table relationships this might not be the ideal solution. Some of the options you may want to consider would include:

- Process the parent table first instantiating parent objects; then process the bulk of child1, for each of them look up in memory their respective parent and tie them together, then do the same for child2. This will yield three queries altogether.

- Go through records in the parent table; for each parent retrieve its dependent children in child1, then in child2. This will result in (2*N + 1) queries (where N is the number of parent records).

- Same as above, but defer children instantiation till they are actually requested (lazy loading).

These are the options I can think of given your descriptions, but maybe better solutions exist.


<a href="http://webjavenue.com/" target="_blank" rel="nofollow">Your first website in Java: easier than you think</a>
Jeff Albertson
Ranch Hand

Joined: Sep 16, 2005
Posts: 1780
The SQL in the original post was doing a proper join, not a cartesian product!

But I think the error was in turning every row in the select into one separate object, instead of creating parent and child objects and associating the appropriately.
Bruno Boehr
Greenhorn

Joined: Feb 15, 2006
Posts: 17
> The SQL in the original post was doing a proper join, not a cartesian product!

It is a proper join to the effect of connecting c1 and c2 to the parent. Yet, given two mutually independent sets of records in c1 and c2 (within a particular parent realm), we still get a Cartesian of those sets.
Jonas Vikis
Greenhorn

Joined: Feb 17, 2006
Posts: 4
In answer to Jeff:

You�re quite right, I use collections exactly as you stated/suggested, but since my english is a bit rusty, I didn�t use the proper terms

As for the question in hand, maybe I should give you some more details to why the query gives me headaches...
I�m trying to use a special pattern in the business tier where I cache the ResultSet in an objectarray(which I pass on to the calling class), thus letting me close the JDBC connection much faster. Then I�m using a TransferObjectAssembler to set the data in TOs upon request from the MVC-framework since it's all a part of a rather complex search, and the results could be fairly large.

So after some brainstorming, I come up with two alternatives. The one stated or do as suggested - first call the parent(s) then call the the childrens and through java-logic bring them together.
But then I couldn�t really use the design since that would leave me with a lot of cached ResultSets instead, or leaving my connector class to do the assembling.

But as I�m only human, I might have missed something obvious...

Any suggestions?

Cheers
/Jonas
Jonas Vikis
Greenhorn

Joined: Feb 17, 2006
Posts: 4
In answer to Bruno:

Yup, I�m aware of the cartesian result. That�s what I�m trying to get away from.

Lazy loading would be the most easiest way to get around it, but it�s not possible, because of the fact that the design states that if(children > 0)
then the result must be joined in the view. In other words the collections of childrens are shown together with their respective parents.
This wouldn�t be such a hassle if it weren�t for the fact that a regular client query would generate a result of multiple parents shown together...

A typical query would generate 0 to 10 parents, with 0 to 50 child1 and 0 to 5 child2.
If you do the math for the extreme case - that�s either a lot of queries or a somewhat large ResultSet in comparison to the actual size.


Cheers
/Jonas
Jonas Vikis
Greenhorn

Joined: Feb 17, 2006
Posts: 4
Answering my own post... Again.

Anyway, thanks for your answers, they helped me in the right direction.

I managed to come up with a solution which is much like the suggested one, calling them separately (sort of). With some help with my trusted StringBuffer, amongst other, i succeeded to cut queries down to exactly 3 for this situation. Thus leaving the assembling part to java-code, unfortunately not using my assembler or cache

So if anyone is interested in the stub-code I can post it here for you to laugh at, criticize or whatever.

Cheers
/Jonas
[ February 20, 2006: Message edited by: Jonas Vikis ]
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: SQL -SELECT on multiple tables overhead
 
Similar Threads
Saving Parent having multiple Child row refering Multiple parents gives Foreign Key constrain
recursive retrieval
How to create a new object with given superclass reference?
why Finalize() method is protected..?
Best way to join a parent with many child tables yet preserving all parent table rows