File APIs for Java Developers
Manipulate DOC, XLS, PPT, PDF and many others from your application.
http://aspose.com/file-tools
The moose likes OO, Patterns, UML and Refactoring and the fly likes DAO pattern with link tables - looking for canonical way forward Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of Soft Skills this week in the Jobs Discussion forum!
JavaRanch » Java Forums » Engineering » OO, Patterns, UML and Refactoring
Bookmark "DAO pattern with link tables - looking for canonical way forward" Watch "DAO pattern with link tables - looking for canonical way forward" New topic
Author

DAO pattern with link tables - looking for canonical way forward

Edward Winchester
Ranch Hand

Joined: Sep 16, 2009
Posts: 65
Hello,

My first post here. A lot of Google searches I've been doing on various Java topics seem to return results here so thought I might as well join!

I'm writing a Java web application. I'm using Spring Web MVC and employing (or at least trying to employ) the DAO pattern to some degree. I am using JDBC calls within the DAOs and don't really want to take on extra knowledge/libraries for persistence at the moment.

I have a domain object called Person. A person can have zero or more roles. They could also have multiple email addresses, and multiple other things, but let's just stick with the roles for simplicity of explaining my "problem".

I have a PersonDAO like follows:



This is great - there's a separation between the domain object and how it's persisted.

In the database, the information is split across more than one table. The PERSON table holds the non-repeating stuff (id, username, encrypted password, DOB, first name, second name, etc.). There is a PERSON_ROLE table that holds the roles for the person.

Now I need to create a concrete JDBC-backed implementation of the PersonDAO interface: PersonDAOImpl. This is as far as I've got:



It is unfeasible for the findAll() method to fully construct Person objects because it'll need to do at least as many queries on the PERSON_ROLE table as there are records in the PERSON table, so what should I do? I've been thinking of a few ways forward, but wondered if people could point me towards the canonical way of dealing with this?

Thanks,

PUK
Sean Clark
Rancher

Joined: Jul 15, 2009
Posts: 377

Hey,

You can do an sql query that will join the 2 tables and this will allow you to get all the information out in 1 SQL statement so that you can fill your objects.

w3schools has a good tutorial on this: http://www.w3schools.com/Sql/sql_join.asp

Sean


I love this place!
Jimmy Clark
Ranch Hand

Joined: Apr 16, 2008
Posts: 2187
The better method is to use a ORM framework such as Hibernate and avoid "hand-written" SQL statements.
Edward Winchester
Ranch Hand

Joined: Sep 16, 2009
Posts: 65
Sean Clark wrote:Hey,

You can do an sql query that will join the 2 tables and this will allow you to get all the information out in 1 SQL statement so that you can fill your objects.

w3schools has a good tutorial on this: http://www.w3schools.com/Sql/sql_join.asp

Sean


Thanks for the reply. Something for me to bear in mind, but I would worry that would get unscalable: especially when a person is linked to other things too. There'd be a lot of redundant data and nulls in the result set.

Been thinking about it and the function findAll() is probably where the problem lies. I think the cleanest solution would be to have a domain object like BasicPerson (for want of a better name) to just contain some pertinent info about the person. Then I can have either a BasicPersonDAO which can provide the findAll(), or have a findAllBasic() in the PersonDAO.

Thoughts welcome...

PUK
Edward Winchester
Ranch Hand

Joined: Sep 16, 2009
Posts: 65
James Clark wrote:The better method is to use a ORM framework such as Hibernate and avoid "hand-written" SQL statements.

This may sound ridiculous but I welcome comments on it: One of the reasons I've been avoiding an ORM framework is because I've spent so much time learning other stuff already that I'm not actually getting anywhere with the project. Are ORM frameworks designed to help with situations I'm talking about? For instance, the ability to return thousands of Person objects in just one SQL query and when attributes of that person object are requested that aren't in the main table, then to do the "sub" queries? If that's a fundamental aspect of them, then I might be interested in looking further!

If their main benefit is to avoid having hand-written SQL statements, then I'm not that interested right now. For the moment I'm happy to hand-crank the SQL. The reason is my brain probably can't absorb much more until I get some of the app written!

Thanks,

PUK.
Edward Winchester
Ranch Hand

Joined: Sep 16, 2009
Posts: 65
P Uk wrote:
James Clark wrote:The better method is to use a ORM framework such as Hibernate and avoid "hand-written" SQL statements.

This may sound ridiculous but I welcome comments on it: One of the reasons I've been avoiding an ORM framework is because I've spent so much time learning other stuff already that I'm not actually getting anywhere with the project. Are ORM frameworks designed to help with situations I'm talking about? For instance, the ability to return thousands of Person objects in just one SQL query and when attributes of that person object are requested that aren't in the main table, then to do the "sub" queries? If that's a fundamental aspect of them, then I might be interested in looking further!


Answering my own question here... Looks like Hibernate 3 supports this "lazy loading" by default rather than the eager loading. Source: http://en.wikipedia.org/wiki/Hibernate_(Java)#Persistence. Can't picture at the moment how this'll work with my Person POJO at the domain level... I'll check it out further...

Thanks,

PUK
Sean Clark
Rancher

Joined: Jul 15, 2009
Posts: 377

Something for me to bear in mind, but I would worry that would get unscalable: especially when a person is linked to other things too. There'd be a lot of redundant data and nulls in the result set.


While I agree that this is quite unscalable, I'm not quite so sure about the redundent data you are talking about, if you are adding it to objects anyway.

And I certainly wouldn't advise splitting the data up and having a BasicPerson object, to me that just makes no sense.

Personally if you are doing it this way I'd say to use joins.

Are ORM frameworks designed to help with situations I'm talking about?

ORMs are most helpful in this situation, in hibernate for example you create XML files that map a Java class to a database table, you can also define joins between the tables.
Hibernate also uses joins to get the data but it can also be configured to do subselects e.g. if you called user.getRole() it would hit the database get the role and return. Hibernate also does loads of other things but someone else can probably describe that better!

Sean
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

There are a lot of ways to handle getting joined data.

Even with Hibernate it's often necessary to solve the same problem if the queries are very expensive or you're retrieving a lot of data.

One way to handle it is to create a mechanism for determining which extra data you need. It's possible to do something pseudo-coded like this:This build up the query and executes it, returning the result. *How* to implement this depends a lot on the underlying framework and classes, but you get the idea. It avoids a lot of methods like findByIdWithRolesAndEmails() etc., which quickly grows unmanageable.
Hong Anderson
Ranch Hand

Joined: Jul 05, 2005
Posts: 1936
P Uk wrote:
Sean Clark wrote:Hey,

You can do an sql query that will join the 2 tables and this will allow you to get all the information out in 1 SQL statement so that you can fill your objects.

w3schools has a good tutorial on this: http://www.w3schools.com/Sql/sql_join.asp

Sean


Thanks for the reply. Something for me to bear in mind, but I would worry that would get unscalable: especially when a person is linked to other things too. There'd be a lot of redundant data and nulls in the result set.

I recommend to just use SQL JOIN.

Regarding scalability, it depends on the number of person. If there are a lot of person, you should not have findAll(), but should use a pagination method like findAll(int limit, int offset). I don't know what you mean by redundant data and nulls.


SCJA 1.0, SCJP 1.4, SCWCD 1.4, SCBCD 1.3, SCJP 5.0, SCEA 5, SCBCD 5; OCUP - Fundamental, Intermediate and Advanced; IBM Certified Solution Designer - OOAD, vUML 2; SpringSource Certified Spring Professional
Edward Winchester
Ranch Hand

Joined: Sep 16, 2009
Posts: 65
Kengkaj Sathianpantarit wrote:
I recommend to just use SQL JOIN.

Regarding scalability, it depends on the number of person. If there are a lot of person, you should not have findAll(), but should use a pagination method like findAll(int limit, int offset). I don't know what you mean by redundant data and nulls.


Appreciate the reply. I'll hopefully show what I mean by redundant data and nulls:

Imagine a PERSON table with id, username, encrypted password, first name, second name, dob, etc.
Imagine a PERSON_ROLE table with person_id, role_ref_id. N.B. role_ref_id links into the role_ref table not shown here.

findAll() would be getting all the data back (which is all the persons and their roles) in one SQL query.

If we have 3 entries in PERSON and each person has 2 roles, then this is the result set:



The repitition of most of the data apart from the change in ROLE_CODE is undesirable. Consider if a person has multiple email addresses, and multiple other things. Imagine what the result set would look like then. If a person has more roles than they have email addresses, then nulls will take the place of the slots where the email addresses aren't (probably depending upon how the query is formed).

PUK
Jimmy Clark
Ranch Hand

Joined: Apr 16, 2008
Posts: 2187
This relational model looks sloppy ROLE_CODE should not be in this table. It should be in another table with a link to PERSON.ID column.

Your "repetition" of data stems from a sloppy relational design. If you improve this a little, then you eliminate the redundant data and
will be able to see better SQL statements in the future.
Edward Winchester
Ranch Hand

Joined: Sep 16, 2009
Posts: 65
James Clark wrote:This relational model looks sloppy ROLE_CODE should not be in this table. It should be in another table with a link to PERSON.ID column.

Your "repetition" of data stems from a sloppy relational design. If you improve this a little, then you eliminate the redundant data and
will be able to see better SQL statements in the future.

The data presented was the result set from a SQL JOIN query. ROLE_CODE is in another table with a link to PERSON.ID.
Jimmy Clark
Ranch Hand

Joined: Apr 16, 2008
Posts: 2187
I see.

It is difficult to associate a realistic business requirementt that would ask for "all the persons and their roles." Sometimes repeated data is unavoidable however. Either way, you would not be sending the data returned from the SQL statement directly to the user. The application should ignore repeated data.
Jelle Klap
Bartender

Joined: Mar 10, 2008
Posts: 1836
    
    7

If you want to avoid having to learn yet another framework (ORM solution), and stick with plain JDBC instead, I'd suggest you take a look at the Interpreter pattern, like David Newton suggested. In addition to giving you more control over what is retrieved from the database and what is not, by dynamically building up the SQL join query, it will also keep the DAO interface clear and concise. It may not be as ideal a solution as the lazy loading approach offered by most ORM frameworks, but if you need to move forward now, using technologies that you are already familiar with, it might be a good fit.


Build a man a fire, and he'll be warm for a day. Set a man on fire, and he'll be warm for the rest of his life.
Edward Winchester
Ranch Hand

Joined: Sep 16, 2009
Posts: 65
James Clark wrote:
It is difficult to associate a realistic business requirementt that would ask for "all the persons and their roles."

I agree.

One reason why I'm considering another domain object like BasicPerson. Getting all the people, or getting all the people who are administrators, or all the people who are older than a particular age could be valid business requirements. In some ways it seems a bit of a fudge because people who work in that domain wouldn't see the distinction between a BasicPerson and a Person. It looks like an implementation detail (using a relational database) has bubbled up to the domain level.

PUK.
Edward Winchester
Ranch Hand

Joined: Sep 16, 2009
Posts: 65
Jelle Klap wrote:If you want to avoid having to learn yet another framework (ORM solution), and stick with plain JDBC instead, I'd suggest you take a look at the Interpreter pattern, like David Newton suggested. In addition to giving you more control over what is retrieved from the database and what is not, by dynamically building up the SQL join query, it will also keep the DAO interface clear and concise. It may not be as ideal a solution as the lazy loading approach offered by most ORM frameworks, but if you need to move forward now, using technologies that you are already familiar with, it might be a good fit.

Thanks Jelle (and Dave). I may look into the pattern.

I'm also doing a bit of investigation into ORM frameworks. It appears to a newbie like me that Sun/Java has caught up and made persistence a lot more straightforward than it was before. Java Persistence API with EJB3?? Sun says* that a lot of what made Hibernate stand out from previous Sun persistence solutions has now been incorporated into the JPA. Is there much point in looking at anything apart from JPA/EJB3?

PUK

* source: http://java.sun.com/javaee/overview/faq/persistence.jsp
David Newton
Author
Rancher

Joined: Sep 29, 2008
Posts: 12617

Hibernate has more functionality than JPA. EJB3 is a different issue altogether--and while it's significantly better than previous versions, if you don't need it, why learn it?
Jelle Klap
Bartender

Joined: Mar 10, 2008
Posts: 1836
    
    7

Well, you should view JPA as a specification. Adopting the reference implementation to that specification, Orcalce Toplink - now EclipseLink, would be one way to go. That particular implementation, being the reference implementation for JPA 1.0, will be fully compatible and in fact expand upon the capabilities of the JPA specification. However, the RI is not the only implementation of the JPA specification. Hibernate is another JPA 'provider' that is fully compliant with the JPA specification, but also expands upon its capabilities. As long as you don't need to move beyond the limitations of the JPA specification, it shouldn't really matter which provider you pick - technically. However, even then you'll not avoid having to read up on some of the details of how to use that specific implementation. Of course, understanding the basic principles of O/R mapping wouldn't hurt either. Long story short, you can't view JPA as deseperate from ORM and ORM frameworks (JPA providers) like Hibernate.
Hong Anderson
Ranch Hand

Joined: Jul 05, 2005
Posts: 1936
Well, I think you should define what is the problem first. If you think redundant data is the problem, ORM doesn't help anything.
Data will be redundant anyway for one-to-many relations because ORM also uses SQL JOIN (in case you want data from different tables). ORM helps to map relational data to objects, it does what you're doing manually.
Edward Winchester
Ranch Hand

Joined: Sep 16, 2009
Posts: 65
David Newton wrote:Hibernate has more functionality than JPA. EJB3 is a different issue altogether--and while it's significantly better than previous versions, if you don't need it, why learn it?

OK, thanks for clarification. I'll ignore EJB, JPA, Hibernate, etc. for now and review them after I've done this project I'm on. Then I'll have a good handle of what was difficult and what I would like to hand-off to frameworks/libraries in future projects.
Edward Winchester
Ranch Hand

Joined: Sep 16, 2009
Posts: 65
Jelle Klap wrote:Well, you should view JPA as a specification.
<snip>

Thank you. Part of the challenge for a newbie (not new to programming, but new to Java EE stuff) is figuring out what is interface/specification, what is implementation, where all these technologies sit with each other, how they complement and compete. Seems there's always a library/framework offering a better way of doing things than the current method! Luckily I like acronyms

As I said to David, I'll look at JPA, etc later on...



Edward Winchester
Ranch Hand

Joined: Sep 16, 2009
Posts: 65
Kengkaj Sathianpantarit wrote:Well, I think you should define what is the problem first. If you think redundant data is the problem, ORM doesn't help anything.
Data will be redundant anyway for one-to-many relations because ORM also uses SQL JOIN (in case you want data from different tables). ORM helps to map relational data to objects, it does what you're doing manually.

Thank you for the advice. You're right about defining the problem.

This thread has helped me to find a way forward and have confidence in that path.

Hong Anderson
Ranch Hand

Joined: Jul 05, 2005
Posts: 1936
I recommend to have a look at iBATIS Data Mapper Framework, it's no ORM. Using iBATIS you still write SQL, but write in XML and iBATIS will map result set to objects for you.
It's a simple yet powerful framework, far more simpler than ORM solutions. You just need to know XML, SQL, and Java objects which are the things that every Java developer should know.
Ben Narendren
Greenhorn

Joined: Oct 01, 2009
Posts: 19
Hibernate is a very good bet for your situation. The simple answer to your question is to apply the Data Mapper pattern (refer Patterns of EAA by Fowler). But instead of reinventing the wheel, you might as well use Hibernate which has a very good implementation of Data Mapper. Hibernate supports

1) Lazy Loading
2) You can specify which objects you want to eager fetch
3) If these objects have a child hierarchy you can setup the depth to which you want to fetch
4) In addition, Hibernate gives you a lot of out of the box functionality which you will need implement a DAO pattern effectively
Edward Winchester
Ranch Hand

Joined: Sep 16, 2009
Posts: 65
Ben Narendren wrote:Hibernate is a very good bet for your situation. The simple answer to your question is to apply the Data Mapper pattern (refer Patterns of EAA by Fowler). But instead of reinventing the wheel, you might as well use Hibernate which has a very good implementation of Data Mapper. Hibernate supports

1) Lazy Loading
2) You can specify which objects you want to eager fetch
3) If these objects have a child hierarchy you can setup the depth to which you want to fetch
4) In addition, Hibernate gives you a lot of out of the box functionality which you will need implement a DAO pattern effectively


Thanks Ben. For this project I've proceeded with JDBC backed DAO classes and introduced a BasicPerson object for when I want to retrieve a basic object without any of the joined data. This situation is working well for me right now.

When I finish I'll have definitely look at Hibernate and iBATIS, etc. to see if they can help on subsequent projects. I'll feel more comfortable then with all the other new concepts I'm currently working on so there'll be less "new stuff" to learn which is definitely a good thing!

Ed

 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: DAO pattern with link tables - looking for canonical way forward