I have a lot of questions on designing and implementing many-to-many relationships. It seems logical to me that there would be some book I could buy that would cover these issues but I have not found one yet. Can someone recommend a tutorial that discusses more than one brand of database?
I'm asking these questions for several reasons. Since I'm in between gigs I'm starting a new entrepreneurial effort that might make me some money and (more importantly) look good on my resume and help me brush up my java/database skills. I'm in the process of selecting database vendors, cloud vendors and frameworks (JDBC, springboot or EJB or ?, hibernate-database-first or hibernate-code-first, IBatis etc...)
Here is one sample many to many implementation from Microsoft: Northwind and PUBS sample databases for MSSQL Server. The Pubs database contains an implementation of a authors to book-titles many-to-many relationship I have been playing with.
(Yes, these are microsoft samples, but I'm interested in discussing junction tables in the context of the popular databases including MySQL, Oracle, PostgresSQL and MS SQL Svr).
To simplify things a bit, lets assume that my junction table only needs to contain the foreign key for the book and the foreign key for the author.
(1) What are some good naming conventions for the tables? Do I call the the table author or authors? books or book? What do I call the junction table? book2Author? What about reverse engineering feature in hibernate? I just discovered that has some rules about capitalization but I have not found them yet.
(2) I tend to use the popular approach of auto-increment integers as the primary key (I typically call this field "ID") for my tables. Does using an auto increment field called ID make sense for a junction table? It seems superfluous to me (unless of course, this is a ternary relationship and the auto increment id field of the junction table is going to be a foreign key field in a fouth table).
(3) What should I define as the primary key of a junction table? I think I want the concatenation of the two foreign keys to be unique to preserve referential integrity. Is this true? If so, should I make the primary key consist of the two foreign keys? Yikes -- now I am creating an index structure exclusively for referential integrity that will only hinder performance -- correct? When would I ever use a junction table if I already knew both foreign keys? The only time I would use a junction table is if I know the author_id and want the (multiple) book_id or I know the book_id and want the (multiple) author_ids. Correct? So what is the recommendation? Do I create a primary key consisting of both foreign keys, an index of just the author_id and another index of just the book_id so that each time I insert a row into the junction table I have to update three index structures?
(4) What is the favorite approach to populating these tables with test data? Is there a vendor neutral approach to dealing with SQL insert statements and those pesky auto increment fields? MS SQL Server and Oracle have a feature to allow you to temporarily specify the (otherwise) auto increment id fields of books and authors where you can subsequently hard code these ID fields as insert statements into the junction table (for test data only, of course). Alternatively, they also have features for retrieving the newly generated ID field from the INSERT statement. I think MSSql has the @@IDENTITY feature for this and MySQL has the last_insert_id(). Is there vendor neutral way of doing this? Are there any java/JDBC/hibernate/IBatis features to help me write vendor neutral code?
(5) Now what about production code and transactions? Do I need a transaction for anyone one of the CRUD operations on table that is part of a many-to-many relationship?
(5a) How do I insert into the book table, fetch the newly created integer ID, insert into the author table, fetch the second newly created ID and insert the two new IDs into the junction table? Do I have to create a transaction for this? How do I fetch the newly created IDs? When populating the test data I can hard code the integer ids (maybe) but not in production.
(5b) How do I delete a book? Do I delete the entries in the junction table first and then the book table? Does this need a transaction? What about the cascading delete feature (ON DELETE CASCADE) in MySQL? Do I want to use it? Can JDBC or hibernate or Ibatis help me implement "ON DELETE CASCADE" in a vendor neutral way?
(5c) What about reading? Do I need a transaction to find all the authors of a given book?
(5d) If I am just updating a book, I believe I don't need a transaction (assuming I'm not adding any authors). Is this true? Maybe not! What if I'm trying to update a book that has a single author and someone else is simultaneously trying to delete that book using the cascading delete feature?
If you want to be vendor neutral, then using JPA seems like a good choice. You don't put any SQL statements anywhere, just mapping annotations and (if necessary) JPQL. The latter looks like SQL but is slightly different, and most importantly vendor neutral.
In the background there will be a mapping from annotations, JPQL etc to native SQL statements. That's all done by defining a proper dialect. You would need one for all database systems you want to support. With ORM configuration files you can override any annotation. You will need to do that for the generated primary key, because that's not the same for your database systems (unless you want to use a table for this). Most other things can be left as-is.