I would like to know the best practice for the following. I have two entities, Person and Address (for example, one Person has one Address). Each has its own auto-generated primary key (id). I create a Person with an Address and persist it. Then, weeks later, I create another Person with the same Address and persist that too. But JPA knows nothing about the two addresses being equal, so it generates a new, unique Address id, and the Address table now contains two records that differ only in the id (the primary key). How can I solve this? The behavior I want is: two records in the Person table, one record in the Address table, and both Person records referencing the same Address.
I think I should change the way the Address id is assigned: run a query that looks for an identical Address and, if one exists, just link it to the Person; if none exists, let the id be auto-generated as usual. The result would be two Persons and one Address. But this approach seems rather clumsy. Do you know a better practice?
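To make the idea concrete, here is a minimal in-memory sketch of that "find or create" step. It is not JPA code: the `AddressRegistry` class, its `Address` type, and the `street`/`city` fields are all hypothetical, and the map lookup only stands in for the query I would run against the Address table. In a real mapping the Person side would presumably have to be many-to-one so that several persons can point at one address.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Address table plus the "look it up first" query.
public class AddressRegistry {

    // Hypothetical, simplified Address: a generated id plus two descriptive fields.
    public static final class Address {
        public final long id;
        public final String street;
        public final String city;

        Address(long id, String street, String city) {
            this.id = id;
            this.street = street;
            this.city = city;
        }
    }

    // Maps the descriptive fields (the "natural key") to the stored Address.
    private final Map<String, Address> byNaturalKey = new HashMap<>();
    private long nextId = 1; // plays the role of the auto-generated primary key

    // Returns the existing Address with exactly these fields, or stores a new one.
    public Address findOrCreate(String street, String city) {
        String key = street + "|" + city;
        return byNaturalKey.computeIfAbsent(key, k -> new Address(nextId++, street, city));
    }

    // Number of stored Address records.
    public int size() {
        return byNaturalKey.size();
    }
}
```

Calling `findOrCreate` twice with identical field values returns the same `Address` object and leaves one stored record, which is the behavior I am after. Note that the match is an exact string comparison.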
In this particular case I think you should leave things the way they are. That is because it is very difficult to tell whether two addresses are "the same". There are companies that do nothing but clean up people's address databases by looking for cryptic duplicates.
For example you would need to treat addresses like "123 West 18th Ave" and "123 W 18th Avenue" and "123 18th Ave W" and many other variations as "the same". Perhaps you already have a rigorous standardization process applied to your addresses before they are input -- but I doubt it, because even this is very difficult to achieve.
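To illustrate how quickly this gets hard, here is a minimal sketch of a naive normalizer. The `AddressNormalizer` class and its tiny abbreviation table are made up for this example and are not taken from any library; a real cleanup process needs far more than this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical, deliberately naive normalizer: lowercases, strips '.' and ',',
// and replaces a handful of long forms with abbreviations, token by token.
public class AddressNormalizer {

    // Assumed abbreviation table for illustration only.
    private static final Map<String, String> ABBREVIATIONS = Map.of(
            "west", "w",
            "east", "e",
            "north", "n",
            "south", "s",
            "avenue", "ave",
            "street", "st");

    public static String normalize(String raw) {
        List<String> tokens = new ArrayList<>();
        for (String token : raw.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            String cleaned = token.replaceAll("[.,]", "");
            if (cleaned.isEmpty()) {
                continue; // skip tokens that were only punctuation or empty input
            }
            tokens.add(ABBREVIATIONS.getOrDefault(cleaned, cleaned));
        }
        // Token order is preserved, so reordered addresses still differ.
        return String.join(" ", tokens);
    }
}
```

With this sketch, "123 West 18th Ave" and "123 W 18th Avenue" both normalize to "123 w 18th ave", but "123 18th Ave W" becomes "123 18th ave w" and is still treated as a different address. Handling word order, misspellings, unit numbers and so on is where the real difficulty starts.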
(Note that I'm specifically talking about Address entities here; for other entities you might want to do what you suggested and eliminate duplicates.)