How to avoid duplicate inserts in JPA HQL (EJB 3.0 - Java EE 5)

Jack Bush
Ranch Hand

Joined: Oct 20, 2006
Posts: 235
Hi All,

I would like to find out how to efficiently avoid duplicate inserts in JPA (EJB 3.0 - Java EE 5) & HQL. Below are the relevant EAR components, set up in NetBeans 6.7 on Windows XP:

import java.util.Date;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import javax.ejb.Stateless;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import javax.persistence.Query;

@Stateless
public class CustomerBean implements CustomerRemote {

    @PersistenceContext(unitName="CustomerProducer-ejbPU")
    private EntityManager manager;

    public void createCustomer(Customer customer) {
        manager.persist(customer);
    }

    public Customer findCustomer(String firstname, String surname, String sex, Date dob) {
        // Look the customer up by its composite primary key
        CustomerPK pk = new CustomerPK(firstname, surname, sex, dob);
        return manager.find(Customer.class, pk);
    }

    public List fetchCustomersWithRelationships() {
        Query query = manager.createQuery("SELECT DISTINCT c FROM Customer c LEFT JOIN FETCH c.phones");
        List results = query.getResultList();
        // Copy into a set so that each customer is printed only once
        HashSet set = new HashSet();
        set.addAll(results);
        Iterator it = set.iterator();
        while (it.hasNext()) {
            Customer c = (Customer) it.next();
            System.out.print(c.getFirstname() + " " + c.getSurname() + " " + c.getSex() + " " + c.getDob());
            for (Phone phone : c.getPhones()) {
                System.out.print(" " + phone.getNumber());
            }
            System.out.println("");
        }
        return results;
    }
}

import javax.ejb.EJB;

public class CustomerAppClient {

    @EJB
    private static CustomerRemote remoteCustomerbean;

    public static void main(String[] args) {
        // Create new Customer
        Customer customer = new Customer();
        customer.setFirstname("Jack");
        customer.setSurname("Bush");
        ...
        remoteCustomerbean.createCustomer(customer);
        ....
    }
}
It appears that inserting duplicate records without checking for their existence results in extensive SQL exceptions and also breaks the original data (I couldn't easily remove the table by undeployment or manual deletion), even though it doesn't prevent the code from completing. As a result, I am wondering whether there is a better approach, such as loading all the records in the database into an ArrayList so that new records can be screened on the application client side before being added to the DB. The challenge I am encountering is that fetchCustomersWithRelationships() crashes with a buffer overflow, as follows, unless a minimum & maximum record is specified:

[java] javax.ejb.EJBException: nested exception is: java.rmi.MarshalException: CORBA MARSHAL 1398079699 Maybe; nested exception is:
[java] org.omg.CORBA.MARSHAL: vmcid: SUN minor code: 211 completed:Maybe
[java] at ejb._CustomerRemote_Wrapper.fetchCustomersWithRelationships(ejb/_CustomerRemote_Wrapper.java)
[java] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[java] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[java] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[java] at java.lang.reflect.Method.invoke(Method.java:597)
[java] at com.sun.enterprise.util.Utility.invokeApplicationMain(Utility.java:266)
[java] at com.sun.enterprise.appclient.MainWithModuleSupport.<init>(MainWithModuleSupport.java:449)
[java] at com.sun.enterprise.appclient.MainWithModuleSupport.<init>(MainWithModuleSupport.java:259)
[java] at com.sun.enterprise.appclient.Main.main(Main.java:200)
[java] Exception in thread "main" java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
[java] at com.sun.enterprise.appclient.MainWithModuleSupport.<init>(MainWithModuleSupport.java:461)
[java] at com.sun.enterprise.appclient.MainWithModuleSupport.<init>(MainWithModuleSupport.java:259)
[java] at com.sun.enterprise.appclient.Main.main(Main.java:200)
[java] Caused by: java.lang.reflect.InvocationTargetException
[java] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[java] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
[java] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
[java] at java.lang.reflect.Method.invoke(Method.java:597)
[java] at com.sun.enterprise.util.Utility.invokeApplicationMain(Utility.java:266)
[java] at com.sun.enterprise.appclient.MainWithModuleSupport.<init>(MainWithModuleSupport.java:449)
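
For illustration, the "minimum & maximum record" workaround can be turned into a paging loop on the client, so that no single remote call has to marshal the whole table. The following is only a minimal sketch in plain Java: the PageSource interface and the fetchAll helper are hypothetical names that do not appear in the posted code. On the server side, a paged business method would presumably delegate to Query.setFirstResult() and Query.setMaxResults(), which are part of the JPA Query interface.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical paging helper; the names are illustrative and not from the posted code.
final class PagedFetchSketch {

    // Assumed shape of a paged remote call, e.g. a fetchCustomers(first, max) business method.
    interface PageSource<T> {
        List<T> fetchPage(int firstResult, int maxResults);
    }

    // Pulls fixed-size pages until a short (or empty) page signals the end of the data.
    static <T> List<T> fetchAll(PageSource<T> source, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<T> all = new ArrayList<T>();
        int first = 0;
        while (true) {
            List<T> page = source.fetchPage(first, pageSize);
            all.addAll(page);
            if (page.size() < pageSize) {
                return all; // last page reached
            }
            first += pageSize;
        }
    }
}
```

The page size is a tuning knob: it would need to be small enough that one page of customers plus their phones fits comfortably in a single remote response.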

Thanks in advance,

Jack






Krum Bakalsky
Ranch Hand

Joined: Mar 14, 2010
Posts: 46
Just to be clear:

As far as I can see, you are using container-managed transaction demarcation, right ?
And you are using the default REQUIRED attribute, meaning that each method must be executed within a JTA transaction.
And you are using a container-managed, transaction-scoped EntityManager, meaning that its persistence context will live exactly as long as the JTA transaction lives.

I see that you are using an application client. Do you start a transaction from it ? In what transactional context do you invoke your findCustomer() business method ?
If the answer is "in none", then invoking this method will start a new transaction, but at the end of the method the transaction will end, and the returned entity instance will be detached.
(Actually, it will be detached anyway, just because you are returning it through a remote business interface, meaning that it will be serialized by the container and deserialized on the client side.)

Are you sure that you are using everything in the right way ?

Since you are passing your entities through a remote business interface, is your entity class implementing java.io.Serializable ?


Krum Bakalsky
Ranch Hand

Joined: Mar 14, 2010
Posts: 46
Since we are talking about duplicate inserts, I suppose that you mean "duplicate primary key" problems ?
In this case, you should just take care to specify an adequate primary key generation strategy for your entities.
You can choose TABLE or SEQUENCE.

The only problem will come when a transaction is rolled back, since that will detach your entities:

If there is a new entity that uses automatic primary key generation, there may be a
primary key value assigned to the detached entity. If this primary key was generated
from a database sequence or table, the operation to generate the number may have been
rolled back with the transaction. This means that the same sequence number could be
given out again to a different object. Clear the primary key before attempting to persist
the entity again, and do not rely on the primary key value in the detached entity.


From "Pro EJB 3: Java Persistence API"
Tim Holloway
Saloon Keeper

Joined: Jun 25, 2001
Posts: 16020
    

We have a nice little magic button labeled "Code" in our message editor. If you use that when including code samples, the editor won't reformat your code.

Normally, I want a duplicate record exception to be returned when I attempt to insert a record with a duplicate key (persist). If I want a saveOrUpdate, I use the merge() method, instead, so that the older version of the record is replaced if it exists or if not, a new record is inserted.

Key Generators in JPA can be used to ensure that a truly unique record ID is created.
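
The difference between the two behaviours described above can be modelled with a small in-memory map. This is an analogy only, not the JPA API: the class name, the map, and the IllegalStateException are assumptions made for illustration, and a real provider may report the duplicate at flush or commit time rather than immediately.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory analogy only: this is NOT the JPA API, just a model of the two behaviours.
final class PersistVsMergeSketch<K, V> {
    private final Map<K, V> rows = new HashMap<K, V>();

    // Models persist(): a duplicate key is an error, never a silent overwrite.
    void persist(K key, V value) {
        if (rows.containsKey(key)) {
            // Assumption: stands in for the provider's duplicate-key exception.
            throw new IllegalStateException("Duplicate key: " + key);
        }
        rows.put(key, value);
    }

    // Models merge(): insert when the key is absent, replace the old state when present.
    void merge(K key, V value) {
        rows.put(key, value);
    }

    V find(K key) {
        return rows.get(key);
    }

    int size() {
        return rows.size();
    }
}
```

In this model, calling persist twice with the same key fails on the second call, while calling merge twice leaves a single row holding the newer value.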


Jack Bush
Ranch Hand

Joined: Oct 20, 2006
Posts: 235
Gentlemen,

Thank you for responding to this post.

Krum Bakalsky wrote:

( a ) >As far as I can see, you are using container-managed transaction demarcation, right ?
Yes.

( b ) >And you are using the default REQUIRED attribute, meaning that each method must be executed within a JTA transaction.
Yes.

( c ) >And you are using a container-managed, transaction-scoped EntityManager, meaning that its persistence context will live exactly as long as the JTA transaction lives.
I just learnt that from you. Should I be managing the persistence context myself so that it remains available independently of the JTA transaction?

( d ) >I see that you are using an application client. Do you start a transaction from it ?
No. I leave all transactions to be managed by JTA. Is this not the best approach? What would you suggest?

( e ) > In what transactional context do you invoke your findCustomer() business method ?
I am not sure. I thought the container-managed transaction takes care of it. The last thing I want, though, is to run findCustomer() for every record to be added to the database. How could this be done more efficiently?

( f ) > If the answer is "in none", then invoking this method will start a new transaction, but at the end of the method the transaction will end, and the returned entity instance will be detached.
> (Actually, it will be detached anyway, just because you are returning it through a remote business interface, meaning that it will be serialized by the container and deserialized on the client side.)

The following error occurred when trying to look up the Customer bean through a local business interface from within an application client:

The exception message is: CLI171 Command deploy failed : Deploying application in domain failed; Error loading deployment descriptors for module [CustomerConsumer] -- Target ejb CustomerBean for remote ejb 3.0 reference client.localCustomerproducer/localCustomerbean does not expose a remote business interface of type ejb.CustomerLocal

Do I have a choice?

( g ) > Are you sure that you are using everything in the right way ?

I am open to recommendations on how to improve the current design. I want to distribute the system load by separating the GlassFish AS from the application client, which is responsible for gathering & preparing the objects to be persisted.

( h ) > Since you are passing your entities through a remote business interface, is your entity class implementing java.io.Serializable ?
Yes.

( i ) > Since we are talking about duplicate inserts, I suppose that you mean "duplicate primary key" problems ?
Yes, I am referring to duplicate primary key classes/composite keys.
> In this case, you should just take care to specify an adequate primary key generating strategy for your entities.
> You can choose TABLE or SEQUENCE.

I am using IDENTITY (MySQL).
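
Since the duplicates here are duplicate composite keys, the key class's equals() and hashCode() decide what counts as "the same record". Below is a minimal plain-Java sketch of that equality contract and of screening candidate keys against known ones. CustomerKeySketch and onlyNew are hypothetical names; the field list is assumed from the findCustomer() signature, because the thread never shows CustomerPK. A real JPA primary key class would additionally need to be public and have a public no-arg constructor, which this sketch leaves out.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Date;
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for CustomerPK; fields are assumed from findCustomer(...).
final class CustomerKeySketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String firstname;
    private final String surname;
    private final String sex;
    private final Date dob;

    CustomerKeySketch(String firstname, String surname, String sex, Date dob) {
        this.firstname = firstname;
        this.surname = surname;
        this.sex = sex;
        this.dob = new Date(dob.getTime()); // defensive copy, Date is mutable
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof CustomerKeySketch)) {
            return false;
        }
        CustomerKeySketch k = (CustomerKeySketch) other;
        return Objects.equals(firstname, k.firstname)
                && Objects.equals(surname, k.surname)
                && Objects.equals(sex, k.sex)
                && Objects.equals(dob, k.dob);
    }

    @Override
    public int hashCode() {
        return Objects.hash(firstname, surname, sex, dob);
    }

    // Returns the candidate keys that are neither in 'existing' nor repeated among the candidates.
    static List<CustomerKeySketch> onlyNew(Set<CustomerKeySketch> existing,
                                           List<CustomerKeySketch> candidates) {
        Set<CustomerKeySketch> seen = new HashSet<CustomerKeySketch>(existing);
        List<CustomerKeySketch> fresh = new ArrayList<CustomerKeySketch>();
        for (CustomerKeySketch key : candidates) {
            if (seen.add(key)) { // add() returns false when the key is already present
                fresh.add(key);
            }
        }
        return fresh;
    }
}
```

Screening by key only needs the key columns on the client, not the full entities with their phone collections, which keeps the amount of data transferred small.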

( j ) > The only problem will come when a transaction is rolled back, since it will detach your entities:
> If there is a new entity that uses automatic primary key generation, there may be a
> primary key value assigned to the detached entity. If this primary key was generated
> from a database sequence or table, the operation to generate the number may have been
> rolled back with the transaction. This means that the same sequence number could be
> given out again to a different object. Clear the primary key before attempting to persist
> the entity again, and do not rely on the primary key value in the detached entity.

Will the IDENTITY strategy experience the same issue, and hence need the same precaution to avoid this pitfall?


Tim Holloway wrote:
( k ) > We have a nice little magic button labeled "Code" in our message editor. If you use that when including code samples, the editor won't reformat your code.
I have been able to use this function in the past. However, this time it converted the document to XML format in the preview, so I didn't use it, but I will try again next time.

( l ) > If I want a saveOrUpdate, I use the merge() method, instead, so that the older version of the record is replaced if it exists or if not, a new record is inserted.
merge() has eliminated the need to check for the existence of each record, which is half of the battle. At the same time, I would like to avoid a saveOrUpdate of a record (a waste of time & resources) if it already exists and hasn't changed. Is this possible?

In short, I have no experience with managing transactions and am looking for advice. The current design came from working examples but may not be the best one. I am also looking for some pros & cons of container-managed (JTA) versus non-container-managed transactions.

Thanks again,
Jack





Krum Bakalsky
Ranch Hand

Joined: Mar 14, 2010
Posts: 46
Hi Jack,

As far as transaction control is concerned, everything depends on the application's needs.
In the majority of cases, container-managed transactions (CMT) should be chosen, since the
container takes care of a lot of things for you. If you use bean-managed transactions (BMT),
it is assumed that the developer understands the JTA API well and can meet his needs
without the help of the container. In some books the authors even go further, stating that
only very experienced developers should use bean-managed transactions.

However, BMT is inevitable if you want, for instance, to trigger a transaction from the
web tier. CMT is only for EJBs. So there are scenarios in which BMT is the only
possible way.

If you want advice, please explain exactly what you are trying to achieve.

(c) You can achieve a persistence context that outlives the JTA transactions in two ways:
using an EXTENDED persistence context with a stateful session bean;
using an application-managed persistence context (in a Java EE environment you can obtain one
through a @PersistenceUnit annotation).


(d) There is no problem with not starting a transaction from the client. But you will have to explain your goals if you want to be advised which approach is more suitable.

(f) The app client deployment is most probably complaining because the business interface types are mixed up. Check them once again.
(If your app client lives in a separate JVM, then you must use remote interfaces only.)

Take a look at this: http://stackoverflow.com/questions/848675/ejb-annotation-in-clients

Are you sure that you have correctly declared your client code as a valid application client ? (Maybe this is done through some deployment descriptors, I don't know.) If the container does not recognize it as an app client, then dependency injection is not available and you must use a lookup instead.
http://javahowto.blogspot.com/2009/10/sample-application-clientxml-java-ee-5.html


Why don't you try to look up the remote interface in the client anyways, instead of having it injected ?


Something to add about the IDENTITY strategy:

Another difference, hinted at earlier, between using IDENTITY and other id generation
strategies is that the identifier will not be accessible until after the insert has occurred. While no
guarantee is made as to the accessibility of the identifier before the transaction has completed,
it is at least possible for other types of generation to eagerly allocate the identifier, but when
using identity, it is the action of inserting that causes the identifier to be generated. It would be
impossible for the identifier to be available before the entity is inserted into the database, and
because insertion of entities is most often deferred until commit time, the identifier would not
be available until after the transaction has been committed.


This is from "Pro EJB 3: Java Persistence API" book.
Krum Bakalsky
Ranch Hand

Joined: Mar 14, 2010
Posts: 46
http://forums.netbeans.org/post-47321.html&highlight=

Other people have had the same problem.
Jack Bush
Ranch Hand

Joined: Oct 20, 2006
Posts: 235
Hi Krum,

Thank you for such a detailed explanation of the questions posted so far.

Krum Bakalsky wrote:

( c ) > You can achieve a persistence context that outlives the JTA transactions in two ways:
> using an EXTENDED persistence context with a stateful session bean;
> using an application-managed persistence context (in a Java EE environment you can obtain one
> through a @PersistenceUnit annotation).

It looks like the @PersistenceUnit annotation is a better fit than an EXTENDED persistence context, which is only available to stateful session beans.

( d ) > However, BMT is inevitable if you want, for instance, to trigger a transaction from the
> web tier. CMT is only for EJBs. So there are scenarios in which BMT is the only
> possible way.
> There is no problem with not starting a transaction from the client. But you will have to explain your goals if you want to be advised which approach is more suitable.

What is the purpose of triggering a transaction from web tier/client? Will JTA be supported in BMT in EJB 3.1 or later? My goal is to spread the load of low volume persistence transactions & ad-hoc reporting across multiple Intel hardware. It appears that JTA is sufficient for my requirement, especially for someone who is not familiar with JTA API.

( f ) > The app client deployment is most probably complaining because the business interface types are mixed up. Check them once again.
> (If your app client lives in a separate JVM, then you must use remote interfaces only.)

I forgot to mention in my previous response that I had replaced CustomerRemote with a CustomerLocal interface just to demonstrate that the app client cannot look up a local interface, since it belongs to a separate container even though the whole NetBeans EAR project (EJB & app client) runs in the same JVM, as stated by the Java EE 5 & GlassFish FAQ. What is interesting is that it is not necessary to use the Main class in the app client (I didn't), but I got the same outcome on GlassFish, i.e. it can reference the remote interface but not the local one.

It is a shame that Java EE 5 doesn't allow implementing both remote & local interfaces with the same methods. I am not sure why this was done or whether it will change in the future.

> Take a look at this: http://stackoverflow.com/questions/848675/ejb-annotation-in-clients

I can't quite follow what it is trying to do at a quick glance. It is hard to tell whether the app client was created properly either.

> Are you sure that you have correctly declared your client code as a valid application client ? (Maybe this is done through some deployment descriptors, I don't know.) If the container does not recognize it as an app client, then dependency injection is not available and you must use a lookup instead.

There is nothing wrong with the app client generated from a Java EE 5 (EAR) project in Netbeans 6.7 on Windows XP. I have been able to create numerous working projects using this method.

> http://javahowto.blogspot.com/2009/10/sample-appli...ation-clientxml-java-ee-5.html

I only use EJB 3.0 annotation instead of application-client.xml deployment descriptor.

> Why don't you try to look up the remote interface in the client anyways, instead of having it injected ?

There is no point, since I haven't defined it.

> http://forums.netbeans.org/post-47321.html&highlight=

The code looks fine, but I cannot tell how this project was created. I don't believe that it is a NetBeans issue, since I am very pleased with 6.7 on Windows.

( j ) > The only problem will come when a transaction is rolled back, since it will detach your entities:
> If there is a new entity that uses automatic primary key generation, there may be a
> primary key value assigned to the detached entity. If this primary key was generated
> from a database sequence or table, the operation to generate the number may have
> been rolled back with the transaction. This means that the same sequence number
> could be given out again to a different object. Clear the primary key before attempting
> to persist the entity again, and do not rely on the primary key value in the detached
> entity.

I am using primary-key class/composite keys as mentioned in ( i ) which does not support automatic primary key generation according to my understanding. This means there would be no chance of the database allocating primary key from recently detached objects in an IDENTITY strategy. Is this correct?


> Something to add about the IDENTITY strategy:
> Another difference, hinted at earlier, between using IDENTITY and other id generation
> strategies is that the identifier will not be accessible until after the insert has occurred. While no
> guarantee is made as to the accessibility of the identifier before the transaction has completed,
> it is at least possible for other types of generation to eagerly allocate the identifier, but when
> using identity, it is the action of inserting that causes the identifier to be generated. It would be
> impossible for the identifier to be available before the entity is inserted into the database, and
> because insertion of entities is most often deferred until commit time, the identifier would not
> be available until after the transaction has been committed.


Not an issue for me. Nevertheless, I am wondering whether you could elaborate on the pros & cons of each of these strategies.

Thanks,

Jack
Jack Bush
Ranch Hand

Joined: Oct 20, 2006
Posts: 235
Hi All,

It has been quite some time since I posted this thread, and I have since used entityManager.merge() to merge existing entities with some degree of success, as follows:

The code snippet for OneToMany Employee.java entity object is as follows:


The corresponding detail for ManyToOne Telephone.java entity object is:

I would like JPA to ignore (not overwrite) existing records and only add new, unique telephone numbers. persist() does that, except that it throws the following exceptions and continually retries inserting the duplicate records:

Local Exception Stack:
Exception [TOPLINK-4002] (Oracle TopLink Essentials - 2.1 (Build b60e-fcs (12/23/2008))): oracle.toplink.essentials.exceptions.DatabaseException
Internal Exception: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Duplicate entry 'Allan-Smith-20-M' for key 'PRIMARY'
Error Code: 1062
Call: INSERT INTO CorporationDB.EMPLOYEE (FIRSTNAME, SURNAME, AGE, SEX.....) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
bind => [Allan, Smith, 20, M,....]
Query: InsertObjectQuery(finance.Employee@1d2d80c)
JTS5054: Unexpected error occurred in after completion


On the other hand, entityManager.merge() ignores (does not add) duplicate Employee records (good), but it still duplicates rows in the ManyToOne TELEPHONE table. Is it because of the OneToMany(cascade={CascadeType.ALL}) property in Employee.java, which duplicates TELEPHONE records automatically? If so, how can I ensure that unique OneToMany (EMPLOYEE) & ManyToOne (TELEPHONE) records are added only once? Currently, a single entityManager.persist() is used to add both the EMPLOYEE & TELEPHONE entities unidirectionally.

entityManager.persist() seems to work fine except that it is continually erroring out instead of skipping to the next new record.
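
One way to get "skip to the next new record" behaviour is to catch the failure per record on the client and carry on. The sketch below is plain Java under stated assumptions: Inserter and insertAll are hypothetical names, each insert is assumed to be a separate remote call running in its own container-managed transaction (as discussed earlier in the thread), and the failure is assumed to surface on the client as a RuntimeException such as javax.ejb.EJBException.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical client-side helper; not part of the posted application.
final class SkipDuplicatesSketch {

    // Assumed stand-in for a remote call such as createCustomer(customer).
    interface Inserter<T> {
        void insert(T record); // expected to throw a RuntimeException on failure
    }

    // Tries every record and returns the ones whose insert failed, in input order.
    static <T> List<T> insertAll(List<T> records, Inserter<T> inserter) {
        List<T> rejected = new ArrayList<T>();
        for (T record : records) {
            try {
                inserter.insert(record);
            } catch (RuntimeException e) {
                // Assumption: the failure is confined to this record's own transaction.
                rejected.add(record);
            }
        }
        return rejected;
    }
}
```

Catching every RuntimeException is deliberately coarse for the sake of brevity; a real client would probably inspect the exception's cause and only skip records that failed with a constraint violation.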

This Java EE 5 application is running properly.

I am running JDK1.6.0_7, GF2.1 and MySQL on Windows XP.

Your advice would be much appreciated.

Many thanks,

Jack



 
 