aspose file tools*
The moose likes Object Relational Mapping and the fly likes JPA @GeneratedValue killed by MySQL bug 199 Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login
JavaRanch » Java Forums » Databases » Object Relational Mapping
Bookmark "JPA @GeneratedValue killed by MySQL bug 199" Watch "JPA @GeneratedValue killed by MySQL bug 199" New topic
Author

JPA @GeneratedValue killed by MySQL bug 199

Per Lindberg
Ranch Hand

Joined: Jan 17, 2008
Posts: 48
A canonical design is to provide each Entity with a unique id and create the id column with AUTO_INCREMENT.

But this isn't safe with MySQL, since it doesn't store the next free id, just re-calculates it at restart as (highest + 1). So if you delete the latest entity, restart MySQL and then create a new entity, the id will be re-used. This is a bug in MySQL (#199) that has been known since at least 2003. The MySQL folks appears to have just documented the behaviour and then ignored it.

This can have far-reaching and catastrophic consequences, since id:s are not guaranteed to be unique. (Think "data warehouse with snowflake schema").

Is there a nice way to tell JPA2 to circumvent this horror? For example by using a dedicated table for 'next free id:s'? Or something?

Here's the way I do it now:

@Id
@Column(name="id")
@GeneratedValue(strategy = GenerationType.IDENTITY)
public int getId() { return id; }
public void setId(int id) { this.id = id; }

CREATE TABLE IF NOT EXISTS foo (
id INT(10) UNSIGNED NOT NULL PRIMARY KEY AUTO_INCREMENT,
...
) ENGINE=InnoDB;

James Sutherland
Ranch Hand

Joined: Oct 01, 2007
Posts: 553
Don't use identity sequencing. Use TABLE sequencing.

I would never recommend IDENTITY sequencing, it does not support pre-allocation, which impacts your code and performance.

See,
http://en.wikibooks.org/wiki/Java_Persistence/Identity_and_Sequencing#Sequencing

TopLink : EclipseLink : Book:Java Persistence : Blog:Java Persistence Performance
Per Lindberg
Ranch Hand

Joined: Jan 17, 2008
Posts: 48
Thanks, that's the kind of answer I was hoping for!
However, sometimes I do mass updates with millions of new records. The note about potential concurrency problems in the Wikibooks article sounds scary. Perhaps a separate id table for each id is needed for such cases...?
James Sutherland
Ranch Hand

Joined: Oct 01, 2007
Posts: 553
If you are using EclipseLink, you can use a sequence connection pool to avoid any possible concurrency issues with the sequence table (sequence access will always be in its own transaction).
Also use a large sequence pre-allocation size.
Per Lindberg
Ranch Hand

Joined: Jan 17, 2008
Posts: 48
James Sutherland wrote:If you are using EclipseLink, you can use a sequence connection pool...

Does this mean that I must set up both another Persistence Unit (in persistence.xml) and another JDBC Connection pool (using asadmin add-resources foo.xml with an extra <jdbc-connection-pool non-transactional-connections="true" ...> plus an extra <jdbc-resource>)?

James Sutherland wrote:Also use a large sequence pre-allocation size.

Right, at least for tables that gets losts of inserts (like my case with millions of inserts). However, the EclipseLink default value of allocationSize appears to be 50, which even with millions of inserts at a time would give a negligable overhead of 1:50 (2%), I guess. And in the case of tables with few and seldom inserts, allocationSize=1 seems nice to me, otherwise each redeploy or restart would bump the next id with 50.
James Sutherland
Ranch Hand

Joined: Oct 01, 2007
Posts: 553
For a sequence connection pool you don't need another persistence unit. You do need a non-JTA data-source, but most servers allow you to use the same connection pool for JTA and non-JTA.

For the pre-allocation, a bigger size is recommend if you do lost of inserts. Note that for IDENTITY sequencing the id must be selected on every insert, so your ratio of 1:50 would be 50:50 for IDENTITY, TABLE sequencing is much more efficient.

Not sure why you would be worried about losing 50 sequence numbers on a restart?
Per Lindberg
Ranch Hand

Joined: Jan 17, 2008
Posts: 48
James Sutherland wrote:For a sequence connection pool you don't need another persistence unit. You do need a non-JTA data-source, but most servers allow you to use the same connection pool for JTA and non-JTA.


Aha, ok. So in the example below, I just have to change non-transactional-connections from "false" to "true", and that's it?

<jdbc-connection-pool
name="fpdb-pool"
datasource-classname="com.mysql.jdbc.jdbc2.optional.MysqlConnectionPoolDataSource"
res-type="javax.sql.DataSource"
non-transactional-connections="false"
allow-non-component-callers="true">
<property name="user" value="..."/>
<property name="password" value="..."/>
<property name="port" value="3306"/>
<property name="databaseName" value="fpdb"/>
<property name="serverName" value="localhost"/>
<property name="url" value="jdbc:mysql://localhost:3306/fpdb?useUnicode=true&characterEncoding=utf8&characterResultSets=utf8"/>
</jdbc-connection-pool>

<jdbc-resource pool-name="fpdb-pool" jndi-name="jdbc/fpdb" enabled="true" object-type="user"/>


James Sutherland wrote:For the pre-allocation, a bigger size is recommend if you do lost of inserts. Note that for IDENTITY sequencing the id must be selected on every insert, so your ratio of 1:50 would be 50:50 for IDENTITY, TABLE sequencing is much more efficient.


Right!

James Sutherland wrote:Not sure why you would be worried about losing 50 sequence numbers on a restart?


This could be debated at length. :-) Allow me to just say 'because of aestetics'. I like it. And it does help somewhat when troubleshooting. Also, in my case the MySQL server is restarted quite often. Sure, with tables that gets zillions of updates at a time, that's another matter entirely. No argument there.

Bill Gorder
Bartender

Joined: Mar 07, 2010
Posts: 1676
    
    7

Have a read of this blog. I thought it was pretty good.

http://blog.eyallupu.com/2011/01/hibernatejpa-identity-generators.html


[How To Ask Questions][Read before you PM me]
Per Lindberg
Ranch Hand

Joined: Jan 17, 2008
Posts: 48
James Sutherland wrote:For a sequence connection pool you don't need another persistence unit. You do need a non-JTA data-source, but most servers allow you to use the same connection pool for JTA and non-JTA.


Per Lindberg wrote:
Aha, ok. So in the example below, I just have to change non-transactional-connections from "false" to "true", and that's it?


Hmm...
The Wikibooks article (under 'Concurrency and deadlocks') says that:

"...if you use a JTA data-source connection, it is important to also include a non-JTA data-source connection in your persistence.xml."

So, don't I need both a changed jdbc-connection-pool (with non-transactional-connections set to true) and an extra
<non-jta-data-source>jdbc/whatever</non-jta-data-source>under <persistenced-unit name="blah"> in persistence.xml?
Per Lindberg
Ranch Hand

Joined: Jan 17, 2008
Posts: 48
Curiouser and curiouser!

You're right that I don't need another Persistence Unit, Glassfish will automagically create one extra for me:
http://www.java.net/node/676242

Now, according to the EclipseLink FAQ, you must "Ensure a sequence connection pool is being used if using TABLE sequencing":
(http://wiki.eclipse.org/EclipseLink/FAQ/JPA#How_to_diagnose_and_resolve_hangs_and_deadlocks.3F)

and you must also configure that special extra separate connection pool as a eclipselink.connection-pool.sequence.

(http://www.eclipse.org/eclipselink/api/2.3/org/eclipse/persistence/config/PersistenceUnitProperties.html#CONNECTION_POOL_SEQUENCE)

Or... should I just reconfigure my existing connection pool...?
James Sutherland
Ranch Hand

Joined: Oct 01, 2007
Posts: 553
Yes, you should configure a sequence connection pool that uses a non-JTA DataSource.
Per Lindberg
Ranch Hand

Joined: Jan 17, 2008
Posts: 48
James Sutherland wrote:Yes, you should configure a sequence connection pool that uses a non-JTA DataSource.


Yes, but do you mean a an extra separate connection pool, or could I just re-configure my existing connection pool to be non-transactional-connections="true"?
James Sutherland
Ranch Hand

Joined: Oct 01, 2007
Posts: 553
That depends on your server. I know in WebLogic you can, I'm not sure on others.

You could have a dedicated connection pool for sequencing, it would not need many connections in it if you are using preallocation.
Per Lindberg
Ranch Hand

Joined: Jan 17, 2008
Posts: 48
I have now received a good reply from the GlassFish Server Documentation Team and Engineering Team:

"GlassFish Server always provides both a transactional version (JTADataSource) and a nontransactional version (NonJTADataSource) of a data source to a JPA provider. Typically, users specify only JTADataSource and GlassFish Server uses an internal API to obtain a NonJTADataSource version of it.

It should be OK to just change the jdbc-connection pool attribute non-transactional-connections to "false", which is the default value for this attribute. "

So your guess is correct for Glassfish too, James. Thanks!
 
I agree. Here's the link: http://aspose.com/file-tools
 
subject: JPA @GeneratedValue killed by MySQL bug 199