Object replication

Balamaci Serban
Ranch Hand

Joined: Mar 16, 2005
Posts: 49
I was wondering if anyone has an idea how to proceed with this: you have an entity bean (or several), and when you persist it to storage, you want to persist it not in one database but in two. The fun part is that the second database is not on the same computer or the local network; it is reached over the internet. Actually there is a multitude of databases. To give an example: information kiosks are connected to a master database through the internet, each with an application server that builds a web page by reading products from a local database, which of course needs to be updated whenever the main database gets new products. The idea is not to direct all the (possibly simultaneous) read requests to the main database, since that would choke the connection (bandwidth). Only write operations, say when a client wants to register, go to the main database, so that we do not get conflicting updates.

Well, a perfect candidate for database replication, one would say, but the problem is that the main database site is Oracle and the satellites are MySQL.
Worse problem: the satellites might go down sometimes, and when they come back up again they need to resync.

Yeah, I know it's a total bitch.
Balamaci Serban
Ranch Hand

Joined: Mar 16, 2005
Posts: 49
Well my first thoughts:

Actually I'm quite a novice with EJBs and application servers, so I'm going to take it slow, although this is my grad project and I have only one month left to hand it in, so I'm not going to sleep anymore.
The first thing that popped into my mind was: hey, these are components. When a client writes to the EJB, I will simply act, in the storage method, as a client that does the same write on the EJBs on the other servers.
Yeah, that was easy and I quickly congratulated myself for this fast thinking. Well, WRONG. After reading some more about EJBs, it turns out that you can't spawn any threads inside an EJB. So my dreams of sending concurrent requests to the other EJBs so they would make the changes kind of crumbled. Because the calls have to be made one after another, each remote EJB has to wait for the one before it to finish, and what if that one is much slower than the rest? We delay too much. Not to mention that some of them may not be online; then we would have to record the change that should have been made and somehow rerun the invocations on that bean when it comes back online.
[ March 31, 2005: Message edited by: Balamaci Serban ]
Dave Clark
Ranch Hand

Joined: Feb 16, 2005
Posts: 52
One approach is to use JDO rather than Entity Beans; then you simply have 2 PersistenceManagerFactory instances, configured to talk to your 2 databases (it doesn't matter that one is on a LAN and the other over the 'net).

When you want to update the 2 databases, you simply ask both factories for an instance of a PersistenceManager, then ask each of the PersistenceManagers to store your object. You should do this within a transaction, which can be container managed if you do your persisting from within a Stateless Session Bean. The only other thing you need is XA-compliant database drivers for your 2 databases configured for your datasources used by JDO, and they'll work together for a 2-phase commit transaction.

It's not really possible to do this with Entity Beans, since they don't use PersistenceManagers - you'd have to have 2 separate Entity Bean classes and 2 separate mappings etc. Certainly not as clean a solution.

If you're really worried about network latency over the internet (I'd only worry if you measure the time and it's actually a problem), then you'll want to send a message to a JMS queue on one side and an MDB listener on the other side. That way you can write to the queue within the same transaction as your local DB update, but still guarantee that the remote DB will also get updated.

cheers,

Dave.


Dave Clark
Senior WebSphere Architect
Versant Open Access - JDO2 & EJB3 (http://www.versant.com)
Balamaci Serban
Ranch Hand

Joined: Mar 16, 2005
Posts: 49
Second thought:
Well, after some digging for a golden solution, I stumbled on something too good to be true, or at least something that I must have misunderstood, because otherwise there would be a whole lot more tutorials on the subject. The secret (cover it up so no one else can see it) is JMS. The good thing about JMS is that it guarantees a message will reach an address at some point in time, once and only once. Even if that address is not online, the message is kept safe in the database until the receiver comes back online. Not only that, but JMS says that you can write to a "Topic", and a message sent to that topic reaches all the clients registered to it. So it's a 1:N relationship.
So in the ejbStore method, after writing to the database, we write to the topic. The message is then automatically forwarded to the subscribers and everyone is happy, me even more so, because even if a client is down it will still receive that message.
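To make the 1:N delivery concrete, here is a minimal in-memory sketch of a topic that keeps a backlog for each registered subscriber. It is not the JMS API: the class `DurableTopic`, its methods and the kiosk names are hypothetical, invented for illustration. In JMS itself, a topic only holds messages for an offline subscriber if that subscriber has a durable subscription, which is the behaviour this model imitates.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** In-memory model of a topic with durable subscribers (hypothetical, not JMS). */
class DurableTopic {
    // One pending-message queue per registered subscriber name.
    private final Map<String, Queue<String>> pending = new LinkedHashMap<>();

    /** Registers a subscriber; from now on it gets a copy of every message. */
    void subscribe(String node) {
        pending.putIfAbsent(node, new ArrayDeque<>());
    }

    /** Fans one message out to every registered subscriber (1:N). */
    void publish(String message) {
        for (Queue<String> queue : pending.values()) {
            queue.add(message);
        }
    }

    /** Called when a node is online: hands over its backlog and clears it. */
    List<String> drain(String node) {
        List<String> delivered = new ArrayList<>();
        Queue<String> queue = pending.get(node);
        if (queue == null) {
            return delivered; // unknown subscriber: nothing was kept for it
        }
        while (!queue.isEmpty()) {
            delivered.add(queue.remove());
        }
        return delivered;
    }
}

class DurableTopicDemo {
    public static void main(String[] args) {
        DurableTopic topic = new DurableTopic();
        topic.subscribe("kiosk-A");
        topic.subscribe("kiosk-B");

        topic.publish("INSERT product 1");
        System.out.println("A receives: " + topic.drain("kiosk-A"));

        // kiosk-B is offline while the second update is published.
        topic.publish("INSERT product 2");
        System.out.println("A receives: " + topic.drain("kiosk-A"));

        // kiosk-B comes back and gets both updates, in publication order.
        System.out.println("B receives: " + topic.drain("kiosk-B"));
    }
}
```

Note that a node only receives messages published after it subscribed, so a node that joins late still has to obtain the earlier state some other way.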

The joy lasts only minutes, because not all clients are at the same point of replication:
I mean, let's consider that we have installed a fresh new node. This node must first get all the data that the primary site already contains. In database replication terms this is called a SNAPSHOT.
Now say that we have a method called takeSnapshot() in the entity bean which returns a vector of all the records. We have to call this method when the new node starts (that is a problem I have not solved yet, but if nothing else works we would have to start a program outside the application server and call the takeSnapshot method on the primary server over RMI).

Now the problem is: what should that method do? If it's like a getter, it returns a collection of all the records of the entity bean. OK, after it returns all the records, we write them to the secondary database and afterwards we attach ourselves to the JMS topic to listen for new changes to the data. But what if there were changes to the primary database while we were transmitting the data to the new site, or while we were writing it to the database? I mean, would the changes made between running takeSnapshot() and attaching to the JMS topic be lost?

A second approach comes to mind: we do not return the objects directly from takeSnapshot(), but create a JMS queue, which means that each message is sent to only one destination. We write to that queue all the SQL statements that make up the database up to that point. At the same time as, or immediately after, we call takeSnapshot(), we attach ourselves to the topic as well. When we begin to see the same messages coming from both the queue and the topic, it means that the two are finally synchronised and we can let the primary know to drop the queue. One problem is what to do with the messages that come from the topic during all this time, which we cannot yet commit to the database because we have not yet obtained the state the primary database was in before the snapshot.
Well, please let me know if you have a better solution or if you find any problems with this whole new idea. Transactions are not really an issue at this time.
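One possible way to handle the topic messages that arrive before the snapshot is loaded is sketched below. It is a variation on the idea above, not something proposed in this thread: the new node subscribes first and buffers what it hears, and the primary is assumed to stamp every update with an increasing sequence number and to report the last number contained in the snapshot. All class and method names here (`Primary`, `Replica`, `Update`, `Snapshot`, `load`) are hypothetical, and plain method calls stand in for JMS delivery.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One change on the primary, numbered in commit order (assumed numbering). */
final class Update {
    final long seq;
    final String key;
    final String value;
    Update(long seq, String key, String value) {
        this.seq = seq; this.key = key; this.value = value;
    }
}

/** Copy of the primary's rows plus the number of the last update it contains. */
final class Snapshot {
    final Map<String, String> rows;
    final long lastSeq;
    Snapshot(Map<String, String> rows, long lastSeq) {
        this.rows = rows; this.lastSeq = lastSeq;
    }
}

class Primary {
    private final Map<String, String> rows = new HashMap<>();
    private final List<Replica> subscribers = new ArrayList<>();
    private long seq = 0;

    synchronized void subscribe(Replica replica) { subscribers.add(replica); }

    /** Applies a write locally, then publishes it to every subscriber. */
    synchronized void write(String key, String value) {
        rows.put(key, value);
        Update update = new Update(++seq, key, value);
        for (Replica replica : subscribers) replica.onTopicMessage(update);
    }

    /** Copies rows and sequence number together so they are consistent. */
    synchronized Snapshot takeSnapshot() {
        return new Snapshot(new HashMap<>(rows), seq);
    }
}

class Replica {
    final Map<String, String> rows = new HashMap<>();
    private final List<Update> buffer = new ArrayList<>();
    private boolean live = false;
    private long appliedSeq = 0;

    /** Before the snapshot is loaded, updates are only buffered. */
    synchronized void onTopicMessage(Update update) {
        if (live) apply(update); else buffer.add(update);
    }

    /** Loads the snapshot, then replays the buffered updates it does not contain. */
    synchronized void load(Snapshot snapshot) {
        rows.clear();
        rows.putAll(snapshot.rows);
        appliedSeq = snapshot.lastSeq;
        for (Update update : buffer) apply(update);
        buffer.clear();
        live = true;
    }

    private void apply(Update update) {
        if (update.seq <= appliedSeq) return; // already part of the snapshot
        rows.put(update.key, update.value);
        appliedSeq = update.seq;
    }
}

class SnapshotCatchUpDemo {
    public static void main(String[] args) {
        Primary primary = new Primary();
        primary.write("p1", "lamp");            // before the kiosk exists

        Replica kiosk = new Replica();
        primary.subscribe(kiosk);               // step 1: listen first
        primary.write("p2", "chair");           // buffered by the kiosk
        Snapshot snapshot = primary.takeSnapshot(); // step 2: contains both writes
        primary.write("p1", "desk lamp");       // arrives while the snapshot is in transit
        kiosk.load(snapshot);                   // step 3: load, replay only the newer update
        primary.write("p3", "table");           // applied directly from now on

        System.out.println(kiosk.rows);
    }
}
```

Because updates already contained in the snapshot are recognised by their number and skipped, the new node does not need to compare queue and topic messages to decide when it is in sync.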
[ March 17, 2005: Message edited by: Balamaci Serban ]
Balamaci Serban
Ranch Hand

Joined: Mar 16, 2005
Posts: 49
Thanks Dave, I was writing the second post, which took a long time to get the ideas "on paper", when you posted, and that's why I did not reply to you in the 3rd post.
Well, transactions and two-phase commit did cross my mind, but the problem is that, whether we have more than 2 databases or only 2, the 2PC protocol says that they must all agree or the transaction fails (and it will fail if the second site is down), and multiplied N times, well, it would not go very well. The time it would take for all of them to agree would also be significant (I don't know at which level the records are locked, maybe even the whole table?). And I'm pretty sure that MySQL is not XA compliant (although that would not be a problem, since I can say that I use PostgreSQL).
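A rough way to put numbers on the "multiplied N times" worry: if every site must be reachable for a commit to go through, and outages are assumed to be independent, the chance that all N sites are up at once is the per-site availability raised to the power N. The 99% figure below is an assumed, illustrative value, not a measurement from this project.

```java
class TwoPhaseCommitAvailability {
    /**
     * Probability that all n participants are reachable at the same moment,
     * assuming each is up with the same probability and outages are independent.
     */
    static double allReachable(double perNodeAvailability, int n) {
        return Math.pow(perNodeAvailability, n);
    }

    public static void main(String[] args) {
        double perNode = 0.99; // assumed availability of one kiosk link
        for (int n : new int[] {1, 2, 10, 50}) {
            System.out.printf("%2d nodes -> all reachable with probability %.3f%n",
                    n, allReachable(perNode, n));
        }
    }
}
```

Under these assumptions, with 50 kiosks a commit that needs every participant would find all of them reachable only about 60% of the time.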

JDO, to my shame, I had never heard of, and I will have to look into it, but I'm confident I'll get up to speed on it (although I have to look more into JMS first). My main concerns until this grad project were Linux, PHP, C++ and plain Java. This J2EE stuff is really hard to swallow, and I haven't even got into AOP.

Well, on the 2nd solution, I'm glad that someone else thought of JMS and that the solutions almost converge.

How about, for simplicity's sake, I build a queue for every node to be replicated to? When the snapshot is requested, it fills up that node's queue, and every update is written to all the queues. But for performance's sake, I guess that would be a hit below the acceptable belt. Not to mention that, if the message queues have to be persisted, the main database would grow N times.
Hm...
[ March 17, 2005: Message edited by: Balamaci Serban ]
Dave Clark
Ranch Hand

Joined: Feb 16, 2005
Posts: 52
JDO = Java Data Objects - It's basically lightweight persistence for Plain Old Java Objects, rather than heavyweight components like Entity Beans that need to run inside an EJB container. A good starting point for learning about JDO is www.jdocentral.com

If you want to use JMS (again, I wouldn't worry about network latency unless you measure it to be a problem - lots of web services, for instance, run synchronously over the web), you may want to have a JMS server on your client and another JMS server on your remote DB server, and cluster them, so that messages sent to your local JMS server will automatically be propagated to your remote JMS server, and both your client and remote database can talk to a local queue.

cheers,

Dave.
Balamaci Serban
Ranch Hand

Joined: Mar 16, 2005
Posts: 49
Well, from the quick reading I've done on JDO, it seems they were an alternative to EJB, being, as you've said, more lightweight, and that they are reunited in EJB3, which from the specs I've seen is really nice to work with, but I can't figure out if you could tap into the function in which they are actually persisted. Maybe I'll leave them for tomorrow.
Also, for anyone interested in JMS, I put together what seem to be some nice resources to look into:

http://docs.jboss.org/jbossas/jboss4guide/r1/html/ch6.chapt.html
http://e-docs.bea.com/wls/docs61/jms/index.html
http://java.sun.com/products/jms/tutorial/1_3_1-fcs/doc/jms_tutorialTOC.html

Yeah, about the cluster: hmm, I'm not sure how many open-source solutions support that; maybe JBossMQ can. But why replicate the JMS server? We would be replicating the queues that we use to send messages to the other nodes, unnecessarily, onto a node that has no use for them. Moreover, we would just move the problem from database replication to JMS server replication, and we would have to stick to a single provider.
[ March 16, 2005: Message edited by: Balamaci Serban ]
Dave Clark
Ranch Hand

Joined: Feb 16, 2005
Posts: 52
not "they were an alternative to EJB", JDO **is** an alternative to EJB.

JDO is alive and well - the JDO 2.0 JSR was just accepted by the JCP a couple of weeks ago, and there are a couple of dozen commercial and open source implementations which work very well today, and will continue to work into the EJB 3 timeframe, which is still a year or more away.

A great JDO resource is www.jdocentral.com.

btw - Versant have a great JDO implementation also (I work for Versant) - www.versant.com,

cheers,

Dave
 