Distributed Caching

Anand Sid
Greenhorn

Joined: Dec 16, 2007
Posts: 7
Hi -
I am in the initial analysis phase of a project in which we are going with ORM. We have shortlisted TopLink as the provider.

"I would like to know what strategy is best suited for handling caching for descriptors that are updated quite often in a clustered environment, i.e., a change is made from server A while another server B may be reading the same data from its cache, which is now stale."

I used to work with ATG Repository, in which we used distributed JMS (SQLJMS). When we updated a descriptor and committed it to the database, we sent a JMS message to a JMS topic; the other servers in the cluster picked it up and invalidated the changed record in their caches. Should I be using a similar approach here?
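
The invalidation flow described above can be sketched in plain Java. This is only an illustration under stated assumptions: `InvalidationTopic` stands in for the JMS topic, `NodeCache` for one server's cache, and a shared map for the database. None of these names come from ATG or TopLink.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for a JMS topic: delivers each published key to every other subscriber.
class InvalidationTopic {
    private final List<NodeCache> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(NodeCache node) {
        subscribers.add(node);
    }

    void publish(NodeCache sender, String key) {
        for (NodeCache node : subscribers) {
            if (node != sender) {
                node.invalidate(key); // the sender already holds the new value
            }
        }
    }
}

// Hypothetical stand-in for one server: a local cache in front of a shared "database" map.
class NodeCache {
    private final Map<String, String> local = new ConcurrentHashMap<>();
    private final Map<String, String> database;
    private final InvalidationTopic topic;

    NodeCache(Map<String, String> database, InvalidationTopic topic) {
        this.database = database;
        this.topic = topic;
        topic.subscribe(this);
    }

    // Serve from the local cache, loading from the database on a miss.
    String read(String key) {
        return local.computeIfAbsent(key, database::get);
    }

    // Commit to the database first, then tell the other nodes to drop their copy.
    void write(String key, String value) {
        database.put(key, value);
        local.put(key, value);
        topic.publish(this, key);
    }

    void invalidate(String key) {
        local.remove(key);
    }
}
```

With two `NodeCache` instances sharing one map and one topic, a `write` on the first makes the next `read` on the second go back to the database instead of returning its stale copy.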
Mark Spritzler
ranger
Sheriff

Joined: Feb 05, 2001
Posts: 17249

Personally, the more transactional and changing the data is, the less I would cache it at all. Pretty soon all your resources go to updating all the servers with the latest data, leaving no processing power to support your clients, so the drag and slowness become worse than going to the database to get your data.

Data that remains the same gives the highest ROI when loaded into a distributed cache.

Now there are a good number of distributable caches out there, and I believe pretty much all of them should be usable with TopLink.

"We have shortlisted TopLink as the provider."

That looks more like you have chosen TopLink rather than shortlisted it.

Mark


James Sutherland
Ranch Hand

Joined: Oct 01, 2007
Posts: 553
TopLink (and EclipseLink) provide several features for handling stale cached data. Which feature you choose depends on how often your data is updated and what your application's level of tolerance for stale data is.

Regardless of which solution you choose I would recommend you use an Optimistic Locking Policy to ensure that no write will occur on stale data. Even if you are not caching at all this is still important.
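
To illustrate why a version check protects writes, here is a plain-Java sketch of the idea. It is not TopLink code: `OrderTable` and `VersionedRow` are made-up names, and the map merely mimics a table with a version column.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable copy of a row: its value plus the version it was read at.
class VersionedRow {
    final String value;
    final long version;

    VersionedRow(String value, long version) {
        this.value = value;
        this.version = version;
    }
}

// Hypothetical in-memory table with a version column.
class OrderTable {
    private final Map<Long, VersionedRow> rows = new HashMap<>();

    synchronized void insert(long id, String value) {
        rows.put(id, new VersionedRow(value, 1L));
    }

    synchronized VersionedRow read(long id) {
        return rows.get(id);
    }

    // Mimics "UPDATE ... SET value = ?, version = version + 1 WHERE id = ? AND version = ?".
    // Returns false when the caller's copy is stale, so the write is rejected.
    synchronized boolean update(long id, String newValue, long expectedVersion) {
        VersionedRow current = rows.get(id);
        if (current == null || current.version != expectedVersion) {
            return false;
        }
        rows.put(id, new VersionedRow(newValue, expectedVersion + 1));
        return true;
    }
}
```

A writer holding an outdated copy gets `false` back and has to re-read before trying again, whether its copy came from a cache or from an earlier query.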

If your data is frequently updated, and you want to avoid any stale data, then consider turning off the cache by using an Isolated cache in TopLink for that class. Note that in TopLink you can configure the cache type for each class, so read-only or less frequently updated classes could still use caching.

If you are less stringent about avoiding stale data, you could set a Cache Invalidation Policy in TopLink to time out cached data after a threshold given in milliseconds. Again, in TopLink this can be configured for each class, allowing different classes to have different thresholds.
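
The time-out idea can be sketched in plain Java as follows. This is a minimal illustration, not TopLink's implementation: `TimeToLiveCache`, its loader function, and the injectable clock are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical cache whose entries expire a fixed number of milliseconds after loading.
class TimeToLiveCache<K, V> {
    private static final class Entry<T> {
        final T value;
        final long loadedAtMillis;

        Entry(T value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long timeToLiveMillis;
    private final Function<K, V> loader;   // stands in for the database read
    private final LongSupplier clock;      // injectable so the behavior can be tested

    TimeToLiveCache(long timeToLiveMillis, Function<K, V> loader, LongSupplier clock) {
        this.timeToLiveMillis = timeToLiveMillis;
        this.loader = loader;
        this.clock = clock;
    }

    // Returns the cached value, reloading it once the entry is older than the threshold.
    synchronized V get(K key) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.get(key);
        if (entry == null || now - entry.loadedAtMillis >= timeToLiveMillis) {
            entry = new Entry<>(loader.apply(key), now);
            entries.put(key, entry);
        }
        return entry.value;
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`. The threshold bounds how stale a read can be; it does not prevent stale reads inside that window.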

Another option is to use refresh on the queries where you require up-to-date data. This can be used in combination with the TopLink descriptor setting onlyRefreshCacheIfNewerVersion() and optimistic locking so that the cached object is only refreshed when it has actually been updated.
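
The "refresh only if newer" rule can be sketched in plain Java like this. It is an illustration of the comparison only, not TopLink code: `VersionAwareCache` and `Snapshot` are made-up names, and the row passed to `refresh` is assumed to have just been read from the database by a refreshing query.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical copy of a database row together with its optimistic-lock version.
class Snapshot {
    final String value;
    final long version;

    Snapshot(String value, long version) {
        this.value = value;
        this.version = version;
    }
}

// Hypothetical cache that only replaces an entry when the database row is newer.
class VersionAwareCache {
    private final Map<Long, Snapshot> cached = new HashMap<>();

    // Applies a freshly read row; returns true if the cached copy was replaced.
    synchronized boolean refresh(long id, Snapshot fromDatabase) {
        Snapshot current = cached.get(id);
        if (current != null && fromDatabase.version <= current.version) {
            return false; // cached copy is already up to date, keep it
        }
        cached.put(id, fromDatabase);
        return true;
    }

    synchronized Snapshot get(long id) {
        return cached.get(id);
    }
}
```

The database read still happens on every refreshing query in this sketch; what the version comparison saves is the work of rebuilding the cached object when nothing has changed.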

If the data is less frequently updated, you can use TopLink Cache Coordination in a cluster (distributed caching). TopLink supports Cache Coordination over several protocols, including JMS, RMI, RMI-IIOP and CORBA. I would recommend using JMS. Be careful using this if the data is frequently updated, as it may degrade performance, but this depends on how fast the connection between the machines in the cluster is compared with the connection to the database. Again, in TopLink this can be configured for each class, allowing some classes to use a coordinated cache and others to use other mechanisms to handle stale data.


Anand Sid
Greenhorn

Joined: Dec 16, 2007
Posts: 7
Originally posted by Mark Spritzler:
Personally, the more transactional and changing the data is, the less I would cache it at all. Pretty soon all your resources go to updating all the servers with the latest data, leaving no processing power to support your clients, so the drag and slowness become worse than going to the database to get your data.

Data that remains the same gives the highest ROI when loaded into a distributed cache.


Thanks Mark. That's a very good point. I think I didn't ask the right question. What I have to look at is data that is not updated frequently but still has a chance of being updated.
Anyway, I don't have any concrete scenario at this point. I will post if I get one in the project. Thanks again.
Anand Sid
Greenhorn

Joined: Dec 16, 2007
Posts: 7
Originally posted by James Sutherland:
TopLink (and EclipseLink) provide several features for handling stale cached data. Which feature you choose depends on how often your data is updated and what your application's level of tolerance for stale data is.


Thanks James. That is a very good list you have put up.
At the end of the day, as Mark says, caching settings have to be set based on the way the data is accessed or updated.

Let me put together one simple example (in a clustered environment):
1. Let's say an employee in a company creates an Order for, say, stationery and sends it to his manager for approval.
2. The manager can either approve the order for further processing or reject it, with or without modifying the order.
3. In most cases, if the manager rejects the order, he would not change anything in it. But say in one scenario he changes some value in the order and rejects it back to the employee.
4. In this case, in a clustered environment, there is a possibility of the employee seeing stale data that doesn't have the details updated by the manager.

Based on the ways to avoid stale data in TopLink, in this case I can either go for onlyRefreshCacheIfNewerVersion or use Cache Coordination for the Order descriptor.
(For onlyRefreshCacheIfNewerVersion, I presume that a database call will always be sent to check the version column whenever I query.)
Can you tell me which will be appropriate for the above scenario?
Mark Spritzler
ranger
Sheriff

Joined: Feb 05, 2001
Posts: 17249

Well, I still don't know much of TopLink's API, but you might be able to write code in the manager approval step that evicts the data from the distributed cache.

I can give the Hibernate method call; then you would have to find the corresponding methods in TopLink.

In Hibernate, calling SessionFactory.evict(Class persistentClass, Serializable id) will remove the data from the "distributed second level cache".

Are you using any Business Process Modeling API like Jess or JBoss Rules?

Mark
syed aliarshad
Greenhorn

Joined: Jun 20, 2012
Posts: 11
Hello,

I guess I have the same situation here.

I am using TopLink as the ORM. I have also implemented caching using cache coordination.

Below are the lines that I have added to the toplink-sessions.xml file for caching.

------------------------------------------------------------------------------------------------------------------------

<remote-command>
  <commands>
    <cache-sync>true</cache-sync>
  </commands>
  <transport xsi:type="jms-topic-transport">
    <topic-host-url>tcp://jms.someurl.com:61616</topic-host-url>
    <topic-connection-factory-name>java:comp/env/jms/CacheTopicConnectionFactory</topic-connection-factory-name>
    <topic-name>java:comp/env/jms/topicname</topic-name>
    <jndi-naming-service>
      <initial-context-factory-name>org.apache.activemq.jndi.ActiveMQInitialContextFactory</initial-context-factory-name>
    </jndi-naming-service>
  </transport>
</remote-command>
------------------------------------------------------------------------------------------------------------------------


The above settings are working fine with one cache node. I want to turn this into a JMS cluster with more than one cache node. How can I achieve this?


Thanks
James Sutherland
Ranch Hand

Joined: Oct 01, 2007
Posts: 553
What server are you using? You just need to launch another server with the same settings.
syed aliarshad
Greenhorn

Joined: Jun 20, 2012
Posts: 11
I am using Tomcat.

I have 2 Tomcat nodes and 1 JMS cache node. I want to add one more JMS cache node for JMS clustering.

If we add one more URL, comma-separated, to the line given below, will it work?

<topic-host-url>tcp://someurl.com:61616</topic-host-url>

Thanks
James Sutherland
Ranch Hand

Joined: Oct 01, 2007
Posts: 553
No, topic-host-url is the URL of the node hosting the JMS topic. This must be the same host for all servers in the cluster.

You should not have to do anything different on either node. They should both connect to the same topic on the same JMS host and be in communication with each other.

What JMS implementation are you using?

syed aliarshad
Greenhorn

Joined: Jun 20, 2012
Posts: 11
We have 2 Tomcat nodes, but they share the same source code, so there is a single toplink-sessions.xml file, and we are adding one more cache node that will have the JMS broker set up.

Structure:
======
Tomcat nodes: 2
JMS cache nodes: 1 (need to add one more to make a cluster)

Thanks in advance.
James Sutherland
Ranch Hand

Joined: Oct 01, 2007
Posts: 553
Sorry, I thought you wanted to cluster the Tomcat instances, not the JMS server.

What JMS implementation are you using? Clustering a JMS server depends on your JMS implementation and whether it supports this. It should not affect the JPA configuration.
syed aliarshad
Greenhorn

Joined: Jun 20, 2012
Posts: 11
Sorry for the late reply.

We are using a JMS 1.1 implementation. We need to provide failover settings like failover:(tcp://primary:61616,tcp://secondary:61616)?randomize=false.

Our toplink-sessions.xml file looks like this:

<transport xsi:type="jms-topic-transport">
  <topic-host-url>tcp://URL:61616</topic-host-url>
  <topic-connection-factory-name>java:comp/env/jms/CacheTopicConnectionFactory</topic-connection-factory-name>
  <topic-name>java:comp/env/jms/toplinktopic</topic-name>
  <jndi-naming-service>
    <initial-context-factory-name>org.apache.activemq.jndi.ActiveMQInitialContextFactory</initial-context-factory-name>
  </jndi-naming-service>
</transport>


We also need to implement a shared-nothing master/slave strategy for HA. Could you please help with this?

Thanks
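
For reference, an ActiveMQ failover URI has the form `failover:(uri1,uri2,...)?options`. The helper below is a small sketch that assembles such a string from a list of broker URIs; `FailoverUri` is a made-up name, and the host names used with it are placeholders. Whether TopLink's topic-host-url element accepts a failover URI is not confirmed anywhere in this thread, so treat that part as an open question.

```java
import java.util.List;

// Hypothetical helper that assembles an ActiveMQ-style failover URI string.
class FailoverUri {
    // Builds "failover:(uri1,uri2,...)?randomize=<flag>" from the given broker URIs.
    static String build(List<String> brokerUris, boolean randomize) {
        if (brokerUris.isEmpty()) {
            throw new IllegalArgumentException("at least one broker URI is required");
        }
        return "failover:(" + String.join(",", brokerUris) + ")?randomize=" + randomize;
    }
}
```

With `randomize=false` the client tries the brokers in the order listed, which matches a primary and secondary setup.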
 