Design Help

 
Ranch Hand
Posts: 113
Hello All,

I am working on an application that needs some information that is global to the organization.

To get this global data, my application needs to call a set of services designed for that purpose (the integration layer services).

The problems are:
- This data is used widely in my application, particularly in the core module, which end users may be accessing all the time.
- The data is large (possibly several thousand records) and not as clean as we need, so my application has to process what the integration layer service returns.
- In some cases, processing the data means making an additional call to the integration layer for each record to get extra information. That is a huge overhead and could become a performance bottleneck.

Possible solutions:
- Rehosting the data: not preferred at all, because of data integrity. No one wants to manage the same data in two places.
- Caching: I am not quite sure how to do this. The cache could live in:
  - A flat file: I/O reads and writes are expensive operations.
  - A database: same as rehosting.
  - Memory, in application scope: I think the system could slow down.

With all of the above approaches, synchronization and data integrity would be the major problems.

In such a scenario, how should I design my application?

Your help would be greatly appreciated

Thank you
 
s penumudi
Ranch Hand
Posts: 113
Any ideas?
 
author
Posts: 14112
Well, I don't know what the best solution would be. Actually, I'd think that you will only know whether a solution works well enough after you have tried it...

So the only real advice I can give is to isolate your decision from the rest of the code. Put your caching/distribution/whatever strategy into an isolated layer and don't expose its implementation details to the rest of the code, so that you can switch to a different strategy later on.

In fact I'd probably start with a "naive" implementation: do the simplest thing that could possibly work (access the information you need directly, when you need it, but always through the layer), without thinking much about how it could be optimized. You might be surprised how well it works. And if it doesn't work well, at least you have something working to measure and profile, so you get real data to base your decisions on, and something you can inject different strategies into to experiment with.
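To make the idea a bit more concrete, here is a minimal sketch of such a layer with the "naive" strategy behind it. It assumes Java 8 or later, and every name in it (GlobalDataSource, DirectGlobalDataSource, records represented as plain strings, trimming as the "clean-up") is made up for illustration. The real integration layer call is represented by a function you pass in.

```java
import java.util.function.Function;

// All names here are hypothetical; they only illustrate the shape of the layer.

/** The only type the rest of the application is allowed to depend on. */
interface GlobalDataSource {
    String lookup(String id);
}

/** Naive strategy: go to the integration layer on every single call. */
final class DirectGlobalDataSource implements GlobalDataSource {
    // Stand-in for the real integration layer service call (assumption).
    private final Function<String, String> integrationLayer;

    DirectGlobalDataSource(Function<String, String> integrationLayer) {
        this.integrationLayer = integrationLayer;
    }

    @Override
    public String lookup(String id) {
        // No caching at all: fetch the raw record, clean it up, return it.
        String raw = integrationLayer.apply(id);
        return raw == null ? null : raw.trim();
    }
}
```

Callers only ever hold a GlobalDataSource, so a caching or prefetching implementation could later replace DirectGlobalDataSource without touching them.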

Does that help?
 
s penumudi
Ranch Hand
Posts: 113
Thank you Ilja Preuss,

Your suggestion gave me confidence.
My current design is straightforward: get the information, process it, and give it to the client. As you suggested, I will implement it and test it to see how well or badly it performs.

I was wondering how caching works and how it keeps the cached data synchronized with the original data.

Thank you very much
 
Ranch Hand
Posts: 3271
Ilja's advice is good advice. The important part is to "abstract out" the code that actually gathers the data. From your application, make calls to a data access layer, for example a Data Access Object (DAO), that handles the acquisition of your data. To start with, the DAO will probably just make direct calls and might very well be a bottleneck. If it is, you can address that inside the DAO without having to change any of the code that calls it.

As far as caching goes, I think how it's implemented is really up to you. You could periodically pull down large amounts of data and store it in a local database. I know you mentioned that you didn't want to maintain the data in two places but, as long as this local cache is considered "temporary" storage, it requires very little maintenance.

Perhaps you could store the data in memory by making an object that reflects the structure of the data you're pulling down. When someone requests data, you can check to see if you already have it. If not, you can go get it. If you do, simply return it. Depending upon the amount of data you're pulling down, this could turn out to be a real memory hog. Is your application going on a dedicated machine? Will it be running in the background with a lot of other apps?
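A rough sketch of that "check whether we already have it, otherwise go get it" logic, assuming Java 8 or later. InMemoryRecordCache and the string-based records are hypothetical, and the loader function stands in for the real call to the integration layer. Note that nothing here limits the size of the map, which is exactly the memory concern mentioned above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical in-memory cache in front of a slow lookup. */
final class InMemoryRecordCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Stand-in for the integration layer call (assumption).
    private final Function<String, String> loader;

    InMemoryRecordCache(Function<String, String> loader) {
        this.loader = loader;
    }

    /** Returns the cached record, loading it first if it is not cached yet. */
    String get(String id) {
        // computeIfAbsent only invokes the loader when the key is missing.
        return cache.computeIfAbsent(id, loader);
    }

    /** Drops one entry so that the next get() reloads it. */
    void invalidate(String id) {
        cache.remove(id);
    }

    /** Number of records currently held in memory. */
    int size() {
        return cache.size();
    }
}
```

The invalidate method is only a hook: when and how entries get invalidated is the synchronization question, and this sketch does not answer it.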

Like Ilja said, you might not know which solution is best (I'm sure there are plenty that I didn't even touch on) until you actually try it.

Going along with what Ilja said, besides isolating the data access from your code, you might want to add another layer that allows you to change between data access implementations easily (perhaps via an external file). That way, if you build the implementation one way and it doesn't work well, you could build a new implementation (without changing the original one) and easily swap between the two to do performance testing.
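One way that extra layer could look is a small factory that reads the choice of implementation from a properties file. This is only a sketch: the property key record.source, the two placeholder implementations, and the factory name are all invented for the example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** What the application codes against (hypothetical name). */
interface RecordSource {
    String fetch(String id);
}

/** Placeholder for an implementation that calls the services directly. */
final class DirectSource implements RecordSource {
    @Override
    public String fetch(String id) { return "direct:" + id; }
}

/** Placeholder for an implementation that serves from a cache. */
final class CachedSource implements RecordSource {
    @Override
    public String fetch(String id) { return "cached:" + id; }
}

final class RecordSourceFactory {
    // Hypothetical property key; use whatever fits your configuration.
    static final String KEY = "record.source";

    /** Picks an implementation based on a properties-style configuration. */
    static RecordSource fromConfig(Reader config) {
        Properties props = new Properties();
        try {
            props.load(config);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String choice = props.getProperty(KEY, "direct");
        switch (choice) {
            case "direct": return new DirectSource();
            case "cached": return new CachedSource();
            default:
                throw new IllegalArgumentException("Unknown " + KEY + ": " + choice);
        }
    }
}
```

Switching between implementations for a performance test then means editing one line in the external file, for example record.source=cached, and restarting.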

I hope that helps at least a little.
 
s penumudi
Ranch Hand
Posts: 113
Thank you Corey McGlone,

It helped a lot. I will revisit my design and create one more layer that does this processing.

"Is your application going on a dedicated machine? Will it be running in the background with a lot of other apps?"

There could be other applications running on the same machine.
This is a web application, and I have to get the information and process it whenever a user requests this data.

The application server is Oracle 10g AS in a clustered environment.

Thank you very much
 
Corey McGlone
Ranch Hand
Posts: 3271
I had one more thought this morning while I was driving to work. If you'd like to get data into a local cache but you're concerned about hogging too much processing time, you might want to think about using a daemon thread to handle data acquisition in the background.

With a little judicious use of the processor, you could potentially create a thread that is constantly acquiring and updating data without causing significant lag in the user interface of your application (or any other applications on the machine).
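Here is a rough sketch of what such a background refresher might look like in plain Java (Java 8 or later assumed). BackgroundRefreshingCache and the bulk loader are hypothetical names, the loader stands in for the real integration layer call, and the sketch says nothing about clustering or about containers that restrict creating your own threads.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical cache that is refreshed as a whole by a daemon thread. */
final class BackgroundRefreshingCache {
    // Stand-in for a bulk call to the integration layer (assumption).
    private final Supplier<Map<String, String>> bulkLoader;

    // Readers always see a complete snapshot; refresh swaps it in one step.
    private volatile Map<String, String> snapshot = Collections.emptyMap();

    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread t = new Thread(runnable, "global-data-refresher");
            t.setDaemon(true); // does not keep the JVM alive on shutdown
            return t;
        });

    BackgroundRefreshingCache(Supplier<Map<String, String>> bulkLoader) {
        this.bulkLoader = bulkLoader;
    }

    /** Starts periodic refreshes; the first one runs immediately. */
    void start(long period, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(this::refresh, 0, period, unit);
    }

    /** Loads everything and replaces the snapshot. */
    void refresh() {
        try {
            snapshot = Collections.unmodifiableMap(new HashMap<>(bulkLoader.get()));
        } catch (RuntimeException e) {
            // Keep serving the previous snapshot. An exception escaping here
            // would also stop all further scheduled runs.
        }
    }

    /** Never blocks on the integration layer; may return slightly stale data. */
    String get(String id) {
        return snapshot.get(id);
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

User requests only ever read the current snapshot, so a slow refresh delays how fresh the data is, not how fast the page responds.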

It's just a thought, but I figured I'd share.
 
s penumudi
Ranch Hand
Posts: 113
That is a great idea. I really appreciate it.

I am not quite sure how threads work in a clustered environment. Maybe I will have to do some research on this. Also, the software could potentially be implemented using EJB, and I believe managing threads from within EJBs is not recommended, so I am not sure how to go about it.
 
Ranch Hand
Posts: 1873
Well,

I guess the client application that uses the data would need a thread to monitor changes. If that is a web application, you can do it. If it is session beans (I seriously doubt it would be entity beans) that need the cached data and would have to fork threads, then your worry is justified.

Also, asynchronous messaging could be one way to deliver cache notifications to the client app, so that the CacheManager can update the cache to keep it in sync with the global data. Of course, we would end up having a "monitoring thread" on the global data side that periodically "publishes" the changes to "the client channel"...
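Just to illustrate the shape of it, here is a small in-process sketch (Java 8 or later assumed). ChangeChannel is an invented stand-in for the real messaging channel and delivers notifications synchronously, which a real messaging system would not. The CacheManager here simply evicts the changed entry so that the next read reloads it. All names and the string-based records are assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical in-process stand-in for "the client channel". */
final class ChangeChannel {
    private final List<Consumer<String>> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(Consumer<String> subscriber) {
        subscribers.add(subscriber);
    }

    /** Called by the global data side when the record with this id changed. */
    void publish(String changedId) {
        for (Consumer<String> subscriber : subscribers) {
            subscriber.accept(changedId);
        }
    }
}

/** Hypothetical cache manager that evicts entries when told they changed. */
final class CacheManager {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    // Stand-in for the integration layer call (assumption).
    private final Function<String, String> loader;

    CacheManager(Function<String, String> loader, ChangeChannel channel) {
        this.loader = loader;
        // On notification, drop the stale entry; the next get() reloads it.
        channel.subscribe(id -> cache.remove(id));
    }

    String get(String id) {
        return cache.computeIfAbsent(id, loader);
    }
}
```

Evicting on notification keeps the message small (just the id); sending the new value in the message would be the alternative, at the cost of larger messages.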

Thanks
Maulin