
Keeping 4 lakh DB records in an ArrayList of plain Java bean or VO objects

venkata vinay bysani
Greenhorn

Joined: Feb 09, 2010
Posts: 25
Dear Friends,

Is it a good way to keep all 4 lakh records from the DB in an ArrayList? Will it affect performance? My requirement is that I need to migrate the data to a content repository, so I need to send the records one by one by accessing the plain Java beans or VOs held in the ArrayList.

Please let me know which is the efficient way: read 1,000 records at a time, process them, and then read the next 1,000, or do all 4 lakh at once? Which way will be more performance-oriented?

Please help me ASAP. Thanks in advance.
Deepak Bala
Bartender

Joined: Feb 24, 2006
Posts: 6662
    

Read them X records at a time and process them. 400,000 records loaded into an ArrayList can become bulky and inefficient. Trying to gulp down all the data at once is usually not the way to go.
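For example, here is a minimal sketch of streaming the rows in chunks with plain JDBC instead of building one giant list. The table, column names, and the ContentRepository sink are made up for illustration, and exact fetch-size behaviour varies by driver:

[code]
import java.sql.*;

public class ChunkedMigration {

    // Hypothetical stand-in for whatever content repository API you use
    interface ContentRepository {
        void store(String id, String payload);
    }

    public static void migrate(Connection con, ContentRepository repo) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                "SELECT id, payload FROM source_table")) {
            // Hint the driver to pull rows from the server 1,000 at a time
            // rather than materialising the whole result set in memory.
            ps.setFetchSize(1000);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // Each bean/VO is short-lived: build it, send it, let it be GC'd.
                    repo.store(rs.getString("id"), rs.getString("payload"));
                }
            }
        }
    }
}
[/code]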


SCJP 6 articles - SCJP 5/6 mock exams - More SCJP Mocks
Nitesh Kant
Bartender

Joined: Feb 25, 2007
Posts: 1638

venkata vinay bysani wrote: Which way will be more performance-oriented?


Performance is very subjective; it depends on what kind of processing you are doing.
If you only have to iterate over the list, processing each record one by one, and you do not have memory constraints, then there is nothing like keeping it simple.
However, if you have to search for elements in the list and massage the data, then a list with nearly half a million records is an absolute no.
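To illustrate the difference, a toy sketch: repeated lookups against a plain list scan every element, while indexing the records into a HashMap once makes each subsequent lookup cheap. The keys and values here are invented:

[code]
import java.util.*;

public class LookupSketch {
    public static void main(String[] args) {
        List<String[]> records = Arrays.asList(
                new String[]{"k1", "v1"}, new String[]{"k2", "v2"});

        // O(n) per lookup: every search walks the list.
        String slow = null;
        for (String[] r : records) {
            if (r[0].equals("k2")) { slow = r[1]; break; }
        }

        // Build an index once; each lookup is then O(1) on average.
        Map<String, String> byKey = new HashMap<>();
        for (String[] r : records) byKey.put(r[0], r[1]);
        String fast = byKey.get("k2");

        System.out.println(slow + " / " + fast);
    }
}
[/code]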


apigee, a better way to API!
Yosi Hendarsjah
Ranch Hand

Joined: Oct 02, 2003
Posts: 164
venkata vinay bysani wrote: Is it a good way to keep all 4 lakh records

Please use English words. Not all the members of this forum know what a lakh is.
Raghavendra Venkata
Greenhorn

Joined: Oct 17, 2010
Posts: 1
Hi Vinay,

Why can't you have a PL/SQL procedure block that does this processing? => Option 1
(The database is intelligent enough to handle the performance; again, it depends on how smart your PL/SQL block is.)
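For Option 1, triggering such a procedure from Java is essentially a one-liner with CallableStatement. A minimal sketch, where the procedure name and its batch-size parameter are hypothetical:

[code]
import java.sql.*;

public class CallMigrationProc {

    // Option 1: push the heavy lifting into a (hypothetical) PL/SQL
    // procedure and just trigger it from the application.
    public static void runMigration(Connection con) throws SQLException {
        try (CallableStatement cs = con.prepareCall("{call migrate_to_repository(?)}")) {
            cs.setInt(1, 1000); // hypothetical batch-size argument
            cs.execute();
        }
    }
}
[/code]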

Always remember that, as a developer, you should not interact with your DB directly => use an ORM (for instance Hibernate or iBATIS, because they take care of performance to some extent) to interact with your DB => Option 2

One thing I would like to know: why do you need access to 400,000 DB records?

Finally, performance is a guideline, not a target.
Paul Sturrock
Bartender

Joined: Apr 14, 2004
Posts: 10336


Raghavendra Venkata wrote: Why can't you have a procedure block that does this processing? => Option 1
(The database is intelligent enough to handle the performance; again, it depends on how smart your PL/SQL block is.)

True, though the downside of this approach is that you push the work into the database layer, which can be expensive to distribute, instead of the application layer, which can be cheap to distribute.

Raghavendra Venkata wrote: Always remember that, as a developer, you should not interact with your DB directly => use an ORM (for instance Hibernate or iBATIS, because they take care of performance to some extent) to interact with your DB => Option 2

Using an ORM for such large bulk operations is a bad idea. ORMs are not suitable for bulk data manipulation.
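If you do stay in the application layer, plain JDBC batching is usually the better tool for bulk writes. A minimal sketch; the table and column names are invented:

[code]
import java.sql.*;
import java.util.List;

public class BulkInsert {

    // Plain JDBC batching: avoids the per-entity overhead (session cache,
    // dirty checking) that an ORM would add across 400,000 rows.
    public static void insertAll(Connection con, List<String[]> rows) throws SQLException {
        con.setAutoCommit(false);
        try (PreparedStatement ps = con.prepareStatement(
                "INSERT INTO target_table (id, payload) VALUES (?, ?)")) {
            int count = 0;
            for (String[] row : rows) {
                ps.setString(1, row[0]);
                ps.setString(2, row[1]);
                ps.addBatch();
                if (++count % 1000 == 0) {
                    ps.executeBatch(); // flush every 1,000 rows
                }
            }
            ps.executeBatch(); // flush the remainder
        }
        con.commit();
    }
}
[/code]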


JavaRanch FAQ HowToAskQuestionsOnJavaRanch
Ankit Kale
Greenhorn

Joined: Sep 01, 2010
Posts: 1
Processing n records at a time and then purging them to the DB is advised, since holding so much data in memory may eat up all the available memory and lead to an application/server crash. One more thing: whenever this type of processing is performed, make sure it runs during off-peak load.


Regards,

Ankit Khare
Associate Consultant
Oracle Financial Software Services Ltd.
jatan bhavsar
Ranch Hand

Joined: Jul 23, 2008
Posts: 297

Hi all,

It's a very old thread, but I'm just curious: how can I increase performance with 400,000 records? Suppose I am using joins to search the records and multiple users are requesting the same data. How can I implement a caching mechanism that will be efficient? I am not using an ORM.

Regards
Jatan
Winston Gutkowski
Bartender

Joined: Mar 17, 2011
Posts: 8427
    

jatan bhavsar wrote: It's a very old thread, but I'm just curious: how can I increase performance with 400,000 records? Suppose I am using joins to search the records and multiple users are requesting the same data. How can I implement a caching mechanism that will be efficient? I am not using an ORM.

Simple rule: almost invariably, do all the complex stuff in the database, not in Java. That's exactly what databases were designed for.

I'm afraid this is a Java forum, not a database one; but just off the top of my head: if you find you're having to do tons of joins or inefficient searches, it's usually an indication that:
(a) Your design needs an overhaul.
(b) (Ugh) Your database needs an overhaul.
(c) (Easiest, but possibly a 'band-aid') Your db needs some new - or different - indexes.
If your db has an EXPLAIN PLAN statement, it may help you analyse what's going on so you can make that decision; but I'd definitely tackle point (a) first.
Do you really need all these joins?
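On Oracle, for instance, you can even pull the plan back through JDBC while you're testing. A rough sketch; the query you pass in is whatever statement you're investigating, and other databases expose EXPLAIN differently:

[code]
import java.sql.*;

public class ShowPlan {

    // Oracle-specific sketch; MySQL/PostgreSQL use plain "EXPLAIN <query>".
    public static void explain(Connection con, String sql) throws SQLException {
        try (Statement st = con.createStatement()) {
            st.execute("EXPLAIN PLAN FOR " + sql);
            try (ResultSet rs = st.executeQuery(
                    "SELECT plan_table_output FROM TABLE(DBMS_XPLAN.DISPLAY())")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1)); // one line of the plan
                }
            }
        }
    }
}
[/code]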

Winston


Isn't it funny how there's always time and money enough to do it WRONG?
Articles by Winston can be found here
chris webster
Bartender

Joined: Mar 01, 2009
Posts: 1873
    

jatan bhavsar wrote: Hi all,

It's a very old thread, but I'm just curious: how can I increase performance with 400,000 records? Suppose I am using joins to search the records and multiple users are requesting the same data. How can I implement a caching mechanism that will be efficient? I am not using an ORM.

Regards
Jatan

Winston's covered the key points, but it sounds like you need to re-think your query strategy.

First, build and test your SQL in the database, so you can explore issues like performance without all the overhead of Java-DB interfaces, etc. Make your SQL as efficient as you can within the database: look at all those joins and see if they are really necessary, or whether you could tweak your table structures to make them easier, e.g. denormalising commonly used columns from parent tables into the children (but remember you will need to maintain these denormalised values when the parent record changes).

If you have lots of similar queries, consider defining a database view to encapsulate the common joins etc., so your individual queries against the view will be simpler. But you still need to make sure the underlying view SQL is as efficient as possible. Also, make sure you do NOT do lots of DML across the network: fetching large volumes of data and then writing it back again is very inefficient. Instead, do big INSERTs/UPDATEs as single SQL statements within the database where possible, or use stored procedures if necessary.

Make sure you use all the available database tools to make your queries as efficient as possible - EXPLAIN PLAN, use of indexes, denormalisation, partitioning, etc. Partitioning might make a big difference if your users tend to query against particular logical sets of data e.g. for a particular month, as you can partition your table on the relevant column so that each month's data (for example) is physically separate, so any queries for e.g. January only have to search through January's partition instead of the whole table. This will depend on your RDBMS as these features are usually platform-specific.

If you don't know how to do all this stuff, get your DBA to help. If they don't know how to do this either, you shouldn't be running an application with millions of rows in the first place.

Now you have some SQL queries and you want to run them from your Java application. Again, make sure you do everything you can to make this as efficient as possible. Use PreparedStatement and bind variables to make your SQL both secure and re-usable (the SQL engine can re-use parsed SQL with bind variables), and use batches where appropriate. Apply as much filtering as possible within the database before you fetch any records back. Reading data out of the database across the network is slow, so only read the data you really want - don't fetch everything and try to filter it in Java.
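As a small illustration of bind variables plus database-side filtering (the table and column here are invented):

[code]
import java.sql.*;

public class BoundQuery {

    // The SQL text stays constant, so the database can re-use the parsed
    // statement; the filter runs in the WHERE clause, not in Java.
    public static int countOrders(Connection con, String month) throws SQLException {
        String sql = "SELECT COUNT(*) FROM orders WHERE order_month = ?"; // example table
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, month);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next();
                return rs.getInt(1);
            }
        }
    }
}
[/code]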

Caching query results is a problem - you'll need some proper Java experts to advise on that (i.e. not me!) - but remember that the DB has its own caching mechanisms as well. Again, talk to your DBA to make sure you are using these effectively.
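Just to show the shape of the idea, a naive application-side cache might look like the sketch below. A real one would need eviction and a TTL (think Ehcache or similar); this only demonstrates computing a result once on a miss and sharing it afterwards:

[code]
import java.util.List;
import java.util.concurrent.*;

public class QueryCache {

    // Naive sketch only: results keyed by the query parameters, computed
    // once on a cache miss and returned to all subsequent callers.
    private final ConcurrentMap<String, List<String>> cache = new ConcurrentHashMap<>();

    public List<String> results(String key, Callable<List<String>> loader) {
        return cache.computeIfAbsent(key, k -> {
            try {
                return loader.call(); // run the actual query only on a miss
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
    }
}
[/code]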

Use a profiling tool to find out where all the time is being spent. No point trying to speed up your SQL queries if most of the time is being wasted on network traffic, for example.

If your queries are still too slow, consider separating the "read" data from the "write" data. Your data model may be optimised for writing transaction data into a particular set of tables, but those tables may not be the best structure for querying large volumes of data later on. You could look at building snapshots (in Oracle these are "materialised views") to hold commonly queried data in a more suitable structure, then query against the snapshots instead of the data tables, and refresh the snapshots e.g. once a day from your operational (transaction) data store. This is similar to the basic principle behind data warehouses/data marts (on a much larger scale), but you can take the same approach on a smaller scale if it will help your application. This will also depend on your query requirements.

Finally, maybe think about whether you might be able to use some of the performance features of an ORM on your application.






No more Blub for me, thank you, Vicar.
 