It is likely that your biggest time sink is the network transfer of those million objects. Or running out of memory. Do you have enough memory allocated to your JVM to hold all this data?
What are you doing with the data? If writing it to a file, you don't need to hold it all in memory at once. Get a batch, write to file, get another batch, etc. And if showing to the user, you certainly don't need it all in memory. That is what paging is for!
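The batch-then-write idea can be sketched in plain Java. This is only an illustration: the `Iterator<String>` stands in for whatever streams your records (a wrapped ResultSet, say), and `BATCH_SIZE` is an arbitrary assumption — the point is that only one batch is ever held in memory.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Iterator;
import java.util.List;

public class BatchWriter {
    static final int BATCH_SIZE = 1000; // arbitrary; tune to your memory budget

    // Writes rows to a file in batches, so at most BATCH_SIZE rows are
    // buffered at once. The Iterator is a stand-in for the real row source.
    static long writeInBatches(Iterator<String> rows, Path out) throws IOException {
        long written = 0;
        try (BufferedWriter w = Files.newBufferedWriter(out)) {
            StringBuilder batch = new StringBuilder();
            int inBatch = 0;
            while (rows.hasNext()) {
                batch.append(rows.next()).append('\n');
                if (++inBatch == BATCH_SIZE) {
                    w.write(batch.toString()); // flush this batch to disk
                    batch.setLength(0);
                    written += inBatch;
                    inBatch = 0;
                }
            }
            w.write(batch.toString());         // remaining partial batch
            written += inBatch;
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path out = Files.createTempFile("rows", ".txt");
        Iterator<String> rows = List.of("a", "b", "c").iterator(); // mock source
        System.out.println(writeInBatches(rows, out));
        Files.delete(out);
    }
}
```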
While starting up the server (our own TCP/IP server), data from some tables is fetched and stored in a list/Vector.
It acts as a virtual table throughout the user session, so the application does not hit the DB after server initialization.
I know there are better implementations available, but we need to use this approach because the architecture demands it (it is a pretty old application).
We don't have time to re-implement this using JBoss and EJB.
It is a Java Swing application (coded in 2002).
I'd assume that the server is started and then runs for days, so the 90 second startup should not matter that much. I mean, it's still under two minutes!
You should also think carefully about storing the data in a list. If you search for records by simply iterating over the list, a DB call might actually be faster, since the database can use an index to find the record. I hope you'll at least be using binary search on a sorted list. Perhaps you should use a HashMap, though.
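A quick sketch of the difference: looking up a record by scanning a list is O(n), while a HashMap keyed by id is O(1). The `Rec` class and the ids below are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LookupDemo {
    static class Rec {
        final int id;
        final String name;
        Rec(int id, String name) { this.id = id; this.name = name; }
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        List<Rec> list = new ArrayList<>(n);
        Map<Integer, Rec> byId = new HashMap<>(n * 2);
        for (int i = 0; i < n; i++) {
            Rec r = new Rec(i, "rec" + i);
            list.add(r);
            byId.put(r.id, r);
        }

        // Linear scan: worst case walks all n entries (id n-1 is last).
        long t0 = System.nanoTime();
        Rec found = null;
        for (Rec r : list) {
            if (r.id == n - 1) { found = r; break; }
        }
        long scanNanos = System.nanoTime() - t0;

        // HashMap: a single hashed lookup.
        t0 = System.nanoTime();
        Rec found2 = byId.get(n - 1);
        long mapNanos = System.nanoTime() - t0;

        System.out.println(found.name.equals(found2.name));
        System.out.println(scanNanos > mapNanos);
    }
}
```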
There is often low-hanging fruit here: the fetch size (see java.sql.Statement#setFetchSize(int)). It is possible you're using a fetch size that is too low. Larger fetch sizes obviously result in fewer database round trips, but can have other undesired effects. A fetch size between 100 and 500 might work best; you should experiment with various fetch sizes (even outside that range) to see which gives you the best results.
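Setting the fetch size is a one-liner on the Statement. A minimal sketch, assuming you already have a Connection and a query (both hypothetical here — no database is connected in this snippet, so main only confirms the API shape):

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class FetchSizeLoader {
    // Streams all rows of a query with a larger fetch size, so the driver
    // pulls e.g. 500 rows per network round trip instead of its default
    // (often as low as 10). Connection and SQL are assumptions.
    static List<String> loadFirstColumn(Connection con, String sql, int fetchSize)
            throws SQLException {
        List<String> values = new ArrayList<>();
        try (Statement stmt = con.createStatement()) {
            stmt.setFetchSize(fetchSize); // hint: rows fetched per round trip
            try (ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    values.add(rs.getString(1));
                }
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // No database here; this only demonstrates that the API compiles.
        System.out.println("fetch-size sketch ready");
    }
}
```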
If this does not help, you need to do the hard work: find the bottleneck. It could be the network, the query performance itself, or perhaps memory on the server (hopefully not, because in that case the server wouldn't have much room left to do anything other than hold the data). Or it could even be the IO in the database.
If it is a complex query, perhaps it can be tuned for performance. You might gauge this by storing the results of the query in a table and modifying the code to fetch records (as a simple select *) from that table. This would give you an idea of how much time you could save by optimizing the query. (Note: it does not matter that stmt.executeQuery(sql) returns in less than a second. The database retrieves the data lazily, as it is fetched by the client, and for this amount of data that is where the bulk of the processing happens.)
You could also create a million mock objects to see whether object creation itself is the problem: fetch just one representative record from the database and create a million instances from it, without calling next().
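The mock-object experiment might look like this — the `Row` class and the template values are made up, standing in for one representative record fetched from the database. It isolates object-creation and memory cost from network/DB cost:

```java
import java.util.ArrayList;
import java.util.List;

public class MockLoadTest {
    static class Row {
        final int id;
        final String name;
        Row(int id, String name) { this.id = id; this.name = name; }
    }

    public static void main(String[] args) {
        // Pretend this single row came from the database (one rs.next()).
        Row template = new Row(42, "representative record");

        List<Row> rows = new ArrayList<>(1_000_000);
        for (int i = 0; i < 1_000_000; i++) {
            // Copy the template instead of fetching another record.
            // Wrap this loop with System.nanoTime() to time it in isolation.
            rows.add(new Row(i, template.name));
        }

        System.out.println(rows.size());
    }
}
```

If this loop finishes in a fraction of your 90 seconds, the bottleneck is almost certainly in the fetch/network path, not in building the objects.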
Some databases/JDBC drivers offer tracing and diagnostic tools; these might reveal whether the problem lies in the network.