I want a solution that performs in microseconds. The java.io.RandomAccessFile API is what I am thinking of using; its reads and writes complete in microseconds, whereas database operations are expensive, taking milliseconds.
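A minimal sketch of what that might look like, assuming fixed-size 100-byte records addressed by slot index (the class and constant names here are mine, purely for illustration):

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Illustrative sketch: fixed-size records stored in a RandomAccessFile,
// addressed by slot number. seek() gives O(1) positioning, so each
// read or write is a single small I/O.
public class RecordFile {
    static final int RECORD_SIZE = 100;   // assumed fixed record length
    private final RandomAccessFile file;

    public RecordFile(String path) throws IOException {
        this.file = new RandomAccessFile(path, "rw");
    }

    public void write(long slot, byte[] record) throws IOException {
        if (record.length != RECORD_SIZE) {
            throw new IllegalArgumentException("record must be " + RECORD_SIZE + " bytes");
        }
        file.seek(slot * RECORD_SIZE);    // jump straight to the record's offset
        file.write(record);
    }

    public byte[] read(long slot) throws IOException {
        byte[] buf = new byte[RECORD_SIZE];
        file.seek(slot * RECORD_SIZE);
        file.readFully(buf);
        return buf;
    }

    public void close() throws IOException {
        file.close();
    }
}
```

Note that the OS page cache usually absorbs these small reads and writes, which is where the microsecond latencies come from; durability after a crash still depends on when the data actually reaches the disk.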
Is the only requirement that the collection should survive a crash? If so, why not serialize the collection and read it once at load? You could probably implement Externalizable to control how your collection is serialized. Or does some other process also need to modify the file? If not, then not every read needs to translate into an I/O operation (read once and cache), whereas every write should. Does this help?
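As a sketch of the read-once-and-cache idea (the class name and the choice of a HashMap here are mine, just to make it concrete): deserialize the collection once at startup, keep it in memory, and rewrite the serialized snapshot after each mutation, so reads never touch the disk.

```java
import java.io.*;
import java.util.HashMap;

// Illustrative helper: load the collection once at startup, rewrite the
// whole serialized form after each mutation. All reads are served from
// the in-memory copy.
public class SnapshotStore {
    @SuppressWarnings("unchecked")
    public static HashMap<String, String> load(File f) throws IOException, ClassNotFoundException {
        if (!f.exists()) {
            return new HashMap<>();           // nothing persisted yet
        }
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(f))) {
            return (HashMap<String, String>) in.readObject();
        }
    }

    public static void save(File f, HashMap<String, String> map) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(f))) {
            out.writeObject(map);             // full-snapshot write on each mutation
        }
    }
}
```

The obvious cost is that every write rewrites the entire snapshot, so this only pays off when the collection is small or writes are rare; Externalizable would let you tune the on-disk format beyond default serialization.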
That article Joe linked on indexed files is neat. It reminds me of mainframe VSAM, though there I think the index was distributed across the, um, Control Areas and Control Intervals. Puts could be very fast except when one of those areas got full and required a split. I still have some routines that compute optimal free space based on record sizes, insert rates, and mean time between reorgs.
I'd worry about schemes like this having occasional performance hits that could make things feel uneven. And it won't be long before your code is complex enough to be slower than a real database.
Is there any opportunity to update memory in real time and update the file store asynchronously? If recovery from file is rare, say only after an uncommon disaster, consider a transaction log file that you could re-apply to the last good backup. That could be a very fast append-only write.
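A minimal sketch of such a transaction log, assuming one serialized operation per line (the class name, the string-based operation format, and the Consumer-based replay are my own illustrative choices):

```java
import java.io.*;
import java.util.function.Consumer;

// Illustrative sketch of an append-only transaction log: every mutation is
// appended as one line; on recovery, the log is replayed in order against
// the last good backup of the in-memory state.
public class TxLog {
    private final BufferedWriter out;

    public TxLog(File f) throws IOException {
        out = new BufferedWriter(new FileWriter(f, true));  // open in append mode
    }

    public void append(String op) throws IOException {
        out.write(op);
        out.newLine();
        out.flush();  // push to the OS; a FileDescriptor sync would be needed for true durability
    }

    // Replay every logged operation, in order, against recovered state.
    public static void replay(File f, Consumer<String> apply) throws IOException {
        try (BufferedReader in = new BufferedReader(new FileReader(f))) {
            String line;
            while ((line = in.readLine()) != null) {
                apply.accept(line);
            }
        }
    }
}
```

Appends to the end of a file are about the cheapest write pattern a filesystem offers, which is why this approach can keep the hot path fast while still allowing recovery.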
A good question is never answered. It is not a bolt to be tightened into place but a seed to be planted and to bear more seed toward the hope of greening the landscape of the idea. John Ciardi
Raees, what are the other requirements on the collection? Is it likely to be 10 items, or 10,000? Or 10,000,000,000? Will the contents be homogeneous or heterogeneous? Will the items in the collection be changed often? Will others access the backing file? For update or read? Concurrently?
More details are needed to make your implementation decision.
Bill Shirley - bshirley - frazerbilt.com
if (Posts < 30) you.read( JavaRanchFAQ);
Answering several of the questions in this thread: yes, we used the Tangosol caching product. It is excellent for in-memory caching, but not so good at synchronizing to disk. I will look at the JCS product.
The next answer is that we want to write up to 10 MB to a file, and no more than that, because of the performance hit we have observed with files of 10 MB or larger. At 100 bytes per record, this will hold approximately 100,000 records.
No, the content of a record will not be updated, but inserts and deletes will occur for 100% of the records. The content will be homogeneous, and other threads in the JVM should be able to access it (access needs to be synchronized).
Yes, we also don't want to go beyond 100,000 records per collection on a file. As stated earlier, this file-based collection is intermediate storage for records until they get pushed to a database. The arrangement exists purely for performance/latency when we have bursts of incoming records. A series of threads will read from this collection, persist the records to the database, and remove them from the collection.
If the events exceed this number, then we will bypass the collection and write directly to the database, taking the performance hit. I believe this will never happen, because we will have enough threads reading from the file and batch-updating the database.
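The arrangement described above (bounded buffer, drainer threads, direct-to-database overflow) can be sketched roughly as follows. This is only an in-memory illustration using a BlockingQueue; all names are mine, and the file-backed persistence and the actual database calls are left out:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative sketch: a bounded buffer capped at 100,000 records.
// Producers offer records; when the buffer is full, the caller falls
// back to a direct database write (the stated bypass). Drainer threads
// take batches out for batch inserts into the database.
public class BurstBuffer {
    private static final int CAPACITY = 100_000;  // the stated per-collection cap
    private final BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(CAPACITY);

    /** Returns true if buffered; false means the caller should write straight to the DB. */
    public boolean submit(byte[] record) {
        return queue.offer(record);   // non-blocking; fails fast when full
    }

    /** Drain up to batchSize records for one batch insert by a drainer thread. */
    public List<byte[]> drainBatch(int batchSize) {
        List<byte[]> batch = new ArrayList<>(batchSize);
        queue.drainTo(batch, batchSize);
        return batch;
    }
}
```

Since ArrayBlockingQueue is already thread-safe, the producer threads and the database-drainer threads need no extra synchronization around the buffer itself; the file-backed copy would sit behind this as the crash-recovery layer.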
[ December 18, 2007: Message edited by: Raees Uzhunnan ]
You might want to take a look at http://www.prevayler.org for an alternative persistence solution. I don't have any experience with it, but it sounds interesting, and it might at least give you some new ideas.
The soul is dyed the color of its thoughts. Think only on those things that are in line with your principles and can bear the light of day. The content of your character is your choice. Day by day, what you do is who you become. Your integrity is your destiny - it is the light that guides your way. - Heraclitus