Forgive me if this is not the right forum for this post - please direct me to the appropriate forum and I'll post there.
We need an off-heap HashMap<String, Object> that can potentially store an unlimited number of entries. The value Object to be stored is variable in size. The HashMap is needed only for the duration of processing an input file. It does not need to be persisted.
We can make our current design (a tree-based data structure) work with a 64-bit JVM, a lot of physical memory, and a heap of at least 8 GB, but many of the target environments don't have these characteristics. For some test cases we're still getting killed by garbage collection.
I think a hashmap backed by a memory-mapped file might be a good solution, but I haven't found one that works yet. As an added point of information, we're looking for a relatively lightweight solution: just adding a few jar files to the application rather than depending on something as complex as Hadoop or Apache Ignite.
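To make the memory-mapped idea concrete, here is a rough sketch of the kind of thing I have in mind. It is only an illustration under several assumptions, not something we have tested at scale: the class name MappedValueStore and its methods are made up for this post, values are assumed to be already serialized to byte[], the key-to-offset index still lives on the heap, space is never reclaimed, and a single mapping is used, so capacity stays below 2 GB.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: values go into a memory-mapped temp file, only key -> offset stays on heap. */
final class MappedValueStore implements AutoCloseable {
    private final FileChannel channel;
    private final MappedByteBuffer buffer;                         // one mapping, so capacity < 2 GB
    private final Map<String, Integer> offsets = new HashMap<>();  // on-heap index (assumption)

    MappedValueStore(int capacityBytes) throws IOException {
        Path file = Files.createTempFile("mapped-store", ".bin");
        file.toFile().deleteOnExit();                              // best-effort cleanup, nothing is persisted
        channel = FileChannel.open(file, StandardOpenOption.READ, StandardOpenOption.WRITE);
        buffer = channel.map(FileChannel.MapMode.READ_WRITE, 0, capacityBytes);
    }

    /** Appends a length-prefixed record; a repeated key just points at the newest record. */
    void put(String key, byte[] value) {
        if (buffer.remaining() < Integer.BYTES + value.length) {
            throw new IllegalStateException("store is full");
        }
        offsets.put(key, buffer.position());
        buffer.putInt(value.length);
        buffer.put(value);
    }

    /** Returns a copy of the stored bytes, or null if the key is unknown. */
    byte[] get(String key) {
        Integer offset = offsets.get(key);
        if (offset == null) {
            return null;
        }
        int length = buffer.getInt(offset);                        // absolute read, write position untouched
        byte[] value = new byte[length];
        ByteBuffer view = buffer.duplicate();                      // independent position for reading
        view.position(offset + Integer.BYTES);
        view.get(value);
        return value;
    }

    int size() {
        return offsets.size();
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

Something usable for our case would need several mapped segments to get past the 2 GB limit of a single MappedByteBuffer, and would have to move the index off the heap too, which is exactly the part we would rather not write ourselves.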
Here's what we've looked at, and why each hasn't worked:
EhCache - fine, up to a point. Very large input requires a Terracotta license, and there are issues around this (redistribution).
MapDB - Version 1.0.x was mostly OK, but for the more extreme test cases we were running it still consumed most of the machine's memory (memory leaks, I assume) and was not acceptable. Version 2 had performance degradation, and V3 isn't ready yet.
OpenHFT ChronicleMap (and predecessors) - limited to approximately 4GB on Windows, and must be sized before creating (i.e., not unlimited).
clapper.org FileHashMap - Very low memory profile, unlimited size, but slow. Also got killed by garbage collection with large test cases.
JCS - Would not support the large test case. Also, it is designed more as a caching solution than what we need - there's a chance that older data will be pushed out of the hashmap. Not acceptable.
OrientDB - very slow in initial testing.
Does anyone have any suggestions for an open source solution for this problem? We really don't want to write our own implementation, but if we can't find something suitable we may have to go that route.
Thanks!