I wanted to know what the use cases for memory-mapped files (MMF) are and stumbled upon an article which says that mapping small files wastes RAM because of page-size alignment (4 KB): if you map 5 KB, 8 KB will be reserved and 3 KB will be wasted.
The question is: should we even be concerned about this when we use MMF in Java? Does the JVM provide some kind of defragmentation for it?
In my opinion there is no need at all to use memory-mapped files when you only have to process relatively small files. The reason to use memory-mapped files is to speed up file access or to access large files as if they were in-memory data. But even with small files the wasted memory is only large relative to the (small) file size. Unless you process thousands of these files simultaneously, a few wasted KB of RAM won't be a big problem on most modern systems. Of course it's more of a problem on systems with very little RAM.
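To put numbers on the rounding mentioned in the question, here is a minimal sketch of the arithmetic. The 4096-byte page size is an assumption taken from the question; the real value depends on the OS and hardware, and the class and method names are made up for illustration.

```java
public class PageRounding {
    // Rounds size up to the next multiple of pageSize (pageSize must be > 0).
    static long roundUpToPage(long size, long pageSize) {
        return ((size + pageSize - 1) / pageSize) * pageSize;
    }

    public static void main(String[] args) {
        long pageSize = 4096;        // assumed page size, not queried from the OS
        long fileSize = 5 * 1024;    // the 5 KB file from the question
        long reserved = roundUpToPage(fileSize, pageSize);
        // prints "reserved=8192 wasted=3072"
        System.out.println("reserved=" + reserved + " wasted=" + (reserved - fileSize));
    }
}
```

The waste per mapping is always less than one page, so it only adds up when many small files are mapped at once.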
Actually, I don't know how memory-mapped files are implemented by the JVM. I guess it depends on the underlying operating system. On systems that support memory-mapped files natively, the JVM will probably just use that feature of the OS. If the JVM had to implement it itself on some systems, it would probably work similarly, because it is usually more efficient to manage memory in whole blocks rather than in single bytes, and that inevitably wastes some memory under certain conditions. In the end the wasted memory is a trade-off between memory consumption and performance: memory-mapped files speed things up at the cost of some wasted memory.
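For reference, the standard way to map a file in Java is FileChannel.map, and its documentation says that many details of memory-mapped files depend on the underlying operating system and are therefore unspecified. Below is a minimal sketch of mapping a small file read-only. The class name, the temp file and its contents are only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MapSmallFile {
    // Maps the whole file read-only and decodes its bytes as UTF-8.
    static String readMapped(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return StandardCharsets.UTF_8.decode(buffer).toString();
        }
    }

    // Writes a small temp file, maps it, and returns what was read back.
    static String demo() {
        try {
            Path file = Files.createTempFile("mmf-demo", ".txt");
            file.toFile().deleteOnExit(); // best effort; a live mapping can block deletion on Windows
            Files.write(file, "hello mapped world".getBytes(StandardCharsets.UTF_8));
            return readMapped(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that the mapping stays valid after the channel is closed; it is only released when the buffer itself is garbage-collected.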
Depending on your requirements, memory-mapped files could probably be an alternative for IPC on a single node. But in my opinion there are many other, considerably more elegant solutions for IPC in the Java world. Which solution is really a good one depends, of course, on your needs.
I would say performance is the only need in this situation.
Another advantage of memory-mapped files is surely that they allow you to access a very large chunk of data.
Do you mean message-based IPC? That should introduce some overhead, shouldn't it?
Yes, a solution based on messaging comes to mind (of course, only if it fits your needs). Message-oriented middleware (MOM) is optimized for high message volumes, and popular products like ActiveMQ allow for a lot of tuning regarding performance, reliability, availability, guaranteed delivery etc. Additionally, these tools are often capable of running in a distributed mode in a cluster of message brokers. This enables high availability and allows you to scale horizontally, i.e. to simply add more nodes to increase throughput when necessary. Of course messaging isn't a perfect fit for all situations. For example, you wouldn't use it to share gigabytes of data in a single message. Another thing to consider is that messaging is inherently asynchronous, so your application has to be designed accordingly. That may or may not be an option, but at the very least it requires a different way of thinking about design.
A nice thing about ActiveMQ is that you can easily use it in embedded mode inside your application as long as you don't really need a separate message broker or a cluster of brokers.
Regarding overhead, I don't see real disadvantages for a messaging-based solution. As I said, these tools are highly optimized for speed and throughput and can be tuned further through configuration. If you implemented your own kind of IPC with memory-mapped files, you would still have to read from and write to the mapped file and manage concurrent access to it from different processes. This is surely no easy task and will bring some overhead, too. That doesn't mean it is impossible to outperform existing, proven solutions with a custom one, but it may not be that easy to do, and you should ask yourself whether you think you can achieve it and whether it's worth the trouble.
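To give an idea of what a hand-rolled approach involves, here is a rough sketch of a one-slot "mailbox" on top of a memory-mapped file, using FileLock to coordinate access. Everything about the layout (the fixed 4096-byte size, the length prefix, the class and method names) is made up for illustration, and both sides run in a single process here as a stand-in for two processes. File locks are only advisory on many platforms, and a real solution would also need signalling, error handling and a way to handle more than one message.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedMailbox {
    static final int SIZE = 4096; // hypothetical fixed mailbox size

    // Writer side: stores a length-prefixed UTF-8 message at offset 0.
    static void write(Path file, String message) throws IOException {
        byte[] payload = message.getBytes(StandardCharsets.UTF_8);
        if (payload.length > SIZE - Integer.BYTES) {
            throw new IllegalArgumentException("message too large");
        }
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                     StandardOpenOption.READ, StandardOpenOption.WRITE);
             FileLock lock = ch.lock()) { // exclusive lock on the whole file
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, SIZE);
            buf.putInt(payload.length);
            buf.put(payload);
            buf.force(); // ask the OS to write the changes back to the file
        }
    }

    // Reader side: takes a shared lock and reads the message back.
    static String read(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ);
             FileLock lock = ch.lock(0, SIZE, true)) { // shared lock
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, SIZE);
            byte[] payload = new byte[buf.getInt()];
            buf.get(payload);
            return new String(payload, StandardCharsets.UTF_8);
        }
    }

    // Writes a message and reads it back through a second mapping.
    static String roundTrip(String message) {
        try {
            Path file = Files.createTempFile("mailbox", ".bin");
            file.toFile().deleteOnExit(); // best effort cleanup
            write(file, message);
            return read(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("ping"));
    }
}
```

Even this toy version has to deal with locking, sizing and message framing, which is the kind of work a message broker already does for you.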
That said, I want to add that messaging of course is not the only solution to integrate different applications/processes. You should choose well depending on your needs and requirements!