We have a web application based on Java/JEE. Some of its pages are AJAX based and very similar to KAKAY.com pages to filter results on several attributes.
For e.g. if we display 1500 products and their 50 different attributes along with their values (like color, size, weight, price), user can filter on a color (which may reduce the no. of products to 1300 for e.g. and disable a lot of mismatching attribute values). User can go on to filter on other attributes step by step until he gets to the refined set of products he wants. This is done on server side with Java bitmap objects for quick calculations.
Some of these pages are really heavy on server side because bitmap size is huge (like 50 mb file on serializing bitmap objects to disk). For each user session, this 50mb is cloned to maintain state of bitmaps specific to that session. These pages with a single user perform to acceptable level. As we increase the number of users to 5-7 performance starts degrading. We tested with 7 simultaneous users accessing the same page, then making selections on same attributes. Server side time to process 1 request increases to 15 secs per session with 7 sessions from 4 secs with a single session. I profiled with JConsole yesterday and saw that, when we make the selection there are high spikes in Heap memory utilization (goes from 250mb to 450mb for e.g. and so on ). Spikes come down gradually in steps (like every 10 secs it comes down by 25 mb), which I am assuming means we do not have memory leaks or at least major memory leaks. But if we keep on pounding the system with 7 sessions and make selection over selection, performance remains degraded and spikes remain high.
1. The sudden high spikes I see in Heap memory utilization when all users make parallel selection and gradually they come down - from this my interpretation is the spikes are for allocation of "objects local to methods which are eventually collected by garbage collector in short intervals". So I should probably look into software to see why so many local objects are being created for each server side iteration (instead of long time surviving session objects or servlet context objects). Can you help me to correct/confirm my understanding?
2. I am not sure if the heap memory is used by JVM for anything else other than application objects. These high spikes - can they be JVM requirement to carry through the current set of requests for its own purpose rather than our application's objects? I know to understand this I should go a step further to profile to see which exact objects are created during spikes. But what I have collected so far, can you give me a hint or share your own thoughts/experiences?
3. Generally I see forums discussing JVM memory settings in the range of 256mb to 2gb. My question is how many parallel users can the applications like Amazon or Kayak can support with 2GB of RAM on a single machine? How are they managing this situation? I know we can bring more servers and load balance but what about per machine with 2GB max heap size, how many max parallel users all pounding the system?
I take it that your data structures are cached as a number of BitSet (or its equivalent):
To my mind, a simple implementation would be:
#1 For each user create a "selection Bitset" with number of products(=1500) bits
#2 User selects type=shoe
#3 System sets user's selection Bitset = Bitset for "type=shoe"
#4 User now selects color=g
#5 System does user's selection Bitset = (User's selection Bitset) AND (Bitset for "color=g")
So it seems one bitset is enough for each user. Why is it needed to clone the entire set of bitsets for each user, or did I misunderstand something?
System has to update other values also when user makes a selection. So if user selects color=g, and this color goes only with type=shoe and shirt, system has to disable type=mobile. To determine this system needs to AND with every bitmap (for each value) and disable what results in all 0s. Not only this, system also needs to update state of bitmaps of all values (and keep this state for future selections by same user). Suppose type=shoe goes with products 1,10,20 and 30. When user selects color=g, and it goes with type=shoe but only for products 10 and 20, bitmap of type=shoe should be ANDed to remove set bits at position 1 and 30. And this state for this user must be maintained. Next time if user makes a selection, we AND with updated bitmaps to continue forward.
There are some alternative implementations and we are considering them for long term, but as I said earlier I am looking for answers to these questions asap, if there can be any help in understanding these issues better.
Search can be memory intensive. If you need a better algorithm, you might want to look at "reverse indexes", or maybe use something like Lucene. However, as I understand, complete re-implementation is a question for another day
Actually, at one of my previous jobs, our main product was a search engine that was built using reverse indexes. We would index the data into a reverse index on a daily basis and then search the reverse index at every request. The search could be very fast (orders of ms for like a million records) but it was memory intensive too. SOme of the strategies we used
a) Never process all requests at once. We would queue up the requests. At a time there would be only 2 requests being processed per JVM. SInce, it was fast enough, user never noticed it if their request was queued
b) Reuse data structures. In your case, if you build some sort of queueing, think about pooling the bitmaps, and reusing between requests. That way, instead of costly spikes, you will have much more even memory usage
c) Make the search engine tunable. We had designed the data structures so the engine could search them from the disk. It didn;t need to load everything into memory. although loading everything in memory made it a lot faster albeit memory intensive. An admin could tune the engine and decide which structures stayed on disk and which in memory. Tunability may not be important for you. We were developing a generic search engine. If you have the search engine just designed for a single application, you might need that degree of tunability. However, something to think about. Maybe you could store some data structures so you can search them right from disk
d) Build for scalability. Build your search so it can be map-reduced and you can execute it on the grid
Without knowing about your architecture in detail, I don't think you have any easy wins. Maybe think about queuing and pooling. However, these things really need to be thought about during design. Retroactive improvements will only take you that far.