aspose file tools*
The moose likes Performance and the fly likes running a large number of Big Moose Saloon
  Search | Java FAQ | Recent Topics | Flagged Topics | Hot Topics | Zero Replies
Register / Login


Win a copy of Spring in Action this week in the Spring forum!
JavaRanch » Java Forums » Java » Performance
Bookmark "running a large number of "counting QL" on a noSQL data store.  Algorithm or a Best practice needed " Watch "running a large number of "counting QL" on a noSQL data store.  Algorithm or a Best practice needed " New topic
Author

running a large number of "counting QL" on a noSQL data store. Algorithm or a Best practice needed

Mohammad Shafie
Greenhorn

Joined: Jan 27, 2011
Posts: 2
Hello Everyone

before we start let me give you some information about our environment:
* it is written fully in Java/J2EE.
* it is developed to be deployed on GAE "Google App Engine"
* its GUI is developed by GWT.
* our problem is in a core development issue.



Here is my problem,
* i am building a web application where users "online" can search for listings in this website.
* first please open the web site careerbuilder.com and search for any keyword e.g. "Accounting".
* a page will be opened , [Narrow Search] has a way to allow you go to your target job easier "lets call this a filter" ,lots of jobs down there.
* search filter includes sub-filters [Category , Company , City , State ].
* each sub-filter has many cases or options. like for "State has (California ,Iowa , Kansas , ...etc)" beside each one of them is the number of jobs that matches your current filter/sub-filter selection. you will find it between brackets i.e. (23)



Now we want to allow this filter functionality and we want to make it fast.
making a count query for each sub-filter option is going to be an effective idea.

kindly keep in mind that:
* users can add/remove listing.
* also listings can expire.
* number of sub-filters are higher for us "can reach 20".
* each sub-filter has between 2 and 200 options.


we are searching for the best practice or a suggestion of an algorithm or whatever to solve this problem.

here are 2 options we have reached so far:
1) building a statistics table to save these results in it, then update it each time listings number is changed , also keep a nightly background job to recalculate results. and we can show number of results directly from this table.

2) build a tree data structure to be loaded on memory and saved in a table each time it is updated. this tree contains the resulting numbers of listings in each option of sub-filters.


even though i still think this is not enough !!!
can anyone suggest a better idea?
all comments, questions, suggestions are very welcomed.



Regards
Mohammad S.
 
wood burning stoves
 
subject: running a large number of "counting QL" on a noSQL data store. Algorithm or a Best practice needed