Web analytics, best performance?

Joshua Silva

Joined: Aug 17, 2013
Posts: 5
For an internal web analytics platform here, traffic is around 15 million hits per month. That works out to only around 7 requests per second, say 25 during peak times. We are curious, though, about the best way to make a web analytics platform very fast and scalable.

So, basically similar to Google Analytics, the platform has a snippet of JS that fires an SQL query. Now the question is: should we update the data on the fly in that query, or should we just do an insert and let another process *process* the data and update it for the end user (so they can see up-to-date analytics)?

Should a relational DB be used for this insert, or would something else be faster, such as writing to a *log file* or whatnot and then parsing that into the DB? Maybe doing a batch import into the database every 30 seconds or every minute would be quicker than hitting the database on every request. This follows the theory that opening a connection and running 1k queries is faster than opening a connection, running one query, and closing it again for every request.

Maybe there is a completely different approach for this, that we are just not aware of. Any input would be great.

Thank you
Ulf Dittmer

Joined: Mar 22, 2005
Posts: 42965
The first thing that comes to mind is to decouple gathering the stats from saving them. Push the incoming stats into some kind of queue, and have a lower-priority job process that queue by saving it in whichever way you want to save it. That way high traffic, or a slowdown in the DB (or file system), doesn't affect the speed of stats gathering.
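
A minimal sketch of that decoupling, assuming hits can be represented as plain strings and that the actual storage step is supplied by the caller. The class name HitBuffer, its methods, and the sink callback are all hypothetical names made up for illustration, not part of any library. The request thread only does a non-blocking offer onto a bounded queue, and a low-priority background thread drains the queue and hands the hits to the sink in batches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical sketch: buffers hits in memory and passes them to a sink in batches. */
class HitBuffer {
    private final BlockingQueue<String> queue;
    private final int maxBatch;                  // must be at least 1
    private final Consumer<List<String>> sink;   // assumption: the caller decides how to store a batch
    private final Thread worker;
    private volatile boolean running = true;

    HitBuffer(int capacity, int maxBatch, Consumer<List<String>> sink) {
        this.queue = new LinkedBlockingQueue<>(capacity);
        this.maxBatch = maxBatch;
        this.sink = sink;
        this.worker = new Thread(this::drainLoop, "hit-writer");
        this.worker.setDaemon(true);
        this.worker.setPriority(Thread.MIN_PRIORITY); // the "lower-priority job"
        this.worker.start();
    }

    /** Called from the request thread. Never blocks; returns false if the queue is full and the hit is dropped. */
    boolean record(String hit) {
        return queue.offer(hit);
    }

    private void drainLoop() {
        List<String> batch = new ArrayList<>();
        // Keep going until shutdown is requested AND the queue has been emptied.
        while (running || !queue.isEmpty()) {
            try {
                // Wait briefly for the first hit so the loop notices a shutdown request.
                String first = queue.poll(200, TimeUnit.MILLISECONDS);
                if (first == null) {
                    continue;
                }
                batch.add(first);
                queue.drainTo(batch, maxBatch - 1);   // take whatever else is waiting, up to the batch limit
                sink.accept(new ArrayList<>(batch));  // hand a copy to the sink
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            } catch (RuntimeException e) {
                // Sink failed; this sketch drops the batch. A real system would log or retry.
            } finally {
                batch.clear();
            }
        }
    }

    /** Stops accepting work for the writer and waits until queued hits have been flushed. */
    void shutdown() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In a setup like the one described, the sink is where a batched write would go, for example one JDBC PreparedStatement filled with addBatch() and sent with executeBatch(), so a slow database only delays the writer thread and not the request that recorded the hit. Because the queue is bounded, a long outage drops hits rather than exhausting memory; whether that trade-off is acceptable depends on your requirements.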
Joshua Silva

Joined: Aug 17, 2013
Posts: 5
Any other thoughts on this? Ways to pull it off, so it could scale up?

The max we will do is probably around 30 million hits per month, but still. I would like to make it as good as I possibly could. Input from anyone with analytics experience would be much appreciated.

fred rosenberger
lowercase baba

Joined: Oct 02, 2003
Posts: 11955

Joshua Silva wrote: I would like to make it as good as I possibly could.

Just an observation - that isn't really a very good spec. One can always improve things, if one is willing to spend more time/money/resources. The law of diminishing returns certainly applies here.

So, come up with a specific, quantifiable spec, with actual numbers and statistics, not vague 'make it better' rhetoric. That's the only way you'll know if you've hit your target. You can certainly go back and revise the specs if you need to, but you need to have an obtainable goal.

There are only two hard things in computer science: cache invalidation, naming things, and off-by-one errors