First off I apologize if this is in the wrong forum. If it is, please let me know and I will move it.
My question is about logging in the following multi-tier Java environment:
Application (HTTP) -> Client (HTTPS) -> Business (TCP) -> Data
The Data Tier, as of right now, will be the only tier with a database. That is not set in stone, but it is what it is. Before this new design, our logging worked like this: everything was vomited into a database using Log4j and then siphoned out and delivered the way we wanted. This method does not please the Manager Gods, and thus a new logging strategy needs implementing.
My knowledge is very limited in this sort of environment so bear with me. I do know that we could have hundreds of clients and there would have to be a way to identify each client's logs. Also, they want the option for real time logging and a way to secure messaging if at any point one of the tiers goes down. By the way, when I say messaging, I mean actual messages like debug, info, in our case, revenue type information, things like that.
I was looking into some sort of message queue like RabbitMQ, but I don't like the idea of having to install that on all client machines. Some of our clients can be sticklers about that. Since I have no experience with this, I'm just looking for possibilities or avenues to pursue in trying to come up with a good logging design.
- Facebook's Scribe - I remember from a tech talk that this one was written specifically to overcome the problem of losing logs with a regular rsyslog server if the node crashes, which is when you'd need the logs the most. It does some kind of local caching of logs before dispatching them to the central server.
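To make the "cache locally, then dispatch" idea concrete, here is a minimal sketch in plain Java. It is not Scribe's actual implementation; `BufferingForwarder`, its `sender` callback, and the capacity limit are all hypothetical names I made up for illustration. It also only buffers in memory, so a real version would need to spool to disk to survive a crash of the node itself.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

/** Hypothetical sketch: hold log lines locally and forward them once the central server accepts them. */
class BufferingForwarder {
    private final Deque<String> pending = new ArrayDeque<>();
    // Assumed callback: returns true when the central server accepted the line.
    private final Predicate<String> sender;
    private final int capacity;

    BufferingForwarder(Predicate<String> sender, int capacity) {
        this.sender = sender;
        this.capacity = capacity;
    }

    /** Queue a line locally; if the buffer is full, drop the oldest line to make room. */
    synchronized void log(String line) {
        if (pending.size() == capacity) {
            pending.pollFirst();
        }
        pending.addLast(line);
    }

    /** Forward queued lines in order, stopping at the first failure. Returns how many were sent. */
    synchronized int flush() {
        int sent = 0;
        while (!pending.isEmpty()) {
            if (!sender.test(pending.peekFirst())) {
                break; // server unreachable: keep the remaining lines for the next attempt
            }
            pending.pollFirst();
            sent++;
        }
        return sent;
    }

    synchronized int pendingCount() {
        return pending.size();
    }
}
```

The application calls `log(...)` and never blocks on the network, while a background task calls `flush()` periodically. If the central server is down, lines simply stay queued until a later flush succeeds.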
Unfortunately, I can't provide much insight into which one suits your requirements, because I haven't gotten around to evaluating them, though it's been on my to-do list for a long time. You'll have to evaluate them from scratch.
Also, they want the option for real time logging
Can you clarify what that means? As far as I know, these systems are pretty efficient at handling a high volume of writes, but they don't give any kind of real-time guarantees...
actual messages like debug, info, in our case, revenue type information, things like that.
Yes, that's what these systems handle too (and not just from Log4j or other Java loggers). The only thing to keep in mind is that some log messages are relevant only to developers, while others are relevant to support staff or for deriving business metrics (like revenue information). It's advisable to use separate loggers for these two very different types of "logs".
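As a rough illustration of the separate-loggers advice, here is a sketch using `java.util.logging` from the JDK so it runs without extra dependencies. The logger names `app.diagnostics` and `app.business` and the `TieredLoggers` class are assumptions made for this example; with Log4j the same split would normally be done with separate named loggers and appenders in the configuration.

```java
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch: one logger for developer diagnostics, another for business events. */
class TieredLoggers {
    // Logger names are illustrative assumptions, not an established convention.
    static final Logger DIAGNOSTICS = Logger.getLogger("app.diagnostics");
    static final Logger BUSINESS = Logger.getLogger("app.business");

    /** Give each stream its own destination and threshold. */
    static void configure(Handler diagnosticsHandler, Handler businessHandler) {
        attach(DIAGNOSTICS, diagnosticsHandler, Level.FINE); // debug-level detail for developers
        attach(BUSINESS, businessHandler, Level.INFO);       // revenue-type events only
    }

    private static void attach(Logger logger, Handler handler, Level level) {
        // Stop records from also flowing into the shared root handlers.
        logger.setUseParentHandlers(false);
        logger.setLevel(level);
        handler.setLevel(level);
        logger.addHandler(handler);
    }
}
```

With this split, the business stream can be routed, retained, and secured differently from the debug output, and turning up debug verbosity never floods the revenue data.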
Furthermore, all of the "machine data" in your multi-tier environment can be sent to Splunk and correlated to give you a single transactional view across your production systems.
You can easily tag all the log events in Splunk for each individual client, and then use Splunk's powerful search language to create operational views, dashboards, alerts and reports.
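Per-client tagging is easier if every event already carries a client identifier when it is written. Below is a small sketch that formats an event as key="value" pairs, a layout that log indexers (Splunk included, as far as I know) can generally pick fields out of. The `ClientEvent` class and the field names `client_id` and `tier` are made up for illustration.

```java
import java.util.Map;

/** Hypothetical sketch: format a log event as key="value" pairs with the client ID up front. */
class ClientEvent {
    /** Field names such as client_id and tier are illustrative assumptions. */
    static String format(String clientId, String tier, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        append(sb, "client_id", clientId);
        append(sb, "tier", tier);
        for (Map.Entry<String, String> e : fields.entrySet()) {
            append(sb, e.getKey(), e.getValue());
        }
        return sb.toString();
    }

    private static void append(StringBuilder sb, String key, String value) {
        if (sb.length() > 0) {
            sb.append(' ');
        }
        // Escape backslashes and quotes so each value stays parseable.
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        sb.append(key).append("=\"").append(escaped).append('"');
    }
}
```

Passing an insertion-ordered map (for example a `LinkedHashMap`) keeps the fields in a stable order, which makes the raw lines easier to read when you are not going through a search tool.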
As for getting the logs into Splunk, there are several options, but I would look at these two. Both options provide failover / high availability in case a tier goes down.