This week's book giveaway is in the OO, Patterns, UML and Refactoring forum. We're giving away four copies of Refactoring for Software Design Smells: Managing Technical Debt and have Girish Suryanarayana, Ganesh Samarthyam & Tushar Sharma on-line! See this thread for details.
We've been using a servlet/JSP app with a commercial database for a couple of years, but the db has been giving us trouble and we're thinking about moving, but what to we don't know. Right now, we just have a single Windows web server running Tomcat and a Linux server running our db. We're taking a hard look at PostgreSQL, but as you know, when you visit all these db vendors' sites, they all start sounding the same (it seems like it's hard to get a handle on hard performance benchmarks). Frankly, we don't know what we need.
Our site has thousands of users in all continental US time zones (some daily, some not) that use a web app, and the main core operation they perform reads from a few tables and then writes to a couple of tables with every one of these operations. There are over 11,000 of these per day at the moment, with quite a bit of other db transactions as well (updating profiles, changing preferences, etc). Most of the transactions happen between 7am EST and 8pm EST, but there are definite peak times, especially near the end of the business day. The number of transactions has tripled in the past year, and I figure it could at least double in the next year. All of the SQL consists of simple INSERT, UPDATE and DELETE statements with no joins or anything else remotely complex (because of some ODBC non-compliance stuff, we do a couple of things in Java in memory that might normally be done when querying the database).
So, I know we're definitely not serving up a complex financial app or a web app used by employees of a Fortune 100 company or something else on a huge enterprise scale, but I know we get some pretty steady traffic. Basically, how busy is our database? Might we need Oracle or DB2, or should something like PostgreSQL suffice? You can definitely factor in how easy to learn a particular database is. I've read that Oracle brought out a new small business version that's supposed to be really easy to administer when you're not an Oracle DBA. Then again, maybe a PostgreSQL book or two would be enough to get us by...
I would love some gut feeling responses from any of you. You are welcome to tell me how a particular database failed you at some points but all in all worked pretty well or whatever...you know that's really probably more helpful than just hearing how wonderful a database is! For example, if a database performs great but has lots of JDBC problems, then that's not so great for us. Thanks for any help...
[ October 19, 2004: Message edited by: Stephen Huey ]
You don't say what database it is that you're becoming dissatisfied with and why, but technically, PostgreSql should work fine for you. In other words, it certainly has the capacity to handle the loads you describe. There's no simple answer but I'm going to go out on a limb here and say that nearly any database should work fine for you. (I would exclude MS Sql Server because it tends to have problematic JDBC drivers, and most of all it's bound to a single operating system, which also is different from the one you're using.) But if you're looking for open source, PostgreSql and MySql are the most well known if not also the best.
The reason most databases are starting to sound the same is because, at least as far as most applications are concerned, essentially they are. Relational databases as used in small to medium sized applications are now basically a commodity. Most performance problems you're likely to encounter are related to either schema/design, drivers, or memory configuration. Oracle or DB2 will not provide a materially significant performance increase for a small to medium sized database, just as a trip to the grocery store to pick up bread and milk most probably won't be any quicker whether you get there in a Ferrari or a Jeep.
That doesn't mean they would be inappropriate to use, just that you probably wouldn't begin to notice any big difference until you're somewhere around 10,000+ transactions per minute (rather than per day), or when your data storage requirements get past 100 Gb. (I'm just giving an impression of the order of magnitude, not any specific threshhold.)
But, there's other stuff to consider. Backup solutions are generally better with most commercial vendors, and you get better tools for GUI admin jobs and data migration from a commercial db. Not that PostgreSql or MySql can't do those things, just that it's probably going to be easier with a commercial db.
My advice would be to experiment with several candidates. Just about everybody offers at least a trial version. Test them using your data, your schema, your application. Do backup/restore and all the other admin tasks. Investigate what books and other training and support is available. Prove to yourself, and whoever else is making the decision, that database X does or does not work for your case. Finally pick the database X that works and that you like the most, learn everything you can about how it works, migrate your data, and start using it.
I am peripherally involved with one PostGreSQL application that runs at loads similar to what you describe or higher, and the database doesn't seem to have a problem handling things.
I would recommend against MySQL because MySQL is really designed for applications where the number of reads greatly outnumbers the number of writes; this doesn't sound like the situation you have. In particular, transactions are probably useful for you, and to get transactions in MySQL, you have to give up other features (each MySQL table type supports a slightly different set of features).
If this is a commercial application, I would recommend considering Oracle. A simple Oracle installation is actually pretty easy to set up, and then you won't have to port your database again if/when you get to the point where you actually need Oracle's capabilities. You don't have to go with their full blown optimal table setup requiring seven independent disks if you don't want that level of complexity.
After deep analysis we found postgresql the best database as per its performance and working. It fulfil all our requirements.
We test Oracle, MSSQL, MySQL, SAPDB databases.
We have 15000 entries per day. I think you have the same kind a traffic. As far as performace is concern postgres is the best, compared to all above database. Number of comparision are available on net.
Further, almost all .org are going to switch over into postgres database.
Further more, we are using pgAdmin Tool that is made for postgres. You can use this fronthand tool.
Love is GOD and GOD is Love.
M/s. Anand Karia Concreting IT