Hi, folks. I'm writing my monthly column for Software Test & Performance Magazine (stpmag.com), and I'd like your input.
The topic, this time around, is Web application tuning, and I'm trying to compile a list of the most successful practices in getting those apps to run faster. Well, not precisely a *list* -- that implies a bunch of bullet points. Anecdotes and explanations are encouraged! In fact, it might be best if you imagine you've been assigned a bright, new assistant, and you're explaining to her everything you're proud of, in this regard.
When someone brings the code to you for a Web application and says, "It works, but it's WAY too slow," what are the first three things you look at? Where do you find the problems are most likely to be? (You can mention the software equivalents of "is it plugged in?" but I'm looking for the tips that aren't quite as obvious.)
Conversely, what do you see other developers do that really isn't worth the time?
These can be very specific ("I replace the foo library with the bar library; it's much faster") or general ("I count the number of database calls").
Also, are there non-technical issues that affect your ability to tune a Web application, such as political problems? (For instance, does the company want to deploy the application even when you know it's not fast enough to serve the expected demand?) What are they, and how do you deal with them?
My article is due on Sept 15, so I could really use your feedback before September 10. I'll do my best to check back here (I think it's an interesting discussion topic on its own); however, please cc me at firstname.lastname@example.org so I know you've responded.
And -- this is important -- if I may quote you in the article, please let me know (privately if necessary) how I may refer to you. I need your real name, company, title, and geographic location (Esther Schindler is a programmer at FooBar Inc. in Scottsdale, Arizona). The editors don't take well to me quoting the anonymous PookieBoy. :-)
Esther Schindler Contributing editor, Software Test & Performance magazine
Esther, this sounds like a good topic. Can't wait to read the article!
I always start with identifying the problem. If we have a profiling tool on hand, we use that to check what is taking the longest. Sometimes these tools aren't available in development environments. In that case, we insert some System.out.println() calls around our guesses as to where the performance issue is. If the web app involves a database, the bottleneck is almost always there.
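For anyone without a profiler, here is a minimal sketch of the println approach. The names TimingSketch, timed, and loadOrders are made up for illustration, and loadOrders just sleeps to stand in for a suspected slow spot such as a query.

```java
import java.util.function.Supplier;

class TimingSketch {
    // Runs the task, prints how long it took, and returns the task's result.
    static <T> T timed(String label, Supplier<T> task) {
        long start = System.nanoTime();
        try {
            return task.get();
        } finally {
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(label + " took " + elapsedMs + " ms");
        }
    }

    // Hypothetical stand-in for a suspected slow spot, such as a query.
    static int loadOrders() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return 42;
    }

    public static void main(String[] args) {
        int orders = timed("loadOrders", TimingSketch::loadOrders);
        System.out.println("orders = " + orders);
    }
}
```

Wrapping each guess in a labeled timer like this makes it easy to compare the suspects side by side and to remove the instrumentation afterwards.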
For tuning the database, key things to look at are: appropriate indexes, the number of database trips, returning unneeded data across the network and inefficient queries.
Once the database has been ruled out, other places to look are file processing and the size of the response (maybe we are building a large PDF file). Another thing is perceived performance. Some browsers, especially older versions of Netscape, will not render a page until it is fully loaded if you leave things out (like the height and width for image tags). For large pages, this makes it look like the page takes much longer to load.
What isn't worth the time: guessing too much before running a profiler (or at least adding some logging statements for time). As developers, we are often tempted to assume that minor things are taking up the time. For example, it is often mentioned that you should use StringBuffer rather than String for concatenation. This is rarely the major bottleneck in a program and therefore not a good place to start when troubleshooting an app. Another common problem is that programmers who learned to program in languages other than the one the app is built in carry over the performance worries of those languages. For example, fear of creating objects is an issue in some languages, but not so much in Java.
Non-technical issues: For the longest time, we didn't have access to a profiler in early development environments. This severely impacted our ability to tune an application. We did have a profiler in later environments and we made heavy use of printlns in earlier environments. Another political type issue is that performance (like testing) tends to be one of the last things done. At that point, you often have to release regardless of performance issues. Developers can be proactive in tuning in the development environments and make note of issues for the next release.
Jeanne's post jibes completely with my experience. The number of DB calls and DB indices are often optimization opportunities. I try not to have big releases go out w/o load testing, which isn't expensive or hard to do with some simple tools like Web Application Stress Tool or WebCat, both free from MS. Running those while PerfMon is running with lots of counters turned on provides clues as to which system resources might need to be optimized.
In one of my previous applications (long ago) we had a problem where the application would hang after running for a couple of days. After thorough research I found that we had forgotten to put the code that closes the database connection in a finally block. Web users often abandon a 3-4 step process partway through and get disconnected, and in that case the database connection never got closed. After a couple of days the application reached the max pool size and stopped responding. So whenever the DAO pattern is used, make sure the DB connection gets closed, at the latest by the end of the user session.
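Here is a minimal sketch of that kind of leak and the finally-block fix. FakePool and FakeConnection are hypothetical stand-ins for a real pool and java.sql.Connection, and the sketch assumes the connection is lost because an exception skips the close() call.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ConnectionCleanupSketch {
    // Hypothetical stand-in for a connection pool; it only counts checkouts.
    static class FakePool {
        final AtomicInteger inUse = new AtomicInteger();

        FakeConnection borrow() {
            inUse.incrementAndGet();
            return new FakeConnection(this);
        }
    }

    // Hypothetical stand-in for java.sql.Connection.
    static class FakeConnection implements AutoCloseable {
        private final FakePool pool;
        private boolean closed;

        FakeConnection(FakePool pool) { this.pool = pool; }

        @Override
        public void close() {
            if (!closed) {
                closed = true;
                pool.inUse.decrementAndGet();
            }
        }
    }

    // Leaky version: close() is skipped whenever the work throws.
    static void leakyStep(FakePool pool, boolean userDisconnects) {
        FakeConnection conn = pool.borrow();
        if (userDisconnects) {
            throw new IllegalStateException("user went away mid-request");
        }
        conn.close();
    }

    // Safe version: the finally block runs on success and on failure.
    static void safeStep(FakePool pool, boolean userDisconnects) {
        FakeConnection conn = pool.borrow();
        try {
            if (userDisconnects) {
                throw new IllegalStateException("user went away mid-request");
            }
        } finally {
            conn.close();
        }
    }

    // Simulates five failed requests and reports how many connections never came back.
    static int connectionsLeftOpen(boolean useFinally) {
        FakePool pool = new FakePool();
        for (int i = 0; i < 5; i++) {
            try {
                if (useFinally) safeStep(pool, true); else leakyStep(pool, true);
            } catch (IllegalStateException expected) {
                // The request failed; what matters is whether the connection came back.
            }
        }
        return pool.inUse.get();
    }

    public static void main(String[] args) {
        System.out.println("without finally: " + connectionsLeftOpen(false) + " still open");
        System.out.println("with finally:    " + connectionsLeftOpen(true) + " still open");
    }
}
```

On Java 7 and later, a try-with-resources statement gives the same guarantee with less code, since the connection here implements AutoCloseable.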
Don't use heavy Session objects when they are not needed.
Don't try to fetch 1000s of rows of data into memory. If possible, implement a page-by-page iterator pattern instead.
Before going to the production environment, use a profiling tool like JProfiler under load to find out the max capacity of the application. If more than that is needed, optimize it by changing the JVM memory usage. If still more capacity is needed, increase it by vertical or horizontal scaling.
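The page-by-page iterator mentioned above could look roughly like this. It is only a sketch: RowSource, PageIterator, and inMemorySource are hypothetical names, and the in-memory source stands in for a query that takes an offset and a limit (the exact SQL for that depends on the database).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class PageIteratorSketch {
    // Hypothetical data source: returns at most `limit` rows starting at `offset`.
    interface RowSource<T> {
        List<T> fetch(int offset, int limit);
    }

    // Iterates over pages, asking the source for one page at a time.
    static class PageIterator<T> implements Iterator<List<T>> {
        private final RowSource<T> source;
        private final int pageSize;
        private int offset = 0;
        private List<T> next;

        PageIterator(RowSource<T> source, int pageSize) {
            if (pageSize <= 0) throw new IllegalArgumentException("pageSize must be positive");
            this.source = source;
            this.pageSize = pageSize;
        }

        @Override
        public boolean hasNext() {
            if (next == null) {
                next = source.fetch(offset, pageSize);
            }
            return !next.isEmpty();
        }

        @Override
        public List<T> next() {
            if (!hasNext()) throw new NoSuchElementException();
            List<T> page = next;
            offset += page.size();
            next = null;
            return page;
        }
    }

    // In-memory stand-in for a paged query against a table of `totalRows` rows.
    static RowSource<Integer> inMemorySource(int totalRows) {
        return (offset, limit) -> {
            List<Integer> rows = new ArrayList<>();
            for (int i = offset; i < Math.min(offset + limit, totalRows); i++) {
                rows.add(i);
            }
            return rows;
        };
    }

    public static void main(String[] args) {
        Iterator<List<Integer>> pages = new PageIterator<>(inMemorySource(25), 10);
        while (pages.hasNext()) {
            System.out.println("page of " + pages.next().size() + " rows");
        }
    }
}
```

Only one page of rows is held in memory at a time, so memory use depends on the page size rather than on the size of the table.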
I like this thread. Here are some random thoughts in no particular order:
1) Measure don't guess - Often we guess wrong on what code to tune and end up wasting our time tuning code that makes no overall impact. Measure performance and go for those small sections of code that consume the most time. Measure your performance improvements and repeat the process.
2) Measure in production - You have no idea how your system is used unless you are looking at real user interactions. For example, say you have 2 pages that each take 1 second to execute. In load tests you assume that each page will be hit the same number of times, but users think otherwise. Users may hit one of the pages orders of magnitude more often than the other one. By measuring in production you learn which page should be tuned first. Usually master pages are hit more than detail pages.
3) Build performance measurement collection into your application from the start - Web performance is difficult to understand looking in from the outside. Measure overall page response time and calls to external systems like JDBC (see the JAMon servlet filter mentioned below).
4) Use the JAMon servlet filter, it is your friend - I must admit I wrote this software and so I am biased, but by simply adding a few lines to your web.xml and making jamonadmin.jsp accessible you get many performance statistics such as: page hits, avg time, min time, max time. You also get concurrency stats like which pages are currently executing, what were the max simultaneous invocations per page. You also have correlations between performance and concurrency i.e. this is a measure of scalability.
A few more things. JAMon is fast enough to be used in production systems (pretty much no overhead), it is part of your app so no special installation of software, it can monitor other things such as jdbc too, and it is free. Because JAMon is part of your application it moves from dev/test/prod with no server environmental changes.
5) It's probably the database - A recent application that I worked on spent 95% of its execution time executing SQL. Of the 50 or so SQL statements, 2 queries accounted for 80% of that time. These queries were tuned by adding indexes and the application zipped along with NO code changes. Only 5% of the application's time was spent in Java code. I find this to be pretty typical. By the way, a denormalized database often does not perform as well because its fat tables require more IO. Indexes are your friends.
6) If it's not the database it is probably some other IO - IO (database, network, file) will usually be your performance culprits, and not your java code. Tuning your jdbc driver, and network may help more than tuning your java code.
7) Learn to write good SQL - This is probably more important than any Java P&T tricks you can learn.
8) Some performance truisms:
- 80% of a program's execution time is caused by 20% of the code. Your job as a tuner is to find the 20% of your code and leave the other 80% alone.
- "More computing sins are committed in the name of efficiency than for any other single reason - including blind stupidity" - W.A. Wulf
- "...premature optimization is the root of all evil." - Donald Knuth
9) Be wary of slow home pages - Home pages are often heavily hit. Have them do minimal work or even just display a menu. One app I work with has a home page that takes 2 seconds, but I am never interested in that page and immediately go elsewhere.
10) Don't microtune - An example of microtuning is when you have a page that takes 1 second to execute, and you notice that you can get a piece of code that takes 20 ms down to 10 ms. You've doubled the performance of this code, which sounds impressive, but you have trimmed the time of your overall page by a trivial amount (from 1000 ms to 990 ms, or 1%). This tuning was a waste of time. Your time would have been better spent drinking a beer. Look for changes that give you the biggest bang for your tuning buck.
11) There are bad Performance Tuning Questions - The most common performance tuning questions I see on the java forums, are purely academic and make no substantive difference in a real program. Questions like: "Are statics faster than instance variables?", "Are local variables faster than instance variables?", "Are HashMaps faster than Vectors?", "Are while loops faster than for loops?",...
12) It's not average execution time but total time - If you don't know the frequency of execution (hits) then you wouldn't know which page to tune of 2 pages that take 1 second to execute. If you know one is executed 10 times a day and the other 10,000 times a day you would.
13) Good design is your best performance tuning tool - Good designs are easy to change and so easy to make faster. Good design involves having a way to measure application performance.
14) If your page takes more than a couple of seconds it probably needs tuning - Users are not that patient and will move to another site or start mashing on the refresh button if your page is slow. Neither outcome will be to your liking.
15) Your biggest scalability problem may be one user getting impatient and hitting refresh before the page returns - In applications I have seen, a single user can have many outstanding requests, so one user can single-handedly bring your site down unless you build protection for this into your app from the start. (The JAMon servlet filter can detect this problem.)
16) Don't return 10,000 rows - The bad news is browsers are slow in displaying so many rows. The good news is that your users really don't want to see so much data even if they say they do. It is your job as a developer to find out what they really want and give that capability to them.
17) Java is fast, very fast - Despite Java's lingering reputation for being slow, it is very fast. I can execute a method 60,000,000 times a second on my old clunker PC. Even object creation and garbage collection are quite fast.
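To make points 3 and 12 concrete, here is a minimal sketch of built-in measurement that tracks hits, avg, min, max, and total time per page and then ranks pages by total time. This is not the JAMon API; PageStatsSketch, Stats, record, and pagesByTotalTime are hypothetical names, and the page labels and timings are the made-up figures from point 12.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class PageStatsSketch {
    // Running totals for one page label.
    static class Stats {
        private long hits;
        private long totalMs;
        private long minMs = Long.MAX_VALUE;
        private long maxMs;

        synchronized void add(long ms) {
            hits++;
            totalMs += ms;
            minMs = Math.min(minMs, ms);
            maxMs = Math.max(maxMs, ms);
        }

        synchronized long getHits() { return hits; }
        synchronized long getTotalMs() { return totalMs; }
        synchronized long getMinMs() { return minMs; }
        synchronized long getMaxMs() { return maxMs; }
        synchronized double getAvgMs() { return hits == 0 ? 0 : (double) totalMs / hits; }
    }

    private final Map<String, Stats> byPage = new ConcurrentHashMap<>();

    // Call once per request with the page label and its elapsed time.
    void record(String page, long elapsedMs) {
        byPage.computeIfAbsent(page, p -> new Stats()).add(elapsedMs);
    }

    Stats statsFor(String page) { return byPage.get(page); }

    // Pages ordered by total time consumed, biggest first.
    List<String> pagesByTotalTime() {
        List<String> pages = new ArrayList<>(byPage.keySet());
        pages.sort(Comparator.comparingLong((String p) -> byPage.get(p).getTotalMs()).reversed());
        return pages;
    }

    public static void main(String[] args) {
        PageStatsSketch monitor = new PageStatsSketch();
        // Two pages that each take 1 second, hit 10 and 10,000 times a day.
        for (int i = 0; i < 10; i++) monitor.record("/detail", 1000);
        for (int i = 0; i < 10_000; i++) monitor.record("/master", 1000);
        System.out.println("tune first: " + monitor.pagesByTotalTime().get(0));
    }
}
```

In a real web app, record would typically be called from a servlet filter wrapped around the request, so every page is measured without touching page code.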