I have Liferay 6 with Tomcat set up on two machines:
Machine 1: Windows 2003 Server
2 GB RAM, 2 GHz CPU
MySQL Ver 14.14 Distrib 5.5.10
Machine 2: Linux CentOS 5.5
4 GB RAM, 2 GHz CPU
MySQL Ver 14.14 Distrib 5.1.49
Both Liferay installations have identical startup parameters and MySQL configurations.
The system contains a custom theme and a servlet filter hook that checks each URL access.
We have written a Grinder script to load-test the system, starting with 50 concurrent users.
The test script does the following things:
1) Open home page
2) Login with username/password
3) Enter security key (custom portlet)
4) Move to a private community
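The four steps above can be timed individually, so that a slowdown can be pinned to one step rather than to the whole scenario. The following is a minimal sketch in plain Python, not the Grinder script itself and not Grinder's API; `run_load` and the step callables are hypothetical names, and each step would have to be replaced with the real HTTP request.

```python
import threading
import time


def run_load(steps, users=50):
    """Run every (name, callable) step once per simulated user, concurrently.

    Returns the average elapsed seconds per step name.
    `steps` is a hypothetical list such as
    [("home", open_home), ("login", do_login), ...].
    """
    timings = {name: [] for name, _ in steps}
    lock = threading.Lock()

    def one_user():
        # Each user walks through the steps in order, timing each one.
        for name, step in steps:
            start = time.perf_counter()
            step()
            elapsed = time.perf_counter() - start
            with lock:
                timings[name].append(elapsed)

    threads = [threading.Thread(target=one_user) for _ in range(users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    return {name: sum(values) / len(values) for name, values in timings.items()}
```

Comparing the per-step averages from both servers would show whether the extra time is spread evenly or concentrated in, say, the login or the custom portlet.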
On the Windows system the response time is as expected (nearly 40 seconds for each user).
However, on the Linux system the response time is far too high (nearly 4 minutes) for the same operations.
We tried revising the MySQL, Tomcat, connection pool and a few other parameters, but the results were the same.
We also tested each Liferay against the MySQL on the other machine (machine 1 Liferay -> machine 2 MySQL).
Why is the Windows machine, which has fewer resources, responding faster than the Linux one?
It's impossible to tell without more information. Is the Linux server consuming lots of CPU, and if so, what process(es) are doing the consuming? Is there a lot of virtual memory page-swapping going on? Is the database on the Windows box but not on the Linux box? First you need to determine if the machine itself is running optimally. Then you need to home in on the offending area.
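One way to check for page-swapping on the Linux box is to compare the kernel's swap counters before and after a test run. The sketch below assumes the `pswpin` and `pswpout` lines of `/proc/vmstat`; the function names are illustrative only.

```python
def swap_counters(vmstat_text):
    """Extract the pswpin/pswpout counters from /proc/vmstat-style text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters


def swap_activity(before_text, after_text):
    """Pages swapped in and out between two snapshots of /proc/vmstat."""
    before = swap_counters(before_text)
    after = swap_counters(after_text)
    return {key: after[key] - before[key] for key in ("pswpin", "pswpout")}


# Usage on the Linux server (not run here):
#   before = open("/proc/vmstat").read()
#   ... run the load test ...
#   after = open("/proc/vmstat").read()
#   print(swap_activity(before, after))
```

Large differences during the test would point at memory pressure; values near zero would suggest looking at CPU or I/O instead.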
Both the Linux and Windows machines are on the same network, and the load generator (Grinder) runs on a different Windows machine on that network.
No firewall is running on any of the machines.
Before we start a test run on any of our Liferay systems, we first exercise the system manually several times if it was rebooted recently.
To watch for other background activity, we checked the log files in /var/log: almost all of them have only old or infrequent entries.
Our production systems run RHEL, and we are facing the same problem on those machines as well. To narrow down the problem we added another Windows machine for testing, but it gives nearly the same results as the previous Windows machine (Machine-1).
I understand that the problem could be caused by some unknown parameter, but it is getting difficult to find.
In our recent test we came across another parameter.
The top command reports the following:
a load average of 24.64 for the last 60 seconds on our 1-CPU machine.
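To put that figure in context: the Linux load average counts tasks that are running, waiting for a CPU, or blocked in uninterruptible sleep (typically disk I/O), so it is usually read relative to the number of CPUs. A minimal sketch of that normalization, with hypothetical function names:

```python
import os


def load_per_cpu(load_1min, cpu_count):
    """Normalize a 1-minute load average by the number of CPUs."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_1min / cpu_count


def current_load_per_cpu():
    """Read the live values on a Unix system (os.getloadavg is Unix-only)."""
    one_minute, _, _ = os.getloadavg()
    return load_per_cpu(one_minute, os.cpu_count() or 1)
```

With the values above, `load_per_cpu(24.64, 1)` is 24.64, meaning that on average roughly 24 tasks were competing for the single CPU or waiting on I/O during that minute.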