

Recent posts by jim terry

Thread dumps are vital artifacts for debugging and troubleshooting production performance problems. Thread dump files tend to span several hundred lines (sometimes a few thousand), and it is hard to digest and assimilate all the information in a thread dump because of its verbosity. If the information in the thread dump could be condensed into one compact flame graph, our analysis job would be much easier.

In this article, let us review how to generate a flame graph from your Java application. Using the steps outlined here, you can generate flame graphs not only from Java applications but also from applications written in any language that runs on the JVM, such as Scala, Jython, Kotlin, JRuby, …

How to generate a Flame Graph?
There are two simple steps to generate a flame graph from your application:

1. Capture thread dumps
2. Analyze with fastThread tool

Let us discuss these steps in detail.

1. Capture thread dumps:
Capture thread dumps when a performance problem is brewing in the application. There are 8 different options to capture thread dumps. You can use the option that best suits your environment.

My favorite option is ‘jstack’. ‘jstack’ is an effective command line tool to capture thread dumps, and it ships in the JDK_HOME/bin folder. Here is the command that you need to issue to capture a thread dump:
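A typical invocation looks like this (the -l flag, which adds lock information, is optional):

jstack -l <pid> > <file-path>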


where

pid: the process id of the application whose thread dump should be captured

file-path: the file path to which the thread dump will be written

Example:
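Assuming a hypothetical process id of 12345:

jstack -l 12345 > /opt/tmp/threadDump.txt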



As per the example, the thread dump of the process will be written to the /opt/tmp/threadDump.txt file.

2. Analyze with fastThread tool:
Once the thread dump is captured, upload it to the fastThread tool. fastThread is a free online thread dump analysis tool that will analyze your thread dump and generate a beautiful analysis report instantly. Here is a sample report generated by fastThread. At the bottom of this report, you will see the Flame Graph.

Note: if you are not comfortable uploading your thread dump to the fastThread tool, which runs in the cloud, you can register here to download and install the tool locally on your machine and do the analysis there.


Fig: Flame Graph generated by fastThread

How to use the Flame Graph?
From the Flame Graph, you can see the lines of code that your application is executing and the number of threads executing those lines of code. You can also observe the most heavily executed third-party libraries and frameworks. The Flame Graph lets you zoom into any tower in the graph, and it gives you the ability to search for a method, class, or package name in your application. For more details on how to use the Flame Graph effectively, you can watch the video clip below.





Conclusion
Flame graphs give a compact view of your thread dumps. Instead of going through hundreds of lines of stack traces in a thread dump, it is a lot easier to take in the same information in graphical form.



Thread dumps are vital artifacts for troubleshooting and debugging production problems. In the past we have discussed several effective thread dump troubleshooting patterns such as Traffic Jam, Treadmill, RSI, and All Roads Lead to Rome. In this article we would like to introduce one more thread dump troubleshooting pattern.

How to capture thread dumps?
There are 8 different options to capture thread dumps. You can use the option that is convenient to you.

Thread dumps sometimes contain exceptions or errors in the threads’ stack traces. Threads that have exceptions or errors in their stack traces should be investigated, because they often indicate the origin of the problem.

Recently an application was throwing java.lang.OutOfMemoryError. A thread dump was captured from this application. When we analyzed the thread dump, we noticed a particular thread throwing java.lang.OutOfMemoryError:



From this stack trace we were able to figure out that this thread was experiencing the OutOfMemoryError while trying to transform XML into Java objects.

Apparently, sufficient memory wasn’t allocated to this application to process large XML payloads. Thus, when large XML documents were sent to the application, it started throwing OutOfMemoryError. When sufficient memory was allocated (i.e., by increasing the -Xmx value, as in the example below), the problem was resolved. Thus, looking for exceptions or errors in thread dumps is a good pattern for identifying the root cause of a problem.
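For instance, a minimal sketch of the kind of change involved (the 4g heap size and the application jar name are illustrative values, not the actual application's settings):

java -Xmx4g -jar trading-app.jar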

But looking for exceptions or errors in a thread dump is not a trivial task, because thread dumps tend to contain hundreds or thousands of threads, and each thread has several lines of stack trace. Going through every line of stack trace to spot exceptions or errors is a tedious process. This is where thread dump analysis tools come in handy. You might consider using free thread dump analysis tools like fastThread, IBM TDMA, Samurai, … to analyze your application thread dumps.

When you upload a thread dump to the fastThread application, it generates a root cause analysis report. One of the sections in this report is ‘Exception’. In this section, the fastThread application reports all the threads that are throwing exceptions or errors. Below is a screenshot of this section:



Fig: ‘Exception’ section in fastThread report

You can see that this section reports all the threads that have exceptions or errors in their stack traces. If any threads are reported in this section, you should investigate their stack traces to identify the origin of the problem.



Java heap fragmentation is an interesting problem which triggers long-pausing full garbage collection cycles. In this article we would like to attempt to explain heap fragmentation.

Let’s say a developer writes the code ‘new BMW()’. This will create a new BMW object in the heap memory space. Example:
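A minimal sketch of such code, assuming a hypothetical BMW class:

BMW bmw = new BMW();   // allocates a new BMW object on the JVM heap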


Fig: New object created in JVM Heap

Let’s say the application now creates three more objects: ‘Toyota’, ‘Honda’, and ‘Tesla’. The JVM’s heap memory will now start to look like this:


Fig: More objects are created in JVM Heap

Let’s say that after some time no other objects reference the Toyota and Tesla objects; the garbage collector will then remove them from memory. After this garbage collection cycle, the heap memory will look like this:


Fig: Garbage collection removes unused objects from JVM Heap

Let’s say the application now creates a new ‘Truck’ object, which is bigger in size. Even though there is enough total free space in memory to accommodate the Truck object, there isn’t enough contiguous space to store it, because the heap is fragmented.


Fig: JVM heap is fragmented, there is not enough contiguous space to accommodate ‘Truck’ object

When the heap is fragmented like this, a Full Garbage Collection is triggered. The Full GC compacts the memory, i.e., it moves the BMW and Honda objects next to each other so that there is enough contiguous free memory. Typically, a Full GC that does compaction ends up pausing the JVM for a longer duration.

Note: To study the amount of heap fragmentation and the pause times your application incurs due to full garbage collection cycles, and the steps to reduce them, you may consider using free garbage collection log analysis tools like GCeasy, GarbageCat, or HP JMeter.


Fig: JVM Heap after compaction

After compaction, there is sufficient contiguous memory, so the Truck object can now be stored.


Fig: JVM heap able to accommodate ‘Truck’ object after compaction





4 weeks ago
In this article we are going to discuss how we troubleshot a CPU spike problem that surfaced in a major trading application in North America. All of a sudden, this application’s CPU started to spike up to 100%. The team hadn’t made any new code deployments or environmental changes, and they hadn’t flipped any flag settings, yet all of a sudden the CPU started to spike. We also validated whether an increase in traffic volume was contributing to the spike, but there was no increase in traffic either.

Data Capturing
This application was running on a Java/Tomcat technology stack. We requested the Site Reliability Engineering (SRE) team to capture the following two artifacts from the machine where this problem was happening:

1. top -H output

2. Thread dumps

Let’s see what these artifacts contain.

1. top -H
CPU spikes are always caused by threads, so we had to isolate the threads that were causing this CPU spike. This application has hundreds of threads, and from those hundreds of threads we needed to identify the ones driving up the CPU consumption. This was the first challenge.

This is where the ‘top’ Unix command line utility comes in handy. Most of you might be familiar with ‘top’: it shows all the processes that are running on the device, along with the CPU and memory consumed by each of those processes.

This tool has a lesser-known ‘-H’ option :-), which a lot of engineers aren’t familiar with. It can be invoked like this:
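top -H -p <PID>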



Where PID is your application’s process id. This application’s process id was 31294, so the SRE team issued this command:
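top -H -p 31294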



When the ‘top’ tool is invoked with this ‘-H’ option, it will start to display all the threads that are running within that particular process. It will also display the amount of CPU and memory consumed by each thread within that process. Below is the output we got from this trading application:


Fig: top -H -p {pid}


As you can see from the “top -H” output, there is a thread with id ‘11956’ in the first row. This thread alone is consuming 60.9% of the CPU. Bingo!! This is the first win: using the ‘top -H’ option, we have identified the thread that is consuming a high amount of CPU.

2. Thread dumps
Our next challenge is to identify the lines of code that this ‘11956’ thread is executing. This is where thread dumps come in handy. Thread dumps show all the threads that are running within the application and their code execution paths (i.e., stack traces). This blog post highlights 8 different options to capture thread dumps: https://blog.fastthread.io/2016/06/06/how-to-take-thread-dumps-7-options/. You can use the option that is convenient for you. We used the ‘jstack’ tool, which is part of the JDK, to capture the thread dumps.

Best practice: capture at least 3 thread dumps, with a gap of 10 seconds between each thread dump.

Analyzing the data
Now we uploaded both the ‘top -H’ output and the thread dumps to the fastThread tool (https://fastthread.io/). If you are not sure how to upload the ‘top -H’ output and thread dumps to the fastThread tool, the instructions are here: https://blog.fastthread.io/2020/03/28/powerful-troubleshooting-marrying-top-thread-dumps/. This tool has the ability to analyze thread dumps together with ‘top -H’ output and generate intuitive reports. The tool analyzed our data and generated this beautiful report: https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMjAvMDMvMzEvLS1pYm0tY29yZS1kdW1wLXRvcGRhdGEuemlwLS0xNS0yMi0yNQ==&s=t

This report contains a ‘CPU | Memory’ section. This is where the tool marries the top -H output with the thread dump and reports the CPU and memory consumed by each thread in the application.



Fig: CPU, Memory consumed by each thread (generated by fastThread)

In the above screenshot you can see the “WebContainer:18” thread in the first row, reported to be consuming 60.9% CPU. In the last column you can see the code execution path (i.e., stack trace) of this thread. The tool now reports the thread name (i.e., ‘WebContainer:18’) and the code path it was executing, which wasn’t available to us in the raw top -H output.



Fig: Stacktrace of high CPU consuming thread (generated by fastThread)

Resolution
You can notice that the ‘WebContainer: 18’ thread was executing the java.util.WeakHashMap#put() line of code. For the folks who aren’t aware, WeakHashMap (like HashMap) is not a thread-safe implementation. When multiple threads invoke its get() and put() methods concurrently, it can result in infinite looping, and when a thread loops infinitely, CPU consumption starts to skyrocket. That is exactly what was happening in this application. Once the WeakHashMap was replaced with a ConcurrentHashMap, the problem was resolved.
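A rough sketch of the kind of change involved (the class and field names here are illustrative, not the application's actual code):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class LookupCache {

    // Before: new java.util.WeakHashMap<>() was used here. WeakHashMap is not
    // thread safe; concurrent put()/get() calls can corrupt its internal bucket
    // structure and make a thread loop forever, burning CPU.
    // After: ConcurrentHashMap is safe for concurrent access (note that it does
    // not provide WeakHashMap's weak-key semantics).
    private final Map<String, Object> cache = new ConcurrentHashMap<>();

    public void put(String key, Object value) {
        cache.put(key, value);
    }

    public Object get(String key) {
        return cache.get(key);
    }
}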

We hope you find this simple technique useful.


1 month ago
Since we analyze thousands of garbage collection logs every single day through our https://gceasy.io/ tool, we notice that several Java applications are still passing the ‘-XX:+UseCompressedOops’ JVM argument. Actually, it’s not required to pass ‘-XX:+UseCompressedOops’ if you are running Java SE 6 update 23 or later: compressed oops is supported and enabled by default in Java SE 6u23 and later versions.

“OOP” stands for Ordinary Object Pointer, i.e., a managed pointer to an object. An OOP is usually the same size as a native machine pointer, which means an OOP is 64 bits on a 64-bit operating system and 32 bits on a 32-bit operating system.

In a 32-bit operating system, the JVM can address only up to 4GB (i.e., 2^32 bytes) of memory. To manage larger amounts of memory, the JVM needs to run with 64-bit OOPs, which can address about 18 exabytes (i.e., 2^64 bytes). That is a very large size :-). In practice, no server in the world has that much memory.

What this means is that, with an address twice as big, a much larger memory area can be addressed. But this doesn’t come free of cost: when you switch from a 32-bit JVM to a 64-bit JVM, you will notice an effective loss of available memory (by up to about 1.5x) due to the larger addresses. To fix this problem, JVM developers introduced the ‘Compressed Oops’ feature. More details about compressed OOPs can be found here: https://docs.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html.

To activate this feature, you previously had to pass the ‘-XX:+UseCompressedOops’ JVM argument. However, starting from Java SE 6u23, compressed oops is enabled by default, so you no longer need to pass this flag explicitly.
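If you want to double-check whether compressed oops is in effect for your JVM, one way (assuming a Unix-like shell) is to print the final flag values:

java -XX:+PrintFlagsFinal -version | grep UseCompressedOops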
2 months ago
At the time of writing this article (March 2020), there are 600+ arguments that you can pass to the JVM just around garbage collection and memory. If you include other aspects, the total JVM argument count easily crosses 1000+. That is way too many arguments for anyone to digest and comprehend. In this article, we highlight seven important JVM arguments that you may find useful.

1. -Xmx and -XX:MaxMetaspaceSize
-Xmx is probably the most important JVM argument. -Xmx defines the maximum heap size that you are allocating to your application. (To learn about the different memory regions in a JVM, you may watch this short video clip.) You can define your application’s heap size like this:
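java -Xmx2g MyApplication

(The 2g value, i.e., a 2-gigabyte maximum heap, and the MyApplication class name are illustrative.)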



Heap size plays a critical role in determining your

a. Application performance

b. The bill that you are going to get from your cloud provider (AWS, Azure, …)

This raises the question: what is the right heap size for my application? Should I allocate a large heap or a small heap? The answer is: ‘it depends’. In this article, we have shared our thoughts on whether you should go with a large or a small heap size.

You might also consider reading this article: advantages of setting -Xms and -Xmx to same value

Metaspace is the region where the JVM’s metadata, such as class definitions and method definitions, is stored. By default, the amount of memory that can be used to store this metadata is unlimited (i.e., limited only by your container’s or machine’s RAM size). You need to use the -XX:MaxMetaspaceSize argument to specify an upper limit on the amount of memory that can be used to store metadata information.
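For example (the 256m limit below is an illustrative value):

-XX:MaxMetaspaceSize=256m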



2. GC Algorithm
As of March 2020, there are 7 different GC algorithms in OpenJDK:

a. Serial GC

b. Parallel GC

c. Concurrent Mark & Sweep GC

d. G1 GC

e. Shenandoah GC

f. Z GC

g. Epsilon GC

If you don’t specify the GC algorithm explicitly, then the JVM will choose the default algorithm. Up to Java 8, Parallel GC is the default GC algorithm; from Java 9 onwards, G1 GC is the default.

Selection of the GC algorithm plays a crucial role in determining the application’s performance. Based on our research, we are observing excellent performance results with the Z GC algorithm. If you are running on JVM 11+, you may consider using the Z GC algorithm (i.e., -XX:+UseZGC). More details about the Z GC algorithm can be found here.

The list below summarizes the JVM argument you need to pass to activate each type of garbage collection algorithm.
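a. Serial GC: -XX:+UseSerialGC
b. Parallel GC: -XX:+UseParallelGC
c. Concurrent Mark & Sweep GC: -XX:+UseConcMarkSweepGC
d. G1 GC: -XX:+UseG1GC
e. Shenandoah GC: -XX:+UseShenandoahGC
f. Z GC: -XX:+UseZGC
g. Epsilon GC: -XX:+UseEpsilonGC

(On some JDK releases, the experimental collectors such as ZGC, Shenandoah, and Epsilon additionally require -XX:+UnlockExperimentalVMOptions.)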



3. Enable GC Logging
Garbage collection logs contain information about garbage collection events, memory reclaimed, pause durations, and more. You can enable garbage collection logging by passing the following JVM arguments:

From JDK 1 to JDK 8:
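-XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:{gc-log-file-path}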




From JDK 9 and above:
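-Xlog:gc*:file={gc-log-file-path}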



Example:
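-Xlog:gc*:file=/opt/tmp/myapp-gc.log

(The /opt/tmp/myapp-gc.log path is an illustrative location; use a path that suits your environment.)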




Typically, GC logs are used for tuning garbage collection performance. However, GC logs also contain vital micro metrics. These metrics can be used for forecasting the application’s availability and performance characteristics. In this article we would like to highlight one such micro metric: ‘GC Throughput’ (to read more about the other available micro metrics, you may refer to this article). GC throughput is the amount of time your application spends processing customer transactions versus the amount of time it spends on GC activity. Say your application’s GC throughput is 98%; that means the application is spending 98% of its time processing customer activity, and the remaining 2% is spent on GC activity.

Now let’s look at the heap usage graph of a healthy JVM:



Fig: Healthy JVM’s heap usage graph (generated by https://gceasy.io)

You can see a perfect saw-tooth pattern. Notice that when a Full GC (red triangle) runs, memory utilization drops all the way to the bottom.

Now let’s look at the heap usage graph of a sick JVM:



Fig: Sick JVM’s heap usage graph (generated by https://gceasy.io)

Notice that towards the right end of the graph, even though GC runs repeatedly, memory utilization isn’t dropping. It’s a classic indication that the application is suffering from some sort of memory problem.

If you take a closer look at the graph, you will notice that repeated full GCs started to happen right around 8 am, yet the application only started getting OutOfMemoryError around 8:45 am. Until 8 am, the application’s GC throughput was about 99%; right after 8 am, GC throughput started to drop towards 60%, because while repeated GCs run, the application isn’t processing any customer transactions and is only doing GC activity. As a proactive measure, if you notice GC throughput starting to drop, you can take the JVM out of the load balancer pool so that the unhealthy JVM will not process any new traffic. This minimizes the customer impact.



Fig: Repeated Full GC happens way before OutOfMemoryError

You can monitor GC-related micro metrics in real time using the GCeasy REST API.

4. -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath
OutOfMemoryError is a serious problem that will affect your application’s availability and performance SLAs. To diagnose OutOfMemoryError or any memory-related problem, one has to capture a heap dump at the moment, or a few moments before, the application starts to experience OutOfMemoryError. Since we don’t know when OutOfMemoryError will be thrown, it’s hard to capture a heap dump manually at the right time. However, capturing heap dumps can be automated by passing the following JVM arguments:

-XX:+HeapDumpOnOutOfMemoryError and -XX:HeapDumpPath={HEAP-DUMP-FILE-PATH}

In ‘-XX:HeapDumpPath’, you specify the file path where the heap dump should be stored. When you pass these two JVM arguments, a heap dump will be automatically captured and written to the defined file path when OutOfMemoryError is thrown.

Example:
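-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/tmp/heapdump.hprof

(The /opt/tmp/heapdump.hprof path is an illustrative value.)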



Once heap dumps are captured, you can use tools like HeapHero or Eclipse MAT to analyze them.

More details around the OutOfMemoryError JVM arguments can be found in this article

5. -Xss
Each application has tens, hundreds, or even thousands of threads, and each thread has its own stack. The following information is stored in each thread’s stack:

a. Methods/functions that are currently executed

b. Primitive datatypes

c. Variables

d. Object pointers

e. Return values

Each of these consumes memory. If their consumption goes beyond a certain limit, a StackOverflowError is thrown. More details about StackOverflowError and its solution can be found in this article. You can increase the thread stack size limit by passing the -Xss argument.

Example:
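-Xss256k

(256k is an illustrative starting value; see the sizing discussion below.)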



If you set the -Xss value to a huge number, memory will be blocked and wasted. Suppose you assign an -Xss value of 2mb when only 256kb is needed; you will end up wasting a huge amount of memory, not just 1792kb (i.e., 2mb – 256kb). Do you wonder why?

Say your application has 500 threads. With an -Xss value of 2mb, your threads will consume 1000mb of memory (i.e., 500 threads x 2mb/thread). On the other hand, if you allocate -Xss to be only 256kb, your threads will consume only 125mb of memory (i.e., 500 threads x 256kb/thread). You will save 875mb (i.e., 1000mb – 125mb) of memory per JVM. Yes, it makes that big a difference.

Note: Thread stacks are created outside the heap (i.e., outside -Xmx), so this 1000mb is in addition to the -Xmx value you have already assigned. To understand why threads are created outside the heap, you can watch this short video clip.

Our recommendation is to start with a low value (say 256kb) and run thorough regression, performance, and A/B testing with this setting. Only if you experience StackOverflowError should you increase the value; otherwise stick to the low value.

6. -Dsun.net.client.defaultConnectTimeout and -Dsun.net.client.defaultReadTimeout
Modern applications use numerous protocols (e.g., SOAP, REST, HTTP, HTTPS, JDBC, RMI, …) to connect with remote applications. Sometimes remote applications take a long time to respond; sometimes they may not respond at all.

If you don’t have proper timeout settings and remote applications don’t respond fast enough, your application threads/resources will get stuck. Remote application unresponsiveness can affect your application’s availability and can bring your application to a grinding halt. To safeguard your application’s high availability, appropriate timeout settings should be configured.

You can pass these two powerful timeout networking properties at the JVM level; they apply globally to all protocol handlers that use java.net.URLConnection:

sun.net.client.defaultConnectTimeout specifies the timeout (in milliseconds) to establish the connection to the host. For example, for HTTP connections, it is the timeout when establishing the connection to the HTTP server.
sun.net.client.defaultReadTimeout specifies the timeout (in milliseconds) when reading from the input stream when a connection is established to a resource.
For example, if you would like to set both of these properties to 2 seconds:
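-Dsun.net.client.defaultConnectTimeout=2000 -Dsun.net.client.defaultReadTimeout=2000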



Note: by default, the value of these two properties is -1, which means no timeout is set. More details on these properties can be found in this article.

7. -Duser.timezone
Your application might have sensitive business requirements around time and date. For example, if you are building a trading application, you can’t take transactions before 9:30 am. To implement such time/date-related business requirements, you might be using java.util.Date and java.util.Calendar objects. These objects, by default, pick up time zone information from the underlying operating system. This becomes a problem if your application is running in a distributed environment. Look at the scenarios below:

a. If your application is running across multiple data centers, say San Francisco, Chicago, and Singapore, then the JVMs in each data center would end up with different time zones. Thus, the JVMs in each data center would exhibit different behaviors, leading to inconsistent results.

b. If you are deploying your application in a cloud environment, applications could be moved to different data centers without your knowledge. In those circumstances as well, your application would end up producing different results.

c. Your own operations team could also change the time zone without informing the development team. That would skew the results as well.

To avoid this confusion, it’s highly recommended to set the time zone at the JVM level using the -Duser.timezone system property. For example, if you want to set the EDT time zone for your application, you would pass:
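-Duser.timezone=US/Eastern

(US/Eastern is one time zone ID that observes EDT; America/New_York works equally well.)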



Conclusion
In this article, we have attempted to summarize some of the most important JVM arguments and their positive impact. We hope you find it helpful.










3 months ago
In this article we will see how to install and configure the Apache2 web server on Ubuntu 16.04.

Note: Throughout this article, we will refer to the domain name as website1-example.com. Replace this domain name with your actual domain name wherever required.

Step – 1 : Install Apache2 web server
We will begin by updating the local package index to reflect the latest upstream changes. Afterwards, we can install the Apache2 package:
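sudo apt-get update
sudo apt-get install apache2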




Step – 2 : Check web server
Run the below command to make sure the service is running:
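sudo systemctl status apache2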



Now you can access the default apache landing page to confirm that the software is running properly. You can access it through your server’s domain name or IP address.

For example: http://www.website1-example.com

Step – 3 : Create virtual host
In Apache on Ubuntu, all virtual host configuration files are stored under the /etc/apache2/sites-available directory. With a new Apache installation you will find a default virtual host file called 000-default.conf there. We will create a new virtual host configuration file by copying the 000-default.conf file:
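sudo cp /etc/apache2/sites-available/000-default.conf /etc/apache2/sites-available/website1-example.com.conf

(The target file name website1-example.com.conf simply follows the domain name used throughout this article.)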



Open your virtual host file:
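sudo nano /etc/apache2/sites-available/website1-example.com.conf

(nano is used here only as an example; any text editor will do.)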



The file should look like the following:
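On a stock Ubuntu 16.04 installation, the default file typically resembles the following (comments trimmed):

<VirtualHost *:80>
    ServerAdmin webmaster@localhost
    DocumentRoot /var/www/html

    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>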



Now edit this file as per your requirement. My configuration looks like below:
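A configuration along the following lines matches the directives described below; the port 8090 and the /var/www/website1-example.com document root are the values used later in this article, while the admin email address and Options values are illustrative:

<VirtualHost *:8090>
    ServerAdmin admin@website1-example.com
    ServerName website1-example.com
    ServerAlias www.website1-example.com
    DocumentRoot /var/www/website1-example.com

    <Directory /var/www/website1-example.com>
        Options Indexes FollowSymLinks
        AllowOverride None
        Require all granted
    </Directory>

    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>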



• ServerAdmin: The server admin’s email address.
• ServerName: The domain that should match for this virtual host configuration. This should be your domain name, i.e., website1-example.com
• ServerAlias: An additional matching condition that needs to be processed, i.e., http://www.website1-example.com
• DocumentRoot: The directory from which Apache will serve the domain files.
• Options: This directive controls which server features are available in a specific directory.
• ErrorLog, CustomLog: Specify the location of the log files.



Step – 4 : Create project directory
By default, the document root directory is /var/www/html. We will create a website1-example.com directory inside the /var/www directory, as defined in the above virtual host configuration:
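sudo mkdir /var/www/website1-example.com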



Now let’s create a test HTML file called index.html in the root directory we just created in the previous step:
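sudo nano /var/www/website1-example.com/index.html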



    Add the following code to the file and then save it.
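Any simple page will do; for example, a placeholder like this (the wording is illustrative):

<html>
  <head>
    <title>website1-example.com</title>
  </head>
  <body>
    <h1>Success! website1-example.com is working.</h1>
  </body>
</html>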



    Step – 5 : Enable the virtual host
    Enable the virtual host using the a2ensite tool:
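sudo a2ensite website1-example.com.conf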



    Apply the changes to Apache
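sudo systemctl reload apache2

(A full restart, sudo systemctl restart apache2, works as well.)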



Next, open the /etc/hosts file in an editor and add your domain/IP address mapping like below:



    For example:
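127.0.0.1   website1-example.com www.website1-example.com

(Replace 127.0.0.1 with your server's actual IP address if you are editing the hosts file on a different machine.)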



    Save and close the file.

    Step – 6 : Enable CORS
Now we will enable CORS on the Apache2 server. CORS (Cross-Origin Resource Sharing) is a mechanism that uses HTTP headers to tell browsers whether they may access resources from a different origin (domain, protocol, or port).

    Enable headers by typing:
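sudo a2enmod headers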



Open the /etc/apache2/apache2.conf file by typing the following command, and add the cross-origin headers in the <Directory> section:
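sudo nano /etc/apache2/apache2.conf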



    For example:
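<Directory /var/www/>
    Options Indexes FollowSymLinks
    AllowOverride None
    Require all granted
    Header set Access-Control-Allow-Origin "*"
</Directory>

(This assumes the stock <Directory /var/www/> block from Ubuntu's default apache2.conf. Allowing "*" permits any origin; in production you may want to restrict it to specific domains.)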



    Figure : CORS Configuration

    Step – 7 : Enable ports
If you are using a port other than the default port 80, you need to enable that port. In step 3 we configured a virtual host on port 8090, so let’s enable port 8090 in Apache2.

Open the /etc/apache2/ports.conf file and add your port number.

    For example:
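Listen 80
Listen 8090

(Listen 80 is already present by default; the new line here is Listen 8090.)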



    Save and close the file.

    Restart your apache2 service to reflect all changes.
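sudo systemctl restart apache2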


    3 months ago
Do you want to learn the right analysis patterns, tools, and best practices to troubleshoot production performance problems? Do you want to learn how to analyze thread dumps, heap dumps, and GC logs? Do you want to learn from real-world examples that caused outages in major enterprises? Then this is the talk for you.

After this talk, troubleshooting CPU spikes, OutOfMemoryError, response time degradations, and network connectivity issues may no longer stump you.

    3 months ago
Recently we were troubleshooting a popular SaaS application. This application was slowing down intermittently, and to recover from the problem it had to be restarted. The application would sometimes slow down during high-traffic periods and sometimes during low-traffic periods; there was no cohesive pattern.

This cycle of the application slowing down and being restarted had been going on for a while when we were engaged to troubleshoot the problem. We uncovered something interesting and thought you might also benefit from our findings, hence this article.

    Technology Stack

This popular SaaS application was running on the Azure cloud. Below is its technology stack:

  • Spring Framework
  • GlassFish Application Server
  • Java 8
  • Azure cloud


Troubleshooting

When we were informed about this problem, we captured a thread dump from the application right when the slowdown was happening. There are multiple options to capture thread dumps; we chose the 'jstack' tool. Note: it's critical that you obtain the thread dump right when the problem is happening; thread dumps captured outside the problem window aren't useful.

Next we uploaded the captured thread dump to fastThread.io, an online thread dump analysis tool. The tool instantly generated this beautiful report. (We encourage you to click on the hyperlink to see the generated report so that you can get first-hand experience.)

The report instantly narrowed down the root cause of the problem. The fastThread.io tool highlighted that the 'http-nio-8080-exec-121' thread was blocking 134 application threads. Below is the transitive dependency graph showing the BLOCKED threads:



    Fig: fastThread.io showing transitive dependency of the BLOCKED threads

From the graph you can see that 134 application threads are BLOCKED by the 'http-nio-8080-exec-121' thread (the first one from the left side). When we clicked on the 'http-nio-8080-exec-121' hyperlink in the graph, it printed the stack trace of the thread:



    Fig: http-nio-8080-exec-121 obtained org.apache.log4j.Logger lock

Take a close look at the highlighted section of the stack trace. You can see the thread obtaining the org.apache.log4j.Logger lock and then moving forward to write log records into Azure cloud storage.

Now let's take a look at the stack trace of the 'http-nio-8080-exec-56' thread (one of the 134 threads that were BLOCKED):



    Fig: http-nio-8080-exec-56 waiting to obtain org.apache.log4j.Logger lock

Take a look at the highlighted section of the above stack trace. The thread is waiting to acquire the org.apache.log4j.Logger lock. You can see that the 'http-nio-8080-exec-56' thread is in the BLOCKED state because 'http-nio-8080-exec-121' acquired the org.apache.log4j.Logger lock and didn't release it.

All 134 of the blocked threads were stuck waiting for the 'org.apache.log4j.Logger' lock in the same way. Whenever any application thread attempted to log, it got into this BLOCKED state; thus 134 application threads ended up BLOCKED.

We then googled for 'org.apache.log4j.Logger BLOCKED thread' and stumbled upon this interesting defect reported in the Apache Log4j bug database.

It turned out that this is a known bug in the Log4j framework, and it's one of the primary reasons why the new Log4j2 framework was developed. Below is an interesting excerpt from the defect description:

"There is no temporary fix for this issue and is one of the reasons Log4j 2 came about. The only fix is to upgrade to Log4j 2.
…
Yes, I am saying that the code in Log4j 2 is much different and locking is handled much differently. There is no lock on the root logger or on the appender loop."

Due to this bug, any thread that tried to log got into the BLOCKED state, which brought the entire application to a grinding halt. Once the application was migrated from Log4j to the Log4j2 framework, the problem was resolved.

    Conclusion

Log4j reached EOL (end of life) in August 2015 and is no longer supported. If your application is still using the Log4j framework, we highly recommend upgrading to the Apache Log4j2 framework; here is the migration guide. Log4j2 isn't just the next version of the Log4j framework; it's a new framework written from scratch, and it has a lot of performance improvements.
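As a rough illustration of what the migration looks like at the code level (the OrderService class name is illustrative), the logger declaration moves from the Log4j 1.x API to the Log4j 2 API:

// Log4j 1.x style (old):
//   import org.apache.log4j.Logger;
//   private static final Logger logger = Logger.getLogger(OrderService.class);

// Log4j 2 style (new):
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

public class OrderService {
    private static final Logger logger = LogManager.getLogger(OrderService.class);

    public void processOrder(String orderId) {
        logger.info("Processing order {}", orderId);
    }
}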

Along the way, you have also seen how to troubleshoot an unresponsive application.

    5 months ago
There are excellent heap dump analysis tools like Eclipse MAT, JProfiler, and others. These tools are handy when you want to debug/troubleshoot OutOfMemoryError. However, HeapHero has the following unique capabilities which aren't available in those tools:

1. How much memory is wasted?

HeapHero tells you how much memory your application is wasting because of inefficient programming practices. Today, memory is wasted for reasons like:

a. Duplication of strings
b. Overallocation and underutilization of data structures
c. Boxed numbers
d. Several more reasons

You can see HeapHero reporting how much memory is wasted even in a vanilla Pet Clinic Spring Boot application. Other tools don't provide this vital metric.

    2. First cloud application for heap dump analysis

Today's memory profiling tools need to be installed on your desktop or laptop; they can't run in the cloud. HeapHero can run on:

    a. Public cloud (AWS, Azure,..)
    b. Your private data center
    c. Local machine

Your entire organization can install one instance of HeapHero on a central server, and everyone in the organization can upload and analyze heap dumps from that one server.

    3. CI/CD pipeline Integration

As part of their CI/CD pipelines, several organizations do static code analysis using tools like Coverity, Veracode, etc. Using HeapHero, you can do runtime memory analysis. HeapHero provides a REST API that returns a JSON response containing key metrics related to your application's memory utilization. You can invoke this API from your CI/CD pipeline and see whether your code quality is improving or regressing between code commits.

    4. Instant RCA in production

Debugging OutOfMemoryError in production is a tedious and challenging exercise. You can automate the end-to-end analysis of OutOfMemoryError using HeapHero. Say your application's memory consumption goes beyond a certain limit, or it experiences OutOfMemoryError: you can capture heap dumps, analyze them instantly using our REST API, and generate an instant root cause analysis report. Production troubleshooting tools like yCrash leverage the HeapHero REST API to do this analysis for you.

    5. Analyzing heap dumps from remote location

Heap dump files are large in size (several GB). To troubleshoot a heap dump, you normally have to transmit the file from your production server to your local machine and then upload it to your tool. Sometimes the heap dump might be stored or archived on a remote server, in AWS S3 storage, etc. In those circumstances, you would have to download the heap dump from that remote location and then upload it to the tool once again. HeapHero simplifies this process: you can pass the heap dump's remote location URL as input to the HeapHero API or to the web interface directly, and HeapHero will download the heap dump from the remote location and analyze it for you.

    6. Report Sharing & Team collaboration

Sharing heap dumps among a team is a cumbersome process. Finding a proper location to store the heap dump file is the first challenge. The team member with whom you share the file also needs the heap dump analysis tool installed on their local machine in order to open the file and see the analysis report. HeapHero simplifies this process: it gives you a hyperlink like this, which can be embedded in your emails and JIRA tickets and circulated among your team. When a team member clicks on the hyperlink, they can see the entire heap dump analysis report in their browser.

HeapHero also lets you export your heap dump analysis report as a PDF file. This PDF file can also be circulated among your team members.

    7. Analyze large size heap dumps

Several memory profilers are good at analyzing heap dumps of smaller size, but they struggle to analyze large heap dumps. HeapHero is geared to analyze large heap dumps with ease.
    5 months ago