mike jack

Greenhorn
since Oct 21, 2018

Recent posts by mike jack



jstat is a simple command-line utility, bundled with the JDK, that provides JVM performance statistics such as garbage collection and compilation activity. The major strength of jstat is its ability to capture these metrics dynamically while the JVM is running, without any prerequisite instrumentation. What do we mean by that? Say, for example, you want to capture garbage collection statistics: normally you need to pass the below arguments before you start the JVM:
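The snippet is missing from the post; for the classic (pre-Java 9) JVM the GC-logging flags would be along these lines (the log file path and jar name are placeholders):

```sh
# Enable GC logging at startup; without these flags, no GC log is produced
java -XX:+PrintGCDetails -Xloggc:/opt/tmp/gc.log -jar yourApp.jar
```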



This argument will enable GC logs and print them to the specified file path. Suppose you haven’t passed this argument; then GC statistics will not be generated. This is where jstat comes in handy: you can connect to a running JVM dynamically and capture GC and compilation statistics, as shown below.

How to launch jstat?
Execute the below command; it’s the simplest way to launch jstat.
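The command snippet is missing from the post; reconstructed from the option descriptions below, it would be:

```sh
jstat -gc -t 11656 10000 30
```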



-gc: garbage-collection statistics will be printed

-t: the timestamp since the JVM was started will be printed

11656: the target JVM’s process Id

10000: statistics will be printed every 10,000 milliseconds (i.e. 10 seconds)

30: statistics will be printed for 30 iterations. Thus, the above options will cause the JVM to print metrics for 300 seconds (10 seconds x 30 iterations).

(Note: besides -gc, you can pass various other options to generate different data sets. For more details on the different options, refer to the jstat documentation.)


Data generated by jstat
When you launch jstat with the above options, here is the output that will be generated:


             
                                        Fig: jstat output

                     

Timestamp – time since the start time of the target JVM in seconds.

S0C – Survivor 0 region capacity in KB

S1C – Survivor 1 region capacity in KB

S0U – Survivor 0 region utilized space in KB

S1U – Survivor 1 region utilized space in KB

EC – Eden region capacity in KB

EU – Eden region’s utilized space in KB

OC – Old region capacity in KB

OU – Old region’s utilized space in KB

MC – Metaspace region capacity in KB

MU – Metaspace region utilized space in KB

CCSC – Compressed Class space regions capacity in KB

CCSU – Compressed Class space regions utilized space in KB

YGC – Number of Young GC events that have occurred so far

YGCT – Amount of Young GC time spent so far

FGC – Number of Full GC events that have occurred so far

FGCT – Amount of Full GC time spent so far

GCT – total amount of GC time spent so far (basically YGCT + FGCT)

How to interpret jstat output?
Equipped with this information, let’s interpret the first line printed by the jstat tool in the above example:
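The screenshot is missing here; a hypothetical output line (values invented purely for illustration, columns as defined above) would look like this:

```
Timestamp  S0C    S1C    S0U  S1U    EC      EU     OC      OU      MC      MU      CCSC   CCSU   YGC YGCT  FGC FGCT  GCT
157.1      5120.0 5120.0 0.0  2816.0 33280.0 6468.1 43520.0 19114.3 51712.0 50278.1 5888.0 5583.7 20  0.086 4   0.250 0.336
```

Reading it with the legend above: about 157 seconds after JVM startup, survivor 1 holds 2816 KB, Eden uses 6468.1 KB of its 33280 KB capacity, the old region uses 19114.3 KB of 43520 KB, and 20 young GCs plus 4 full GCs have run for 0.336 seconds of total GC time.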



                                         Fig: jstat output’s first line

Tool to analyze jstat output
One challenge with jstat is that you need to analyze the generated statistics manually. This is tedious; as you can see, interpreting even a single line takes quite a long time. You can instead use the GCeasy tool, which can parse jstat output and generate insightful graphs and metrics.


Read more here: https://blog.gceasy.io/2019/11/18/jstat-analysis/








8 months ago
One of the most widely wasted resources in the world today is memory. Due to inefficient programming, a surprising (sometimes ‘shocking’) amount of memory is wasted. We see this pattern repeated in several enterprise applications. To prove this case, we conducted a small study: we analyzed the famous Spring Boot Pet Clinic application to see how much memory it wastes. This application has been designed by the community to show how the Spring application framework can be used to build simple but powerful database-oriented applications.

Environment

*Spring Boot 2.1.4.RELEASE
*Java SDK 1.8
*Tomcat 8.5.20
*MySQL 5.7.26 with MySQL Connector/J 8.0.15

Stress Test

We used Apache JMeter, a popular open-source load testing tool, to conduct our stress test. We executed the load test for 30 minutes with the below settings:

*Number of Threads (Users) – 1000 (the number of users connecting to the target)
*Ramp-Up Period (in seconds) – the time frame over which all requests start. As per our configuration, 1 new thread starts every 0.01 second, i.e. 100 threads/second.
*Loop Count – these 1000 threads perform test iterations back-to-back.
*Duration (seconds) – after ramp-up, the 1000 threads run continuously for 1800 seconds.


       Fig: Jmeter settings

We were exercising the following scenarios in our load test:

*Add a new pet owner to the system.
*View information pertaining to a pet owner.
*Add a new pet to a system.
*View information pertaining to a pet.
*Add information pertaining to a visit to a pet’s visitation history.
*Update the information pertaining to a pet.
*Update the information pertaining to a pet owner.
*View owner information by searching his name.
*View information of all owners.

How to measure memory wastage?

The industry has hundreds of tools that show the amount of memory used, but we seldom come across tools that can measure the amount of memory wasted due to inefficient programming. HeapHero is a simple tool that analyzes your heap dumps and tells you how much memory is wasted due to inefficient programming.

We captured a heap dump from the Spring Boot Pet Clinic application while the test was running. (There are 7 different options to capture heap dumps from Java/Android applications; you can choose the option that is convenient for you.)

We uploaded the captured heap dump to the HeapHero tool, which generated a report showing that 65% of memory is wasted due to inefficient programming. Yes: this simple vanilla application, which is supposed to have all best practices implemented, built on a highly celebrated framework, is wasting 65% of its memory.


Fig: Chart generated by HeapHero, showing 65% of memory is wasted by Spring Boot pet clinic application

Analyzing Memory wastage

From the report you can notice the following:

*15.6% of memory is wasted due to duplicate strings
*14.6% of memory is wasted due to inefficient primitive arrays
*14.3% of memory is wasted due to duplicate primitive arrays
*12.1% of memory is wasted due to inefficient collections

Duplicate Strings

The top reason for memory wastage in this Spring Boot application (and in most enterprise applications) is duplicate strings. The report shows how much memory is wasted by duplicate strings, what those strings are, who is creating them, and how to optimize them.


                                      Fig: Duplicate Strings


You can notice that 15.6% of memory is wasted due to duplicate strings. Please note

*The ‘Goldi’ string has been created 207,481 times.
*The ‘Visit’ string has been created 132,308 times. ‘Visit’ was the description we mentioned in the test script.
*The ‘Bangalore’ string has been created 75,374 times. ‘Bangalore’ is the name of the city we specified in the test script.
*‘123123123’ has been created 37,687 times.
*The ‘Mahesh’ string has been created 37,687 times.

Apparently ‘Goldi’ is the name of the pet that was entered on the screen through the test script, ‘Visit’ was the description entered on the screen, and similarly for the other values. But the question is: why are these same string objects created so many thousands of times?

We all know that strings are immutable (i.e. once created, they can’t be modified). Given that, why are so many thousands of duplicate strings created?

HeapHero tool also reports the code path where these duplicate strings are created.


        Fig: Codepath from where duplicate strings are originating


Here are the high-level recommendations to fix duplicate strings in your application. You can employ whichever strategies are applicable to your application.

Inefficient Collections

Another primary reason for memory wastage in the Spring Boot Pet Clinic application is inefficient use of collections. Below is an excerpt from the HeapHero report:



       Fig: memory wasted due to inefficient collections


You can notice that 99% of the LinkedHashSet instances in memory don’t have any elements in them. If there are no elements, why even create the LinkedHashSet? When you create a new LinkedHashSet object, space for 16 elements is reserved in memory, and all the space reserved for those 16 elements is wasted. If you lazily initialize the LinkedHashSet, this problem doesn’t arise.

Bad Practice:



Good Practice:
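The code snippets are missing from the post; a sketch of the two patterns (class and field names are hypothetical) might look like this:

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class LazyInitExample {
    // Bad practice: the set is allocated eagerly, so its default
    // capacity (16 elements, as described above) is reserved even
    // if no element is ever added.
    private final Set<String> eagerTags = new LinkedHashSet<>();

    // Good practice: leave the field null and allocate on first use.
    private Set<String> lazyTags;

    public void addTag(String tag) {
        if (lazyTags == null) {              // lazy initialization
            lazyTags = new LinkedHashSet<>();
        }
        lazyTags.add(tag);
    }

    public Set<String> getLazyTags() {
        return lazyTags;
    }
}
```

With the lazy version, no LinkedHashSet (and none of its reserved capacity) exists until the first element actually arrives.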



Similarly, another observation: 68% of the ArrayList instances contain only 1 element. When you create an ArrayList object, space for 10 elements is reserved in memory. That means in 68% of the ArrayLists, the space for 9 elements is wasted. If you initialize the ArrayList with the right capacity, this problem can be avoided.

Bad Practice: initializing collections with the default capacity.
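The snippet is missing from the post; a sketch of both patterns (method names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

public class CapacityExample {
    // Bad practice: the default constructor sizes the backing array
    // for 10 elements on first add, wasting 9 slots when only one
    // element is ever stored.
    static List<String> withDefaultCapacity(String only) {
        List<String> list = new ArrayList<>();
        list.add(only);
        return list;
    }

    // Good practice: size the list for what it will actually hold.
    static List<String> withExactCapacity(String only) {
        List<String> list = new ArrayList<>(1);
        list.add(only);
        return list;
    }
}
```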



Read More here: https://blog.gceasy.io/2019/11/06/memory-wasted-by-spring-boot-application/







8 months ago
Hi RJ,
This will give a real scenario of how Jenkins got rid of their performance issues by optimising their applications using GCeasy and fastThread tools. Really got a lot of information from here: https://tinyurl.com/jwperformance
11 months ago
I came across a tool named GCeasy, a universal GC log analyzer that has built-in intelligence to auto-detect problems in JVM & Android GC logs and recommend solutions to them. You can quickly detect memory leaks, long GC pauses, premature object promotions, plus many other performance-impacting problems.

In this modern world, garbage collection logs are still analyzed in a tedious, manual mode: you have to get hold of an operations engineer, who mails you the application's GC logs; then you upload the logs to a GC analysis tool; then you apply your own intelligence to analyze them. There is no programmatic way to analyze garbage collection logs in a proactive manner. Given this tediousness, it's impossible to analyze hundreds of JVMs' GC logs in one stroke. Thus, to eliminate this hassle, we have developed a RESTful API to analyze garbage collection logs. With one line of code you can get all your GC logs analyzed instantly. Learn how to use the GCeasy REST API

Here are some sample reports generated by GCeasy:
1. G1 GC Report
2. Memory Leak Report

1 year ago
Long GC pauses are undesirable for applications. They affect your SLAs, result in poor customer experiences, and cause severe damage to mission-critical applications. Thus, in this article I have laid out the key reasons that can cause long GC pauses and potential solutions to them.

1. High Object Creation Rate
If your application’s object creation rate is very high, then to keep up with it, the garbage collection rate will also be very high. A high garbage collection rate will increase GC pause times as well. Thus, optimizing the application to create fewer objects is THE effective strategy to reduce long GC pauses. This might be a time-consuming exercise, but it is 100% worth doing. To optimize the object creation rate in the application, you can consider using Java profilers like JProfiler, YourKit, or JVisualVM. These profilers will report:

• What objects are created?
• What is the rate at which these objects are created?
• What is the amount of space they are occupying in memory?
• Who is creating them?

Always try to optimize the objects which occupy the most amount of memory. Go after big fish in the pond.

Upload your GC log to the universal garbage collection log analyzer tool GCeasy. This tool will report the object creation rate: there is a field named ‘Avg creation rate’ in the ‘Object Stats’ section. Strive to keep this value low. See the image (an excerpt from the GCeasy-generated report), showing an ‘Avg creation rate’ of 8.83 mb/sec.


         Tit-bit: How to figure out object creation rate?

2. Undersized Young Generation
When the young generation is undersized, objects will be prematurely promoted to the old generation. Collecting garbage from the old generation takes more time than collecting it from the young generation. Thus, increasing the young generation size has the potential to reduce long GC pauses. The young generation can be increased by setting either of these two JVM arguments:

-Xmn: specifies the size of the young generation

-XX:NewRatio: specifies the ratio between the old and young generation. For example, setting -XX:NewRatio=3 means that the ratio between the old and young generation is 3:1, i.e. the young generation will be one-fourth of the overall heap; if the heap size is 2 GB, then the young generation size would be 0.5 GB.
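As concrete launch lines (the sizes and jar name are placeholders), either argument would be passed like this:

```sh
# Option 1: set the young generation size explicitly
java -Xmn512m -jar yourApp.jar

# Option 2: set the old:young ratio (here young = 1/4 of a 2 GB heap)
java -XX:NewRatio=3 -Xmx2g -jar yourApp.jar
```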

3. Choice of GC Algorithm
The choice of GC algorithm has a major influence on GC pause time. Unless you are a GC expert, intend to become one, or have a GC expert on your team, tuning GC settings to obtain optimal pause times is hard. If you don’t have GC expertise, then I would recommend using the G1 GC algorithm because of its auto-tuning capability. In G1 GC, you can set a GC pause-time goal using the system property ‘-XX:MaxGCPauseMillis.’ Example:
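The example flags are missing from the post; presumably something like (the jar name is a placeholder):

```sh
java -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -jar yourApp.jar
```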



As per the above example, the maximum GC pause time is set to 200 milliseconds. This is a soft goal, which the JVM will try its best to meet. If you are already using the G1 GC algorithm and are still experiencing high pause times, then refer to this article.

4. Process Swapping
Sometimes, due to lack of memory (RAM), the operating system could be swapping your application out of memory. Swapping is very expensive, as it requires disk access, which is much slower than physical memory access. In my humble opinion, no serious application in a production environment should be swapping. When a process swaps, GC will take a long time to complete.

Below is a script obtained from Stack Overflow (thanks to the author) which, when executed, will show all the processes that are being swapped. Please make sure your process is not getting swapped.
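The script itself is missing from the post; the widely shared Stack Overflow snippet it refers to walks the /proc filesystem and prints each process’s name next to its swapped-out memory (a Linux-only sketch):

```shell
#!/bin/sh
# For every process, print its name and VmSwap (swapped-out memory),
# sorted so the heaviest swappers appear first.
for f in /proc/*/status; do
    awk '/^Name|^VmSwap/ {printf "%s %s ", $2, $3} END {print ""}' "$f" 2>/dev/null
done | sort -k 2 -n -r | head -20
```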



If you find your process to be swapping then do one of the following:

a. Allocate more RAM to the server
b. Reduce the number of processes that are running on the server, so that memory (RAM) is freed up.
c. Reduce the heap size of your application (which I wouldn’t recommend, as it can cause other side effects).

5. Less GC Threads
For every GC event reported in the GC log, user, sys and real time are printed. Example:
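The sample line is missing from the post; a typical (hypothetical) entry looks like this:

```
[Times: user=0.25 sys=0.02, real=0.07 secs]
```

Here ‘real’ (0.07 s) is well below ‘user’ (0.25 s), which is what you expect when several GC threads work in parallel.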



To know the difference between each of these times, please read this article (I highly encourage you to read it before continuing this section). If in the GC events you consistently notice that ‘real’ time isn’t significantly less than ‘user’ time, it might indicate that there aren’t enough GC threads; consider increasing the GC thread count. Suppose ‘user’ time is 25 seconds and you have configured the GC thread count to be 5; then ‘real’ time should be close to 5 seconds (because 25 seconds / 5 threads = 5 seconds).

WARNING: adding too many GC threads will consume a lot of CPU and take resources away from your application. Thus, you need to conduct thorough testing before increasing the GC thread count.

6. Background IO Traffic
Heavy file-system I/O activity (i.e. a lot of reads and writes) can also cause long GC pauses. This heavy I/O activity may not even be caused by your application; it may be caused by another process running on the same server, yet it can still cause your application to suffer from long GC pauses. Here is a brilliant article from LinkedIn engineers which walks through this problem in detail.

When there is a heavy I/O activity, you will notice the ‘real’ time to be significantly more than ‘user’ time. Example:
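The sample is missing from the post; a hypothetical line illustrating the pattern:

```
[Times: user=0.20 sys=0.01, real=18.45 secs]
```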



Read More here..



1 year ago


java.lang.VirtualMachineError is thrown when the Java virtual machine encounters an internal error or resource limitation that prevents it from functioning. It’s a self-defensive mechanism employed by the JVM to prevent the entire application from crashing. In this article, let’s discuss the different types of VirtualMachineError, their characteristics, the reasons they get triggered, and potential solutions to fix them.

Types of VirtualMachineError
There are four different types of VirtualMachineError:

a. OutOfMemoryError

b. StackOverflowError

c. InternalError

d. UnknownError

Let’s review these types in detail in this section.


                   Fig: Java Throwable class hierarchy

# a. OutOfMemoryError
Just like the OMG (Oh My God) acronym, OOM (OutOfMemoryError) is quite popular in the DevOps community :-). Most DevOps engineers think that there is just one OutOfMemoryError, but there are 8 different flavors of it:

1. java.lang.OutOfMemoryError: Java heap space

2. java.lang.OutOfMemoryError: GC Overhead limit exceeded

3. java.lang.OutOfMemoryError: Requested array size exceeds VM limit

4. java.lang.OutOfMemoryError: Permgen space

5. java.lang.OutOfMemoryError: Metaspace

6. java.lang.OutOfMemoryError: Unable to create new native thread

7. java.lang.OutOfMemoryError: Kill process or sacrifice child

8. java.lang.OutOfMemoryError: reason stack_trace_with_native_method

Each flavor is triggered for different reasons. Similarly, solutions are also different for each flavor of OutOfMemoryError. Here is a beautiful One-page document that summarizes all different flavors of OutOfMemoryError, their causes and solutions.

In general, OutOfMemoryError can be diagnosed and fixed by analyzing garbage collection logs and heap dumps. Since analyzing garbage collection logs manually can be tedious, you may consider using free tools like GCeasy, HP JMeter, or IBM GC analyzer. Similarly, to analyze heap dumps you may consider free tools like HeapHero or Eclipse MAT.

# b. StackOverflowError
A thread’s stack stores information about the methods it is executing: primitive data-type values, local variables, object pointers, and return values. All of them consume memory. If a thread’s stack size grows beyond the allocated memory limit, java.lang.StackOverflowError is thrown. This problem typically happens when a thread recursively invokes the same function again and again as a result of a bug in the program. More details on how to debug StackOverflowError and all possible solutions to fix it can be found in this article.
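A minimal illustration of the recursion bug described above (a hedged sketch; catching an Error like this is only sensible in a demo):

```java
public class StackOverflowDemo {
    static int depth = 0;

    // Buggy recursion with no base case: every call adds a stack
    // frame until the thread's stack limit is exceeded.
    static void recurse() {
        depth++;
        recurse();
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            System.out.println("StackOverflowError after " + depth + " calls");
        }
    }
}
```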

# c. InternalError
java.lang.InternalError is thrown by JVM when there is a:

a. Fault in the software implementing the virtual machine,

b. Fault in the underlying host system software

c. Fault in the hardware.

You will rarely encounter InternalError. To understand what specific scenarios may cause it, you can search for the ‘InternalError’ string in Oracle’s Java bug database. At the time of writing this article (Dec 20, 2018), there are only around 200 defects reported for this error, and most of them are fixed.

# d. UnknownError
java.lang.UnknownError is thrown when an exception or error has occurred but the Java virtual machine is unable to report the actual exception or error. You will seldom see UnknownError; in fact, when searching for ‘UnknownError’ in the Oracle Java bug database at the time of writing this article (Dec 20, 2018), only 2 defects were found.

Characteristics
VirtualMachineError has a couple of primary characteristics:

1. Unchecked Exception
2. Synchronous & asynchronous delivery

Let’s discuss these two characteristics in this section.

(1). Unchecked Exception
There are two types of Exceptions:

1. Checked exceptions
2. Unchecked exceptions

Exceptions that are checked at compile time are called checked exceptions. If a method in your code throws a checked exception, then the method must either handle the exception or declare it using the throws keyword. Examples of checked exceptions are IOException, SQLException, DataAccessException, ClassNotFoundException, …

Unchecked exceptions do not have this requirement. They don’t have to be caught or declared thrown. All types of VirtualMachineError are unchecked exceptions.
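A minimal sketch of the difference (the method names are hypothetical):

```java
import java.io.FileReader;
import java.io.IOException;

public class ExceptionKinds {
    // Checked: the compiler forces callers to catch IOException
    // or declare it with 'throws'.
    static void readFile(String path) throws IOException {
        new FileReader(path).close();
    }

    // Unchecked: RuntimeException (and Error, including all
    // VirtualMachineErrors) needs no 'throws' clause.
    static void mayFailUnchecked() {
        throw new IllegalStateException("no declaration required");
    }
}
```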

(2). Synchronous & asynchronous Delivery
Exceptions can be thrown in two modes:

1. Synchronous
2. Asynchronous
Synchronous exceptions happen at a specific program statement, no matter how many times the program is executed in a similar environment. Examples of synchronous exceptions are NullPointerException, ArrayIndexOutOfBoundsException, etc.

Asynchronous exceptions can happen at any point in time and in any part of the program; there is no consistency in where they can be thrown. All VirtualMachineErrors can be thrown asynchronously, but sometimes they are also thrown synchronously. StackOverflowError may be thrown synchronously by method invocation, as well as asynchronously due to native method execution or Java virtual machine resource limitations. Similarly, OutOfMemoryError may be thrown synchronously during object creation, array creation, class initialization, and boxing conversion, as well as asynchronously.




1 year ago
Diagnosing and troubleshooting CPU problems in production, especially in a cloud environment, can be tricky and tedious. Your application might have millions of lines of code; trying to identify the exact line of code that is causing the CPU to spike can be the equivalent of finding a needle in a haystack. In this article, let’s learn how to find that needle (i.e. the CPU-spiking line of code) in a matter of seconds/minutes.

To help readers better understand this troubleshooting technique, we built a sample application and deployed it into an AWS EC2 instance. Once this application was launched, it caused CPU consumption to spike up to 199.1%. Now let’s walk through the steps we followed while troubleshooting this problem. Basically, there are 3 simple steps:

1. Identify threads that consume CPU
2. Capture thread dumps
3. Identify the lines of code that are causing the CPU to spike up

1. Identify threads that are causing CPU to spike
In the EC2 instance, multiple processes could be running. The first step is to identify the process that is causing the CPU to spike up. The best way to do this is to use the ‘top’ command that is present in *nix flavors of operating systems.

Issue command ‘top’ from the console
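(The snippet is missing from the post; it is simply:)

```sh
top
```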



This command displays all the processes running in the EC2 instance, sorted so that the highest CPU-consuming processes appear at the top. When we issued the command in the EC2 instance, we saw the below output:


 Fig:‘top’ command issued from an AWS EC2 instance

From the output, you can notice process 31294 consuming 199.1% of CPU. That’s pretty high consumption. OK, now we have identified the process in the EC2 instance that is causing the CPU to spike up. The next step is to identify the threads within this process that are causing the CPU to spike up.

Issue the command ‘top -H -p {pid}’ from the console. Example:
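With the process Id found above, the concrete command would be:

```sh
top -H -p 31294
```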



This command displays all the threads that are causing the CPU to spike up within this particular process, 31294. When we issued this command in the EC2 instance, we saw the below output:


 Fig:‘top -H -p {pid}’ command issued from an AWS EC2 instance

From the output you can notice:

1. Thread Id 31306 consuming 69.3% of CPU
2. Thread Id 31307 consuming 65.6% of CPU
3. Thread Id 31308 consuming 64.0% of CPU
4. All remaining threads consume a negligible amount of CPU.

This is a good step forward, as we have identified the threads that are causing the CPU to spike. As the next step, we need to capture thread dumps so that we can identify the lines of code that are causing the CPU to spike up.

2. Capture thread dumps
A thread dump is a snapshot of all threads present in the application. For each thread, it reports the thread state, stack trace (i.e. the code path the thread is executing), thread Id, and related information.

There are 8 different options to capture thread dumps; you can choose the option that is convenient for you. One of the simplest is to use the ‘jstack’ tool, which is packaged with the JDK and can be found in the $JAVA_HOME/bin folder. Below is the command to capture a thread dump:
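The command template is missing from the post; presumably:

```sh
jstack -l <pid> > <file-path>
```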



where

pid: the process Id of the application whose thread dump should be captured

file-path: the file path where the thread dump will be written to.

Example:
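The example is missing from the post; based on the process Id and file path mentioned in this post, it would be:

```sh
jstack -l 31294 > /opt/tmp/threadDump.txt
```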



As per the example, thread dump of the process would be generated in /opt/tmp/threadDump.txt file.

3. Identify the lines of code that are causing the CPU to spike up
The next step is to analyze the thread dump to identify the lines of code that are causing the CPU to spike up. We recommend analyzing thread dumps with fastThread, a free online thread dump analysis tool.

We uploaded the captured thread dump to the fastThread tool, which generated a visual report with multiple sections. In the top-right corner of the report there is a search box; there we entered the Ids of the threads that were consuming high CPU, i.e. the thread Ids we identified in step #1: 31306, 31307, 31308.

The fastThread tool displayed the stack traces of all 3 threads as shown below.


    Fig: fastThread tool displaying CPU consuming thread

You can notice that all 3 threads are in RUNNABLE state and executing this line of code:



Apparently, the following is the application source code:
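The listing is missing from the post; a hypothetical reconstruction matching the post’s description (a non-terminating loop at line 11, ‘doSomething();’ at line 13):

```java
public class Object1 {
    public static void main(String[] args) {
        while (true) {        // line 11: non-terminating loop
            doSomething();    // line 13: invoked an infinite number of times
        }
    }

    public static void doSomething() {
        // does nothing, but the tight loop above keeps the CPU busy
    }
}
```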



You can see that line #13 in object1.java is ‘doSomething();’. The ‘doSomething()’ method does nothing, but it is invoked an infinite number of times because of the non-terminating while loop in line #11. If a thread loops an infinite number of times, the CPU will start to spike; that is exactly what is happening in this sample program. If the non-terminating loop in line #11 is fixed, this CPU spike problem will go away.

Conclusion
To summarize: first use the ‘top’ tool to identify the thread Ids that are causing the CPU to spike up, then capture thread dumps, and finally analyze the thread dumps to identify the exact lines of code that are causing the CPU to spike up. Enjoy troubleshooting, happy hacking!





1 year ago

Not all stories need to be success stories; reality is not like that either. We would like to share a true, disappointing story (but a phenomenal learning experience) that may be beneficial to you.

This is a story about optimizing the memory utilization of a web application. This application was configured with a lot of memory (4GB) just to service a handful of transactions/sec. Thus, we set out to study the memory utilization patterns of this application. We captured heap dumps using the ‘jmap’ tool and uploaded them to the HeapHero tool. HeapHero is a heap dump analysis tool, just like Eclipse MAT, JProfiler, and YourKit. It profiled the memory and provided statistics on total classes, total objects, heap size, and a histogram view of the large objects residing in memory. On top of these traditional metrics, HeapHero reported the total amount of memory wasted due to inefficient programming practices. In modern computing, a considerable amount of memory is wasted because of inefficient programming practices such as duplicate object creation, suboptimal data-type definitions (declaring ‘double’ and assigning only ‘float’ values), and over-allocation and underutilization of data structures, among several others.

This application was no exception. HeapHero reported that the application is wasting 56% of its memory due to inefficient programming practices. Yes, an eyebrow-raising 56%. It reported that 30% of the application’s memory is wasted because of duplicate strings.


    Fig:HeapHero tool reporting amount of memory wasted due to inefficient programming

String Deduplication
Since Java 8 update 20, a new JVM argument ‘-XX:+UseStringDeduplication’ has been available. When an application is launched with this argument, the JVM will eliminate duplicate strings from the application’s memory during garbage collection. However, please be advised that ‘-XX:+UseStringDeduplication’ works only with the G1 GC algorithm, which you can activate by passing ‘-XX:+UseG1GC’.
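As a concrete launch line (the jar name is a placeholder), the two arguments would be passed together like this:

```sh
java -XX:+UseG1GC -XX:+UseStringDeduplication -jar yourApp.jar
```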

We got excited. We thought just by introducing ‘-XX:+UseG1GC -XX:+UseStringDeduplication’ JVM argument, we would be able to save 30% of memory without any code refactoring. Wow, isn’t it wonderful? To verify this theory, we conducted two different tests in our performance lab:

Test 1: Passing ‘-XX:+UseG1GC’

Test 2: Passing ‘-XX:+UseG1GC -XX:+UseStringDeduplication’

We enabled garbage collection logs on the application to study its memory usage pattern, and analyzed them using the free online garbage collection log analysis tool GCeasy. We were hoping that in test run #2 we would see a 30% reduction in memory consumption because of the elimination of duplicate strings. However, the reality was quite different: we didn’t see any difference in memory usage. Both test runs consistently showed the same amount of memory utilization. See the heap usage graphs generated by the GCeasy tool from the garbage collection logs.


        Fig: GCeasy Heap usage graph with ‘-XX:+UseG1GC’


       Fig: GCeasy heap usage graph with ‘-XX:+UseG1GC -XX:+UseStringDeduplication’

In test run #1, heap usage hovered around 1500mb throughout the test; in test run #2, heap usage also hovered around 1500mb. Disappointingly, we didn’t see the anticipated 30% reduction in memory usage, despite introducing the ‘-XX:+UseG1GC -XX:+UseStringDeduplication’ JVM arguments.

Why wasn’t there a reduction in heap usage?
‘Why wasn’t there a reduction in heap usage?’ – this question really puzzled us. Did we configure the JVM arguments correctly? Doesn’t ‘-XX:+UseStringDeduplication’ do its job? Is the analysis report from the GCeasy tool correct? All these questions troubled our sleep. After detailed analysis, we figured out the bitter truth: apparently ‘-XX:+UseStringDeduplication’ eliminates duplicate strings that are present in the old generation of memory only; it will not eliminate duplicate strings in the young generation. Java memory has 3 primary regions: young generation, old generation, and Metaspace. Newly created objects go into the young generation; objects that survive for a longer period are promoted into the old generation; JVM-related objects and metadata are stored in Metaspace. In other words, ‘-XX:+UseStringDeduplication’ only removes duplicate strings that live for a longer period. Since this is a web application, most of its string objects were created and destroyed immediately. This was very clear from the following statistics reported in the GCeasy log analysis report:


        Fig: Object creation/promotion stats reported by Gceasy

The average object creation rate of this application is 44.93 mb/sec, whereas the average promotion rate (i.e. from young generation to old generation) is only 918 kb/sec. This indicates that only a very small percentage of objects are long-lived, and even among the 918 kb/sec of promoted objects, strings are only a small portion. Thus, the amount of duplicate strings removed by ‘-XX:+UseStringDeduplication’ was negligible, and sadly we didn’t see the expected reduction in memory.

Conclusion
(a). ‘-XX:+UseStringDeduplication’ is useful only if the application has a lot of long-lived duplicate strings. It won’t yield fruitful results for applications where the majority of objects are short-lived. Unfortunately, in most modern web applications and microservice applications, objects are short-lived.

(b). Another option often recommended in the industry to eliminate duplicate strings is the String#intern() function. However, String#intern() isn’t going to be useful for this application, because with String#intern() you still end up creating the string objects and only then eliminating them. If a string is short-lived by nature, you don’t need this step, as the regular garbage collection process will eliminate it. String#intern() can also add a (very small) latency and CPU overhead to the transaction.

(c). Given the current situation, the best way to eliminate duplicate strings from the application is to refactor the code so that duplicate strings are not even created. HeapHero points out the code paths where a lot of duplicate strings are created. Using those pointers, we are going to continue our journey of refactoring the code to reduce memory consumption.
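As one sketch of that refactoring direction (class name and cached value are illustrative), repeated values can be routed through a small canonicalizing cache so that equal strings share a single instance instead of being duplicated:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class StringCanonicalizer {
    // Maps each distinct string value to the first instance seen with that value.
    private final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();

    public String canonical(String s) {
        String prior = cache.putIfAbsent(s, s); // returns the existing instance, or null if s is new
        return prior != null ? prior : s;
    }

    public static void main(String[] args) {
        StringCanonicalizer c = new StringCanonicalizer();
        String a = c.canonical(new String("webservices.sabre.com"));
        String b = c.canonical(new String("webservices.sabre.com"));
        System.out.println(a == b); // true: the second call reuses the first instance
    }
}
```

Unlike the string pool, such a cache can be bounded or cleared by the application, which avoids retaining values longer than needed.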

 
       
1 year ago

Garbage collection has a more profound impact on the application than most engineers think. In order to optimize memory and garbage collection settings and to troubleshoot memory-related problems, one has to analyze garbage collection logs.

Enabling GC logs
GC logging can be enabled by passing the below-mentioned system properties during application startup.

Until Java 8:

Below is the system property that is supported by all versions of Java up to JDK 8.
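In practice the classic flags look like this (the log file path and jar name are placeholders):

```shell
# Java 8 and earlier: classic GC logging flags
java -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/myapp/gc.log -jar myapp.jar
```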



From Java 9:

Below is the system property that is supported by all versions of Java starting from JDK 9.
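With unified JVM logging (introduced in JDK 9), the equivalent invocation looks like this (again, path and jar name are placeholders):

```shell
# Java 9 and later: -Xlog unified logging replaces the old GC flags
java -Xlog:gc*:file=/var/log/myapp/gc.log:time,uptime,level,tags -jar myapp.jar
```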



How to analyze GC logs?
Here is a sample GC log generated when the above system properties were passed:



A GC log has rich information; however, understanding the GC log is not easy. There isn’t sufficient documentation to explain the GC log format. On top of that, the GC log format is not standardized. It varies by JVM vendor (Oracle, IBM, HP, Azul, …), Java version (1.4, 5, 6, 7, 8, 9), GC algorithm (Serial, Parallel, CMS, G1, Shenandoah), and the GC system properties that you pass (-XX:+PrintGC, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintHeapAtGC …). Based on these permutations and combinations, there are easily 60+ different GC log formats.

Thus, to analyze GC logs, it’s highly recommended to use GC log analysis tools such as GCeasy or HPJmeter. These tools parse GC logs, generate great graphical visualizations of the data, report Key Performance Indicators, and compute several other useful metrics.



1 year ago
In the early 1970s, 1 MB of memory cost 1 million dollars. Now 1 MB costs a tiny fraction of that. There is no comparison. This is one of the reasons why engineers and enterprises don’t worry about memory any more. 1 million dollars in the 1970s would be equivalent to several million dollars today. Thus, back in the day, memory was treated very preciously. This preciousness is vividly described in the book ‘Idea Man’ – the autobiography of Paul Allen (Microsoft co-founder). Paul Allen talks about the challenge he and Bill Gates faced in writing the BASIC programming language (Microsoft’s very first product) in under 4 KB.

There are 4 primary computing resources:

1. CPU
2. Memory
3. Network
4. Storage

Your application might be running on tens or thousands of application server instances. Of the above-mentioned 4 computing resources, which resource does your application instance saturate first? Pause for a moment here, before reading further, and give it a thought.

For most applications it is *memory*. CPU is usually at 30 – 60%. There is always abundant storage. It’s hard to saturate the network (unless your application is streaming video content). Thus, for most applications, it’s the memory that gets saturated first. Even though CPU, storage, and network are underutilized, just because memory is getting saturated, you end up provisioning more and more application server instances, increasing the computing cost of organizations by millions or billions of dollars.

On the other hand, almost without exception, modern applications waste anywhere from 30 – 90% of memory due to inefficient programming practices. Below are 9 different practices followed in the industry that cause memory wastage:

1. Duplicate strings
2. Inefficient collections
3. Duplicate objects
4. Duplicate arrays
5. Inefficient arrays
6. Objects waiting for finalization
7. Boxed numbers
8. Overhead caused by object headers
9. Wrong memory size settings
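To make one of these concrete, here is a small sketch of practice 7 (boxed numbers); the byte sizes in the comments are typical HotSpot figures, not guarantees:

```java
import java.util.ArrayList;
import java.util.List;

public class BoxedNumbersDemo {
    public static void main(String[] args) {
        // Each element below is autoboxed into a full Integer object
        // (object header + 4-byte payload, typically 16 bytes) plus a reference in the list.
        List<Integer> boxed = new ArrayList<>();
        for (int i = 0; i < 1_000; i++) boxed.add(i);

        // A primitive array stores the same data at 4 bytes per element.
        int[] primitive = new int[1_000];
        for (int i = 0; i < 1_000; i++) primitive[i] = i;

        System.out.println(boxed.size() == primitive.length); // true: same data, several times the memory
    }
}
```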

If you can eliminate this 30 – 90% of memory wastage, it will bring 2 major advantages to your enterprise:

(a). Reduce computing cost:
As memory is the first resource to get saturated, if you can reduce memory consumption, you will be able to run your application on a smaller number of server instances. You might be able to reduce the number of servers by 30 – 40%. That means your management can cut 30 – 40% of the datacenter (or cloud hosting) cost, along with maintenance and support costs. This can amount to several millions of dollars in savings.

(b). Better customer experience:
If you reduce the number of objects created to service an incoming request, your response time will get a lot better. Since fewer objects are created, fewer CPU cycles are spent creating and garbage collecting them. The reduction in response time gives a much better customer experience.

How much memory is my application wasting?
So now let’s get back to the original question of this article: ‘How much memory is my application wasting?’. The sad story is: there aren’t many tools in the industry that can give you this answer. There are tools that can answer the question ‘How much memory is my application using?’ – top, Application Performance Monitoring (APM) tools, Nagios, and so on. But ‘using’ is different from ‘wasting’. We will help you answer this question. You just need to follow these 2 steps:

Step 1: Capture Heap Dumps
In order to analyze how memory is wasted, you first need to capture a heap dump. A heap dump is basically a snapshot of your application’s memory. It contains information about all the objects present in memory and who is referencing them.

There are 7 options to capture heap dumps from your Java application. https://blog.heaphero.io/2017/10/13/how-to-capture-java-heap-dumps-7-options/

There are 3 options to capture heap dumps from android application. https://blog.heaphero.io/2018/06/04/how-to-capture-heap-dump-from-android-app-3-options/

You can use the option that is best suited for you.
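For instance, one commonly used option among those is the jmap utility that ships with the JDK (the dump path and process id below are placeholders):

```shell
# Locate the target JVM's process id, then write a binary heap dump of live objects
jps -l
jmap -dump:live,format=b,file=/tmp/heap-dump.hprof <pid>
```

The `live` option forces a full GC first so the dump contains only reachable objects; drop it if you also want to inspect unreachable ones.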

“It’s recommended to capture the heap dump when your application is taking traffic. When the application is idle, new objects will not be created, so you wouldn’t be able to see how much memory is actually being wasted.”



Step 2: Analyze using HeapHero tool
Once you have captured heap dumps, you can upload them to the free online heap dump analysis tool – HeapHero.

“Depending on your application, heap dumps can contain PII and other sensitive data. In such circumstances you can register here http://heaphero.io/heap-trial-registration.jsp to download and install the HeapHero tool locally on your machine.”



The tool gives a high-level summary of the total amount of memory that is wasted, the actual data causing the wastage, the lines of code triggering it, and recommended solutions to fix the problem.


        Fig: Summary showing how much memory is wasted in total

HeapHero prints a pie chart that summarizes how much memory is wasted due to each inefficient programming practice. Apparently, this application is wasting 35% of its memory. The top two reasons for this wastage are:

a. Inefficient collections (i.e. data structures) are causing 11.5% of the memory wastage.

b. Duplicate strings are causing 10.4% of the memory wastage.


             Fig: Memory wasted due to duplicate Strings

In this application there are 343,661 string object instances. Out of those, only 144,520 are unique. Apparently, there are 6,147 instances of the “webservices.sabre.com” string alone. Why do there have to be so many duplicates? Why can’t they be cleaned up to free memory?


              Fig: Memory wasted due to inefficient collections
In this application, 34% of the HashMaps contain no elements. This raises the question of why so many HashMaps are created without any elements. When you create a HashMap, by default it is sized for 16 entries. The space reserved for those 16 entries is wasted when you don’t insert any elements into it. An even more interesting observation: 97% of the Hashtables don’t contain any elements.
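One way to avoid this kind of wastage – a sketch with illustrative names, not a prescription – is to create such maps lazily and with a small initial capacity, so that objects which never receive entries never pay for an empty HashMap:

```java
import java.util.HashMap;
import java.util.Map;

public class LazyMapDemo {
    // No HashMap is allocated until the first entry actually arrives.
    private Map<String, String> attributes; // null until needed

    void put(String key, String value) {
        if (attributes == null) {
            attributes = new HashMap<>(4); // small capacity when few entries are expected
        }
        attributes.put(key, value);
    }

    Map<String, String> attributes() {
        // Callers see an immutable empty map instead of a wasted HashMap.
        return attributes == null ? Map.of() : attributes;
    }

    public static void main(String[] args) {
        LazyMapDemo d = new LazyMapDemo();
        System.out.println(d.attributes().isEmpty()); // true, and no HashMap was allocated
        d.put("env", "prod");
        System.out.println(d.attributes().size());    // 1
    }
}
```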

 
           Fig: Lines of code that is originating duplicate Strings

The tool not only shows the objects that are wasting memory, it also shows the code paths that are triggering the wastage. This helps engineers institute the right fix in the right part of the application.

Happy hacking, happy coding!
1 year ago
If a deadlock happens in a JVM, the only way to recover from the situation is to restart the JVM.
Here is sample code which simulates a deadlock condition between two threads:
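Since the original listing did not survive, here is a hedged reconstruction based on the execution paths described below; the 10-second sleeps are shortened and a programmatic deadlock check via ThreadMXBean is added so the demo terminates (class and method names are illustrative):

```java
import java.lang.management.ManagementFactory;

public class DeadlockDemo {
    private final Object lock1 = new Object();
    private final Object lock2 = new Object();

    // Starts the two threads and returns how many of them the JVM reports as deadlocked.
    int simulateAndDetect() throws InterruptedException {
        Thread first = new Thread(() -> {
            synchronized (lock1) {        // 1. acquires lock on Lock1
                sleep(500);               // 2. sleeps while holding it (10 s in the original)
                synchronized (lock2) {    // 3. tries to acquire Lock2 -> blocks forever
                    System.out.println("FirstThread got both locks");
                }
            }
        }, "FirstThread");
        Thread second = new Thread(() -> {
            synchronized (lock2) {        // 1. acquires lock on Lock2
                sleep(500);               // 2. sleeps while holding it
                synchronized (lock1) {    // 3. tries to acquire Lock1 -> blocks forever
                    System.out.println("SecondThread got both locks");
                }
            }
        }, "SecondThread");
        first.setDaemon(true);            // daemons, so the JVM can still exit
        second.setDaemon(true);
        first.start();
        second.start();

        Thread.sleep(2000);               // give the deadlock time to form
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        int count = 0;
        if (ids != null) {
            for (long id : ids) {
                if (id == first.getId() || id == second.getId()) count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlocked threads: " + new DeadlockDemo().simulateAndDetect());
    }

    private static void sleep(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException ignored) {}
    }
}
```

Running it prints `deadlocked threads: 2` after about two seconds; the two worker threads remain stuck forever, which is exactly why a restart is the only recovery.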



In the above code, the following is the execution path of ‘FirstThread’:

1. Acquires lock on the Lock1 object
2. Sleeps for 10 seconds interval
3. Acquires lock on Lock2 object

The following is the execution path of ‘SecondThread’:

1. Acquires lock on the Lock2 object
2. Sleeps for 10 seconds interval
3. Acquires lock on Lock1 object

A thread dump captured on the above code would look like:


Deadlock detection
Good article! For capturing heap dumps, HeapHero is also good. It is a free web tool. You just have to upload your application's heap dumps and review the beautiful reports instantly.
1 year ago
Can anybody suggest DevOps tools to troubleshoot memory leaks and GC pauses instantly?
1 year ago