
Troubleshooting deadlock in an Apache opensource library

 


Apache PDFBox is a popular open-source library that lets Java applications work with PDF documents. We recently encountered a deadlock that surfaced in this library. In this post we share how we troubleshot it and identified the root cause.

What is Deadlock?
First, let’s understand what ‘deadlock’ means. Several technical definitions aren’t clear, and the definition of deadlock is one of them :-). It goes like this: “Deadlock is a situation where a set of processes are blocked because each process is holding a resource and waiting for another resource acquired by some other process.”

How to troubleshoot Deadlock in an Apache opensource library
[VIDEO]https://youtu.be/Jke2hzya4Do[/VIDEO]

It’s always easier to learn something new through examples and pictures. Let’s look at the practical example below, which may help you understand deadlock better.


Fig1: Trains starting in the same track


Fig2: Trains experiencing Deadlock

Let’s say there is only one train track, and it has six parts (part-1, part-2, part-3, part-4, part-5, part-6). Train-A starts at part-1 and Train-B starts at part-6 of the same track at the same time, traveling toward each other at the same speed. The trains reach a deadlock when Train-A is in part-3 and Train-B is in part-4: Train-A is stuck waiting for part-4, which Train-B holds, and Train-B is stuck waiting for part-3, which Train-A holds. Neither train can move forward. This is a classic deadlock situation. Similarly, once threads in an application are deadlocked, they cannot recover on their own; the only way to recover is to restart the application. To learn more about deadlock basics and troubleshooting, you may refer to this blog post.
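The train scenario maps directly onto Java threads. The following is a minimal sketch, not code from our application: the class name TrainDeadlockDemo, the lock objects part3 and part4, and the helper methods are hypothetical names chosen for illustration. Two daemon threads each take one lock and then ask for the other. A CountDownLatch makes sure both threads hold their first lock before either requests the second, so the deadlock is reproduced on every run instead of depending on timing.

```java
import java.util.concurrent.CountDownLatch;

final class TrainDeadlockDemo {

    /** Starts two daemon threads that deadlock on two locks (stand-ins for part-3 and part-4). */
    static Thread[] startTrains() {
        Object part3 = new Object();
        Object part4 = new Object();
        CountDownLatch bothOnTrack = new CountDownLatch(2);

        // Train-A holds part-3 and wants part-4; Train-B does the opposite.
        Thread trainA = new Thread(() -> drive(part3, part4, bothOnTrack), "Train-A");
        Thread trainB = new Thread(() -> drive(part4, part3, bothOnTrack), "Train-B");
        // Daemon threads let the JVM exit even though they never finish.
        trainA.setDaemon(true);
        trainB.setDaemon(true);
        trainA.start();
        trainB.start();
        return new Thread[] {trainA, trainB};
    }

    private static void drive(Object held, Object wanted, CountDownLatch bothOnTrack) {
        synchronized (held) {
            // Wait until the other train also holds its first part.
            bothOnTrack.countDown();
            try {
                bothOnTrack.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (wanted) {
                // Never reached: the other train holds 'wanted' forever.
                System.out.println(Thread.currentThread().getName() + " moved forward");
            }
        }
    }

    /** Polls until both threads are BLOCKED on a monitor, or the timeout expires. */
    static boolean bothBlocked(Thread[] trains, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (trains[0].getState() == Thread.State.BLOCKED
                    && trains[1].getState() == Thread.State.BLOCKED) {
                return true;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Thread[] trains = startTrains();
        System.out.println("Deadlocked: " + bothBlocked(trains, 2000));
    }
}
```

Neither thread ever prints its "moved forward" line, which is the thread equivalent of both trains standing still. In a real application the threads would not be daemon threads, which is why a restart ends up being the way out.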

Troubleshooting Deadlock
Now let’s discuss the deadlock problem we faced in our application. As the explanation above shows, a deadlock happens between threads that hold locks while waiting for each other’s locks. Thus, to analyze a deadlock, you need to capture a thread dump from the application. A thread dump is a snapshot of all the threads running in your application. It contains information such as stack trace, thread state, thread priority, … You can capture a thread dump using one of the approaches given here.
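Besides external tools, a thread dump can also be produced from inside the JVM. Below is a minimal sketch using the standard java.lang.management API; the class name ThreadDumpSketch and the output layout are my own assumptions and do not match the exact format that jstack or fastThread use. It prints each thread's name, id, state, the lock it is waiting on (if any) and its stack trace.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class ThreadDumpSketch {

    /** Returns a plain-text snapshot of all live threads in this JVM. */
    static String capture() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Only ask for lock details the JVM says it can provide.
        boolean monitors = mx.isObjectMonitorUsageSupported();
        boolean synchronizers = mx.isSynchronizerUsageSupported();

        StringBuilder dump = new StringBuilder();
        String newline = System.lineSeparator();
        for (ThreadInfo info : mx.dumpAllThreads(monitors, synchronizers)) {
            dump.append('"').append(info.getThreadName()).append('"')
                .append(" id=").append(info.getThreadId())
                .append(" state=").append(info.getThreadState());
            if (info.getLockName() != null) {
                dump.append(" waiting on ").append(info.getLockName());
            }
            if (info.getLockOwnerName() != null) {
                dump.append(" owned by \"").append(info.getLockOwnerName()).append('"');
            }
            dump.append(newline);
            // Print every frame; ThreadInfo.toString() would truncate the stack.
            for (StackTraceElement frame : info.getStackTrace()) {
                dump.append("    at ").append(frame).append(newline);
            }
            dump.append(newline);
        }
        return dump.toString();
    }

    public static void main(String[] args) {
        System.out.print(capture());
    }
}
```

An in-process dump like this only helps if some thread is still able to run it, for example from a monitoring endpoint or a scheduled task. When the whole application is unresponsive, capturing the dump from outside the process is the safer route.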

Note: Most of the time, you will not know whether the actual problem in your application is a deadlock or not. What you will notice is that the application is unresponsive. Thus, it’s safest to capture all the prominent artifacts that are essential for troubleshooting, such as the garbage collection log, thread dump, heap dump, netstat, iostat, … You may use the yCrash open source script, which captures 360-degree data (GC log, 3 snapshots of thread dump, heap dump, netstat, iostat, vmstat, top, top -H, …) from your application stack within a minute and generates a bundle zip file.

We uploaded the captured thread dump to fastThread, a thread dump analysis tool. The tool immediately pointed out that two threads were in a deadlock. Below is the excerpt from the fastThread report.


Fig: Deadlock pointed out by fastThread

The tool also showed the stack traces of the two threads that were in deadlock. Below are the stack traces of those two threads:





You can see that the ‘APP_Thread_50_WorkerTask_pool-5-thread-6’ thread is in deadlock with the ‘APP_Thread_50_WorkerTask_pool-5-thread-5’ thread. From the stack traces you can observe the following two things:

‘APP_Thread_50_WorkerTask_pool-5-thread-6’ has acquired the lock ‘0x00000002d218ca28’ of the ‘org.apache.fontbox.ttf.RAFDataStream’ object and is waiting to acquire the lock ‘0x00000002d216fec8’ of the ‘org.apache.fontbox.ttf.TrueTypeFont’ object.
On the other hand, the ‘APP_Thread_50_WorkerTask_pool-5-thread-5’ thread is doing the exact opposite: it has acquired the lock ‘0x00000002d216fec8’ of the ‘org.apache.fontbox.ttf.TrueTypeFont’ object and is waiting to acquire the lock ‘0x00000002d218ca28’ of the ‘org.apache.fontbox.ttf.RAFDataStream’ object.

Indeed, it’s a classic Deadlock condition.
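The pattern in those two observations can be reproduced and detected with nothing but the JDK. The sketch below is an illustration under stated assumptions: FontLike and StreamLike are hypothetical stand-ins that I made up, they are not the real TrueTypeFont and RAFDataStream classes, and their methods do not reflect the actual PDFBox code paths. Each class has synchronized methods that call into the other object, so two threads entering from opposite sides take the two monitors in opposite order. The JVM's own detector, ThreadMXBean.findDeadlockedThreads(), then reports which thread waits for which lock and who owns it, which is the same kind of information the report above gave us.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

final class LockInversionDemo {

    // Hypothetical stand-ins; NOT the real PDFBox/FontBox classes.
    static final class FontLike {
        StreamLike stream;
        synchronized void readTable(CountDownLatch ready) {
            awaitOther(ready);
            stream.seek();   // needs the StreamLike monitor while holding FontLike's
        }
        synchronized void lookup() { }
    }

    static final class StreamLike {
        FontLike font;
        synchronized void read(CountDownLatch ready) {
            awaitOther(ready);
            font.lookup();   // needs the FontLike monitor while holding StreamLike's
        }
        synchronized void seek() { }
    }

    // Makes sure both threads hold their first monitor before going further.
    private static void awaitOther(CountDownLatch ready) {
        ready.countDown();
        try {
            ready.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Starts two daemon threads that take the two monitors in opposite order. */
    static void startDeadlock() {
        FontLike font = new FontLike();
        StreamLike stream = new StreamLike();
        font.stream = stream;
        stream.font = font;
        CountDownLatch ready = new CountDownLatch(2);
        Thread t5 = new Thread(() -> font.readTable(ready), "demo-thread-5");
        Thread t6 = new Thread(() -> stream.read(ready), "demo-thread-6");
        t5.setDaemon(true);
        t6.setDaemon(true);
        t5.start();
        t6.start();
    }

    /** Polls the JVM's deadlock detector; returns one line per deadlocked thread. */
    static List<String> findDeadlocks(long timeoutMillis) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (report.isEmpty() && System.currentTimeMillis() < deadline) {
            long[] ids = mx.findDeadlockedThreads();   // null when nothing is deadlocked
            if (ids == null) {
                try {
                    Thread.sleep(20);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
                continue;
            }
            for (ThreadInfo info : mx.getThreadInfo(ids)) {
                if (info != null) {
                    report.add(info.getThreadName() + " waits for " + info.getLockName()
                            + " held by " + info.getLockOwnerName());
                }
            }
        }
        return report;
    }

    public static void main(String[] args) {
        startDeadlock();
        findDeadlocks(2000).forEach(System.out::println);
    }
}
```

A check like findDeadlocks could run periodically in a long-lived service and raise an alert, since the deadlocked threads themselves will never report anything.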

Deadlock in Apache PDFBox
Notice that the two objects involved in the deadlock are ‘org.apache.fontbox.ttf.TrueTypeFont’ and ‘org.apache.fontbox.ttf.RAFDataStream’. Both classes come from the open source Apache PDFBox library.

After seeing this bug, we searched the Apache PDFBox bug database to see whether the problem had already been reported. We couldn’t find an earlier report, so we filed a new ticket in the Apache bug database with the details. Here is the ticket that we filed, for your reference. The Apache PDFBox development team was highly responsive: they started acting on it right away and issued a fix within 2 to 3 days. Great job by the PDFBox team. We truly enjoyed the open source community collaboration.
 