Troubleshooting Deadlock in an Apache Open-Source Library
Learn how we troubleshot a deadlock that surfaced in Apache PDFBox, an open-source library that facilitates Java applications to work with PDF documents.
Join the DZone community and get the full member experience.
Join For FreeApache PDFBox is a popular open-source library that facilitates Java applications to work with PDF documents. Recently, we encountered a deadlock that surfaced in this library. In this post, we have shared how we troubleshot and identified the root cause of the problem.
What Is Deadlock?
First, let’s try to understand what "deadlock" means. Several technical definitions aren’t clear. The "deadlock" definition is one among them. Deadlock’s definition goes like this:
Deadlock is a situation where a set of processes are blocked because each process is holding a resource and waiting for another resource acquired by some other process.
It’s always easier to learn something new through examples and pictures. Let’s look at the below practical example, which may help you to understand deadlock better.
Figure 1: Trains starting on the same track
Figure 2: Trains experiencing deadlock
Let’s say there is only one train track, and this train track has six parts (Part 1, Part 2, Part 3, Part 4, Part 5, and Part 6). Train A starts at Part 1 and Train B starts at Part 6 on the same train track at the same time. Both trains travel at the same speed. Under this circumstance, Train A and Train B will reach a deadlock state when they reach Part 3 and Part 4 of the train track. This happens because when Train A is in Part 3 of the train track, it will be stuck waiting for Part 4 of the track, which Train B holds. On the other hand, when Train B is in Part 4, it will be stuck waiting for Part 3, which Train A holds. Thus, neither train can move forward. This is a classic deadlock situation. Similarly, once a deadlock happens in the application, it cannot be recovered. The only way to recover from deadlock is to restart the application. To learn more about deadlock basics and troubleshooting, you may refer to my previous article, "Chaos Engineering: Deadlock."
Troubleshooting Deadlock
Now let’s discuss the deadlock problem we faced in the application. From the above explanation, you can understand that the deadlock is caused due to threads. Thus, to analyze deadlock, you need to capture the thread dump from the application. A thread dump is basically a snapshot of all threads that are running in your application. It contains information such as stack trace, thread state, and thread priority. You can capture thread dumps using one of the approaches given in "8 Options for Capturing Thread Dumps."
Note: Most of the time, you will not know whether the actual problem in your application is a deadlock or not. What you will notice is the unresponsiveness of the application. Thus, it's safe to capture all the prominent artifacts that are essential for troubleshooting, such as the Garbage Collection log, thread dump, heap dump, netstat, and iostat. You may use yCrash open-source script, which would capture 360-degree data (GC log, 3 snapshots of thread dump, heap dump, netstat, iostat, vmstat, top, top -H, etc.) from your application stack within a minute and generate a bundle zip file.
We uploaded the captured thread dump to fastThread, a thread dump analysis tool. The tool immediately pointed out that the two threads caused the deadlock. Below is the excerpt from the fastThread report.
Figure 3: Deadlock pointed by fastThread
The tool pointed out the stack trace of the two threads that were in deadlock. Below is the stack trace of those two threads:
"APP_Thread_50_WorkerTask_pool-5-thread-6" #1899 prio=5 os_prio=0 tid=0x00007f894c405555 nid=0x44d1
waiting for monitor entry [0x00007f88e9c44000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.fontbox.ttf.TrueTypeFont.getTable(TrueTypeFont.java:147)
- waiting to lock <0x00000002d216fec8> (a org.apache.fontbox.ttf.TrueTypeFont)
at org.apache.fontbox.ttf.TrueTypeFont.getHorizontalMetrics(TrueTypeFont.java:229)
at org.apache.fontbox.ttf.GlyphTable.getGlyphData(GlyphTable.java:210)
at org.apache.fontbox.ttf.GlyphTable.getGlyph(GlyphTable.java:191)
- locked <0x00000002d218ca28> (a org.apache.fontbox.ttf.RAFDataStream)
at org.apache.fontbox.ttf.TrueTypeFont.getPath(TrueTypeFont.java:676)
at org.apache.pdfbox.pdmodel.font.PDType1Font.getPath(PDType1Font.java:638)
at org.apache.pdfbox.rendering.Type1Glyph2D.getPathForCharacterCode(Type1Glyph2D.java:83)
at org.apache.pdfbox.rendering.PageDrawer.drawGlyph2D(PageDrawer.java:495)
at org.apache.pdfbox.rendering.PageDrawer.showFontGlyph(PageDrawer.java:476)
at org.apache.pdfbox.contentstream.PDFStreamEngine.showGlyph(PDFStreamEngine.java:787)
at org.apache.pdfbox.contentstream.PDFStreamEngine.showGlyph(PDFStreamEngine.java:805)
at org.apache.pdfbox.contentstream.PDFStreamEngine.showText(PDFStreamEngine.java:743)
at org.apache.pdfbox.contentstream.PDFStreamEngine.showTextString(PDFStreamEngine.java:606)
at org.apache.pdfbox.contentstream.operator.text.ShowText.process(ShowText.java:56)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:933)
at
org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:514)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:492)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:155)
at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:277)
at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:347)
at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:268)
at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:228)
at
xxxxxxx.export.service.thumbnail.PDFSlideGeneratorServiceImpl$1.call(PDFSlideGeneratorServiceImpl.java:60
)
at xxxxxxx.ops.executor.util.ParallelExecutorTask.doCall(ParallelExecutorTask.java:43)
at xxxxxxx.ops.executor.util.ParallelExecutor.executeTaskFromQueue(ParallelExecutor.java:154)
at xxxxxxx.ops.executor.util.ParallelExecutor.access$100(ParallelExecutor.java:26)
at xxxxxxx.ops.executor.util.ParallelExecutor$WorkerTask.doCall(ParallelExecutor.java:164)
at xxxxxxx.ops.executor.util.UserContextAwareCallable.call(UserContextAwareCallable.java:89)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
"APP_Thread_WorkerTask_APP-pool-5-thread-5" #1898 prio=5 os_prio=0 tid=0x00007f894c898900 nid=0x44d0
waiting for monitor entry [0x00007f88e9d45000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.fontbox.ttf.TrueTypeFont.readTable(TrueTypeFont.java:356)
- waiting to lock <0x00000002d218ca28> (a org.apache.fontbox.ttf.RAFDataStream)
at org.apache.fontbox.ttf.TrueTypeFont.getTable(TrueTypeFont.java:150)
- locked <0x00000002d216fec8> (a org.apache.fontbox.ttf.TrueTypeFont)
at org.apache.fontbox.ttf.TrueTypeFont.getCmap(TrueTypeFont.java:262)
at org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmapImpl(TrueTypeFont.java:556)
at org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmapLookup(TrueTypeFont.java:541)
at org.apache.fontbox.ttf.TrueTypeFont.nameToGID(TrueTypeFont.java:629)
at org.apache.fontbox.ttf.TrueTypeFont.hasGlyph(TrueTypeFont.java:698)
at org.apache.pdfbox.pdmodel.font.PDType1Font.getNameInFont(PDType1Font.java:601)
at org.apache.pdfbox.pdmodel.font.PDType1Font.hasGlyph(PDType1Font.java:645)
at org.apache.pdfbox.rendering.Type1Glyph2D.getPathForCharacterCode(Type1Glyph2D.java:59)
at org.apache.pdfbox.rendering.PageDrawer.drawGlyph2D(PageDrawer.java:495)
at org.apache.pdfbox.rendering.PageDrawer.showFontGlyph(PageDrawer.java:476)
at org.apache.pdfbox.contentstream.PDFStreamEngine.showGlyph(PDFStreamEngine.java:787)
at org.apache.pdfbox.contentstream.PDFStreamEngine.showGlyph(PDFStreamEngine.java:805)
at org.apache.pdfbox.contentstream.PDFStreamEngine.showText(PDFStreamEngine.java:743)
at org.apache.pdfbox.contentstream.PDFStreamEngine.showTextString(PDFStreamEngine.java:606)
at org.apache.pdfbox.contentstream.operator.text.ShowText.process(ShowText.java:56)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:933)
at
org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:514)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:492)
at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:155)
at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:277)
at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:347)
at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:268)
at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:228)
at
xxxxxxx.export.service.thumbnail.PDFSlideGeneratorServiceImpl$1.call(PDFSlideGeneratorServiceImpl.java:60
)
at xxxxxxx.ops.executor.util.ParallelExecutorTask.doCall(ParallelExecutorTask.java:43)
at xxxxxxx.ops.executor.util.ParallelExecutor.executeTaskFromQueue(ParallelExecutor.java:154)
at xxxxxxx.ops.executor.util.ParallelExecutor.access$100(ParallelExecutor.java:26)
at xxxxxxx.ops.executor.util.ParallelExecutor$WorkerTask.doCall(ParallelExecutor.java:164)
at xxxxxxx.ops.executor.util.UserContextAwareCallable.call(UserContextAwareCallable.java:89)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
You can see the APP_Thread_50_WorkerTask_pool-5-thread-6
thread is in a deadlock with the APP_Thread_50_WorkerTask_pool-5-thread-5
thread. From the stack trace, you can observe the following two things:
APP_Thread_50_WorkerTask_pool-5-thread-6
has acquired the lock0x00000002d218ca28
of theorg.apache.fontbox.ttf.RAFDataStream
object, and is waiting to acquire the lock0x00000002d216fec8
of theorg.apache.fontbox.ttf.TrueTypeFont
object.- On the other hand,
APP_Thread_50_WorkerTask_pool-5-thread-5
thread is trying to do the exact opposite. It acquired the lock0x00000002d216fec8
oforg.apache.fontbox.ttf.TrueTypeFont
object and is waiting to acquire the lock0x00000002d218ca28
of theorg.apache.fontbox.ttf.RAFDataStream
object.
Indeed, it’s a classic deadlock condition.
Deadlock in Apache PDFBox
If you notice two objects which are causing deadlock are org.apache.fontbox.ttf.TrueTypeFont
and org.apache.fontbox.ttf.RAFDataStream
. Both of these objects are originating from the open-source Apache PDFBox library.
Once we saw this bug, we searched in the Apache PDFBox bug database to see whether this problem was already reported or not. We couldn’t see this problem reported earlier. Thus, we went ahead and filed a new ticket in the Apache bug database with the details. Here is the ticket that we filed for your reference. The Apache PDFBox development team was highly responsive. They started acting on it right away and issued a fix within 2–3 days. Great job from the PDFBox team. I truly enjoyed the open-source community collaboration.
Video
Published at DZone with permission of Ram Lakshmanan, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments