Is Your JVM Leaking File Descriptors — Like Mine?
Quick! Go check!
The two issues described here were discovered and fixed more than a year ago. This article only serves as historical proof, and as a beginner's guide to tackling file descriptor leaks in Java.
In Ultra ESB we use an in-memory RAM disk file cache for fast and garbage-free payload handling. Some time back, we faced an issue on our shared SaaS AS2 Gateway where this cache was leaking file descriptors over time, eventually leading to "too many open files" errors once the system limit was hit.
The Legion of the Bouncy Castle: Leftovers From Your Stream-Backed MIME Parts?
One culprit, we found, was Bouncy Castle, the famous security provider that had been our profound love since the Ultra ESB Legacy days.

With a simple, home-made toolkit, we found that BC had the habit of calling getContent() on MIME parts in order to determine their type (say, for instanceof checks). True, this wasn't a crime in itself; but most of our MIME parts were file-backed, with a file-cache file on the other end, meaning that each getContent() call opens a new stream to the file. The result: stray streams (and hence file descriptors) pointing to our file cache.

Enough of these, and we would exhaust the file descriptor quota allocated to the Ultra ESB (Java) process.
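For illustration, here is a minimal, hypothetical sketch of that failure mode, built on javax.activation's FileDataSource (our real file-cache-backed parts only resemble this, and the cache path is made up): every pull of content from the part opens a fresh FileInputStream, and any stream that nobody closes strands a file descriptor until GC.

```java
import javax.activation.DataHandler;
import javax.activation.FileDataSource;
import java.io.File;

public class StrayStreamSketch {

    public static void main(String[] args) throws Exception {
        // Hypothetical file-cache entry backing a MIME part.
        File cached = new File("/tmp/file-cache/payload-0001.dat");
        DataHandler part = new DataHandler(new FileDataSource(cached));

        // Each getInputStream() call opens a brand-new FileInputStream on the
        // cache file. If the caller only wanted to peek at the type and never
        // closes the stream, the descriptor stays stranded until the stream
        // object happens to be garbage-collected.
        for (int i = 0; i < 5; i++) {
            part.getInputStream();   // stray stream: nobody closes this
        }
    }
}
```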
Solution? Make 'Em Lazy!
We didn't want to mess with the BC codebase. So we found a simple solution: create all file-backed MIME parts with "lazy" streams. Our (former) colleague Rajind wrote a LazyFileInputStream, inspired by LazyInputStream from jboss-vfs, that opens the actual file only when a read is attempted.

BC was happy, and so was the file cache; but we were the happiest.
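Here is a rough sketch of what such a lazy wrapper might look like (the idea, not Rajind's actual implementation): the real FileInputStream, and hence the file descriptor, is created only when someone actually reads.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Sketch of a lazily-opened, file-backed input stream: the underlying
// FileInputStream (and its file descriptor) is created on the first read.
public class LazyFileInputStream extends InputStream {

    private final File file;
    private InputStream delegate;   // opened on first read

    public LazyFileInputStream(File file) {
        this.file = file;
    }

    // Open the real stream only when data is first requested.
    private InputStream delegate() throws IOException {
        if (delegate == null) {
            delegate = new FileInputStream(file);
        }
        return delegate;
    }

    @Override
    public int read() throws IOException {
        return delegate().read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        return delegate().read(b, off, len);
    }

    @Override
    public void close() throws IOException {
        // Nothing to close if the file was never actually read.
        if (delegate != null) {
            delegate.close();
        }
    }
}
```

With something like this in place, a type-probing getContent() call that never consumes the content no longer costs a file descriptor.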
Hibernate JPA: Cleaning Up After Supper, a.k.a. Closing Consumed Streams
Another bug we spotted was that some database operations were leaving behind unclosed file handles. Apparently this happened only when we were feeding stream-backed blobs to Hibernate, where the streams were often coming from file cache entries.

After some digging, we came up with a theory that Hibernate was not closing the underlying streams of these blob entries. (It made sense, because the java.sql.Blob interface does not expose any methods that Hibernate could use to manipulate the underlying data sources.) This was a problem, though, because the discarded streams (and the associated file handles) would not get released until the next GC.

This would have been fine for a short-lived app, but a long-running one like ours could easily run out of file descriptors, such as in the case of a sudden and persistent traffic spike.
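For context, this is roughly the kind of code that was in play. The helper below is an illustrative assumption (using Hibernate's BlobProxy to wrap a stream into a Blob), not our actual code:

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Blob;

import org.hibernate.engine.jdbc.BlobProxy;

public class BlobLeakSketch {

    // Wraps a file-cache entry into a stream-backed Blob. The returned Blob
    // holds an open InputStream (and hence a file descriptor), but
    // java.sql.Blob offers no close() hook, so whoever consumes the Blob has
    // no standard way to release the underlying stream.
    public static Blob toBlob(Path cacheFile) throws Exception {
        InputStream stream = Files.newInputStream(cacheFile); // opens a descriptor
        return BlobProxy.generateProxy(stream, Files.size(cacheFile));
    }
}
```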
Solution? Make 'Em Self-Closing!
We didn't want to lose the benefits of streaming, but we didn't have control over our streams either. You might say we should have placed our streams in auto-closeable constructs (say, try-with-resources). Nice try; but sadly, Hibernate was reading them outside of our execution scope (especially in @Transactional flows). As soon as we started closing the streams within our own code scope, our database operations started to fail miserably, screaming "stream already closed!"

When in Rome, do as the Romans do, they say. So, instead of messing with Hibernate, we decided we would take care of the streams ourselves.

Rajind (yeah, him again) hacked together a SelfClosingInputStream wrapper. This would keep track of the amount of data read from the underlying stream, and close it up as soon as the last byte was read.

(We did consider existing options like AutoCloseInputStream from Apache commons-io, but it turned out that we needed some customizations here and there, like detailed trace logging.)
The Bottom Line
When it comes to resource management in Java, it is quite easy to over-focus on memory and CPU (processing) and forget about the rest. Virtual resources, like ephemeral ports and per-process file descriptors, can be just as important, if not more so. Especially in long-running processes like our AS2 Gateway SaaS application, they can literally become silent killers.
You can detect these kinds of leaks in two main ways:

- "Single-cycle" resource analysis: run a single, complete processing cycle and compare resource usage before and after (see the sketch after this list).
- Long-term monitoring: continuously record and analyze resource metrics to identify trends and anomalies.
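For the first approach, simply comparing the JVM's open descriptor count before and after a cycle goes a long way. A minimal sketch, assuming a Unix-like platform and the com.sun.management extensions (the workload call is a placeholder):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

import com.sun.management.UnixOperatingSystemMXBean;

public class FdLeakCheck {

    // Number of file descriptors currently open by this JVM
    // (on Unix-like platforms; -1 elsewhere).
    static long openFds() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        return (os instanceof UnixOperatingSystemMXBean)
                ? ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount()
                : -1;
    }

    public static void main(String[] args) {
        long before = openFds();
        // runOneCompleteProcessingCycle();   // your workload goes here
        long after = openFds();
        System.out.printf("open FDs before=%d after=%d delta=%d%n",
                before, after, after - before);
    }
}
```

Run it around a known-idempotent workload a few times; a steadily growing delta is hard to miss.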
In either case, fixing the leak is not too difficult once you have a clear picture of what you are dealing with.

Good luck hunting down your resource-hog d(a)emons!