Measuring String Concatenation in Logging
Join the DZone community and get the full member experience.
Join For Freeintroduction
i had an interesting conversation today about the cost of using string concatenation in log statements. in particular, debug log statements often need to print out parameters or the current value of variables, so they need to build the log message dynamically. so you wind up with something like this:
logger.debug("parameters: " + param1 + "; " + param2);
the issue arises when debug logging is turned off. inside the
logger.debug()
statement a flag is checked and the method returns immediately; this is generally pretty fast. but the string concatenation had to occur to build the parameter prior to calling the method, so you still pay its cost. since debug tends to be turned off in production, this is the time when this difference matters.
for this reason, we have pretty much all been trained to do this:
if (logger.isdebugenabled()) { logger.debug("parameters: " + param1 + "; " + param2); }
the discussion was about how much difference this “good practice” makes.
caliper
this kind of question is perfect for a micro benchmark. my own favorite tool for this purpose is caliper . caliper runs small snippets of code enough times to average out variations. it passes in a number of repetitions, which it calculates in order to make sure that the whole method takes long enough to measure given the resolution of the system clock. caliper also detects garbage collection and hotspot compiling that might impact the accuracy of the tests.
caliper uploads results to a google app engine application. its sign-in supports google logins and issues an api key that can be used to organize results and list them.
a typical timing methods looks like this:
public string timemultstringnocheck(long reps) { for (int i = 0; i < reps; i++) { logger.debug(strings[0] + " " + strings[1] + " " + strings[2] + " " + strings[3] + " " + strings[4]); } return strings[0]; }
the return string is not used; it is included in the method solely to ensure that java does not optimize away the method. similarly, the content of the variables used should be randomly generated to avoid compile-time optimization.
the full example is available in
one of my github repositories
, in the
org.anvard.introtojava.log
package.
results
the outcome is pretty interesting.
string concatenation creates a pretty significant penalty, around two orders of magnitude for our example that concatenates five strings. interesting is that even in the case where we do not use string concatenation (i.e. the
simplestring
methods), the penalty is around 4x. this is probably the time spent pushing the string parameter onto the stack.
the examples with doubles, using
string.format()
, is even more extreme, four orders of magnitude. the elapsed time here about 4ms, large enough that if the log statement were in a commonly used method, the performance hit would be noticeable.
the final method,
multstringparams
, uses a feature that is available in the
slf4j
api. it works similarly to
string.format()
, but in a simple token replace fashion. most importantly, it does not perform the token replace unless the logging level is enabled. this makes this form just as fast as the “check” forms, but in a more compact form.
of course, this only works if no special formatting is needed of the log string, or if the formatting can be shifted to a method such as
tostring()
. what is especially surprising is that this method did not show a penalty in building the object array necessary to pass the parameters into the method. this may have been optimized out by the java runtime since there was no chance of the parameters being used.
conclusion
the practice of checking whether a logging level is enabled before building the log statement is certainly worthwhile and should be something teams check during peer review.
Published at DZone with permission of Alan Hohn, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments