Java 8: ConcurrentHashMap Atomic Updates
Here's a great look at what Java 8 brought to ConcurrentHashMaps, including close looks at the methods at your disposal and their performance impacts.
Join the DZone community and get the full member experience.
Join For FreeWhilst doing some refactoring on updates to ConcurrentHashMap values, I came across these great articles ...
- https://dzone.com/articles/concurrenthashmap-isnt-always-enough?fromrel=true
- https://dzone.com/articles/how-concurrenthashmap-works-internally-in-java?fromrel=true
... and was inspired to try to develop the theme a bit further.
Pre-Java 8, we had various ways to try to perform atomic operations on the values of Concurrent collections as described by Dima.
For example, a simple counter:
// Incrementing a count of the occurrences of a currency symbol
// (In reality we would have used an atomic variable even pre Java 8)
ConcurrentHashMap <String Integer> map = new ConcurrentHashMap <>();
String key = "USD/JPY";
Double oldValue; Double newValue; double increment = 1.0;
do {
oldValue = results.get(key);
newValue = oldValue == null? increment: oldValue + increment;
} while (!results.replace(key, oldValue, newValue));
Improved Methods in Java 8
- computeIfAbsent(): If the value is threadsafe and can be safely updated outside the method, or you intend to synchronize on the value whilst updating it, or if you just want to be certain of getting a new or existing value without having to check for null.
- compute(): If the value is not threadsafeand must be updated inside the method with a remapping function to ensure the entire operation is atomic. This gives you the most control over the computation, but also the responsibility to handle the possibility that there is no existing value inside your remapping function.
- merge(): Like compute(), you provide a remapping function to be performed on the existing value — if any. You also supply an initial value to be used if there is no existing value. This is a convenience because, with compute(), you have to, in the remapping function, handle the possibility of there being no existing value. Here, though, you do not have access to the key in the remapping function, unlike with compute().
The Method Signatures
xxxxxxxxxx
public V putIfAbsent(K key, V value)
public boolean replace(K key, V oldValue, V newValue)
// The argument to the mapping function is the key and the result is the
// existing value (if found) or a new value that has been computed
public V computeIfAbsent(K key, Function<? super K, ? extends V> mappingFunction)
// The arguments to the remapping bifunction are the key and the existing value
// (if found) and the result is a value that has been computed
public V compute(K key, BiFunction<? super K, ? super V, ? extends V> remappingFunction)
// The arguments to the remapping bifunction are the existing value (if found)
// and the initial value which was passed to the method in position 2 and the
// result is a value that has been computed, possibly by combining them
public V merge(K key, V value,
BiFunction<? super V,? super V,? extends V> remappingFunction)
Unlike the existing methods putIfAbsent() and replace(), instead of taking objects for the values, the arguments represent a Function or BiFunction, respectively, which can be implemented by a lambda or a method reference or even an inner class. This could be useful, as it means you don't have to instantiate an object that may well not be needed. You pass in some code that will only be executed if needed.
Atomicity
All three methods (plus computeIfPresent) are guaranteed to be atomic in ConcurrentHashMap. The way that works is that the method synchronizes on the new or existing Entry in the HashTable whilst executing.
It's important to notice that with computeIfAbsent(), a compound operation is not necessarily atomic because updates are performed outside of the method.
I try to think of them like, say, 'volatile' and AtomicInteger. Volatile guarantees visibility, but not atomicity. You won't get data races but could still get a race condition. If atomicity is required, then you use AtomicInteger, etc. to avoid race conditions
Another consideration is — are the values truly independent, or do they depend on an existing value or an external factor like 'order of arrival'?
This warning comes in the Oracle documentation:
The entire method invocation is performed atomically. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this Map.
Examples
computeIfAbsent() takes a key and a mapping function and returns a value:
xxxxxxxxxx
// This operation is atomic because updates on DoubleAdder are atomic
private final Map<Integer, DoubleAdder> doubleAdderResults = new ConcurrentHashMap<>();
doubleAdderResults.computeIfAbsent(1, (k) -> new DoubleAdder()).add(1.0);
// This is not threadsafe or atomic
// The update of the mutable non-threadsafe value takes place outside the computeIfAbsent() method
pointResults.computeIfAbsent(mapKeyZero, (k) -> new MutablePoint(0, 0)).move(10, 40);
computeIfAbsent() is simplest to use.
compute() takes a key and a value and a remapping function and returns a value:
xxxxxxxxxx
private final Map<Integer, MutablePoint> pointResults = new ConcurrentHashMap<>();
// Here, updates to the mutable value object are performed inside the compute()
// method which, itself, synchronizes on the Entry, internally
for (int i : data) {
pointResults.compute( i, (key, value) -> {
value = (value == null? new MutablePoint(0, 0): value);
value.move(5, 20);
return value;
});
}
compute() is more flexible because you have access to the key and the existing value in the remapping function and can update the existing value — or create a new one — inside the method atomically.
merge() takes a key and an initialize value to be returned if the key is not present and a remapping function.
For example:
xxxxxxxxxx
// increments a counter or initialises it with the increment if it is not found
map.merge("GBP/CHF", 1, (existingValue, newValue) -> existingValue + newValue);
// This could be simplified to
map.merge("GBP/CHF", 1, Int::sum);
Here, the initial value is combined with the existing value to calculate the new value.
Notes
All three methods should return either the existing or a new value to the calling code —therefore, you don't have to check for null. It's important to remember that ConcurrentHashMap is actually a HashTable, not a HashMap. Therefore, your function cannot return a null value to the map.
That would throw a NullPointerException.
merge() is a bit tricky in this respect. If the remapping function is executed (because the key exists) but returns null, then the existing value is removed and null returned to the caller. So, it shouldn't throw NullPointerException unless the key does not exist and the initialize value is also null
You don't have to be intending to actually do any computation. Here, I just want to make sure I get an Exceutor on which to stream market data snapshots for dollar/loonie whether it already exists or not and without having to check for null values. (This assumes mappings are never removed by my code.):
xxxxxxxxxx
private final Map<String, ExecutorService> serviceMap = new ConcurrentHashMap<>();
ExecutorService exec3 =
serviceMap.computeIfAbsent("USD/CAD", (k) -> Executors.newSingleThreadExecutor());
Testing
Here is a sample of the results running five different versions of an update operation through a Monte Carlo simulator for 1 million trials. We are simulating the tossing of a fair coin. One would expect the results to be almost exactly 50:50.
Versions tested:
- computeIfAbsent().add() with a DoubleAdder
- compute() with a Double
- computeIfAbsent() to get a Double, increment it and put it back in the map
- computeIfAbsent() to get a Double, increment it and replace it in the map - in a loop
- merge()
Look at the huge difference in the variance between updates one, two, four, and five (similar) and update three. To me, that suggests that there has been interleaving of threads in update three and the operation was not atomic. Also, the probabilities no longer add up to unity.
Performance
There doesn't seem to be any significant difference in the performance regarding the time taken to perform the updates, but I wondered about memory usage. Remember, one of the advantages of passing lambdas or method references to a method whose arguments are functional interfaces is that you are passing code, not objects. Therefore, if the code is not required, then no object needs to be instantiated.
I couldn't see any real difference in memory consumption or garbage collection between merge() and compute() in VisualVM, but then Doubles are pretty small objects. There did seem to be significantly more Doubles being created using merge(), though. Perhaps 25% more.
xxxxxxxxxx
Number of trials = 1 million
compute with a Double
Concurrent Stream. Total time: nanoseconds 2359164784 or seconds = 2.359164784
Key: tails, Value: 0.49963900, Expected: 0.5, Difference: 0.072200000 percent
Key: heads, Value: 0.50036100, Expected: 0.5, Difference: 0.072148000 percent
computeIfAbsent with a DoubleAdder
Concurrent Stream. Total time: nanoseconds 2229822699 or seconds = 2.229822699
Key: tails, Value: 0.50033000, Expected: 0.5, Difference: 0.065956000 percent
Key: heads, Value: 0.49967000, Expected: 0.5, Difference: 0.066000000 percent
computeIfAbsent with a Double incrementing the value
Concurrent Stream. Total time: nanoseconds 2313169217 or seconds = 2.313169217
Key: tails, Value: 0.46560200, Expected: 0.5, Difference: 6.8796000 percent
Key: heads, Value: 0.46416100, Expected: 0.5, Difference: 7.1678000 percent
computeIfAbsent with a Double replacing the value
Concurrent Stream. Total time: nanoseconds 2237265799 or seconds = 2.237265799
Key: tails, Value: 0.49958500, Expected: 0.5, Difference: 0.083000000 percent
Key: heads, Value: 0.50041500, Expected: 0.5, Difference: 0.082931000 percent
merge with a Double
Concurrent Stream. Total time: nanoseconds 1.9923561E+9 or seconds = 1.9923561
Key: tails, Value: 0.49968200, Expected: 0.5, Difference: 0.063600000 percent
Key: heads, Value: 0.50031800, Expected: 0.5, Difference: 0.063560000 percent
Opinions expressed by DZone contributors are their own.
Comments