Java Concurrency: CopyOnWrite
Why do we need this technique? And how to use this technique correctly?
CopyOnWrite is a common Java technique that allows you to update a data structure in a thread-safe way. The main advantage of CopyOnWrite is that reading threads never get blocked.
In this post, we take a closer look at CopyOnWrite: why we need it and how to use it correctly.
Why CopyOnWrite?
In the following, I want to implement a thread-safe class representing an address. To make this example brief, the address consists only of the street, city, and phone number.
You can download the source code of all examples from GitHub here.
public class MutableAddress {
    private volatile String street;
    private volatile String city;
    private volatile String phoneNumber;
    public MutableAddress(String street, String city, String phoneNumber) {
        this.street = street;
        this.city = city;
        this.phoneNumber = phoneNumber;
    }
    public String getStreet() {
        return street;
    }
    public String getCity() {
        return city;
    }
    public void updatePostalAddress(String street, String city) {
        this.street = street;
        this.city = city;
    }
    @Override
    public String toString() {
        return "street=" + street +
            ",city=" + city +
            ",phoneNumber=" + phoneNumber;
    }
}
I use volatile fields, lines 2 through 4, to make sure that the threads always see the current values, as explained in greater detail here.
To check if this class is thread-safe, I use the following test:
public class ConcurrencyTestReadWrite {
    private final MutableAddress address = new MutableAddress("E. Bonanza St."
            , "South Park" , "456 77 99");
    private String readAddress;
    @Interleave(ConcurrencyTestReadWrite.class)
    private void updatePostalAddress() {
        address.updatePostalAddress("Evergreen Terrace" , "Springfield");
    }
    @Interleave(ConcurrencyTestReadWrite.class)
    private void read() {
        readAddress = address.toString();
    }
    @Test
    public void test() throws InterruptedException {
        Thread first = new Thread( () -> { updatePostalAddress(); } ) ;
        Thread second = new Thread( () -> { read(); } ) ;
        first.start();
        second.start();
        first.join();
        second.join();
        assertTrue( "readAddress:" + readAddress ,
            readAddress.equals(
                "street=E. Bonanza St.,city=South Park,phoneNumber=456 77 99") ||
            readAddress.equals(
                "street=Evergreen Terrace,city=Springfield,phoneNumber=456 77 99") );
    }
}
The test needs two threads, which are created in lines 15 and 16 and started in lines 17 and 18. I then wait until both have finished using Thread.join, lines 19 and 20. After both threads have stopped, I check that the address read equals either the value before or the value after the update, lines 21 through 25.
To test all thread interleavings, I use the annotation Interleave, lines 5 and 9, from vmlens. The Interleave annotation tells vmlens to test all thread interleavings for the annotated method. Running the test, we see the following error:
java.lang.AssertionError: readAddress:
street=Evergreen Terrace,city=South Park,phoneNumber=456 77 99
We read a mixture of the initial address (the city South Park) and the updated address (the street Evergreen Terrace). To see what went wrong, let us look at the vmlens report:
So first, the writing thread (thread id 13) updates the street. Then, the reading thread (thread id 14) reads the street, city, and phone number, thereby reading the already updated street with the initial city.
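The interleaving described above can also be replayed on a single thread, without vmlens. The following is only a sketch: the class TornReadReplay and its replay method are hypothetical names introduced for illustration, and the fields are copied from MutableAddress so that the two assignments of updatePostalAddress can be split by hand.

```java
// Hypothetical demo class (not part of the article's sources): replays the
// failing interleaving step by step on one thread.
class TornReadReplay {
    // Same three volatile fields as MutableAddress, with the initial address.
    private volatile String street = "E. Bonanza St.";
    private volatile String city = "South Park";
    private volatile String phoneNumber = "456 77 99";

    // Same string format as MutableAddress.toString().
    private String read() {
        return "street=" + street + ",city=" + city + ",phoneNumber=" + phoneNumber;
    }

    // Runs the reader between the writer's two assignments.
    static String replay() {
        TornReadReplay address = new TornReadReplay();
        address.street = "Evergreen Terrace";  // writer: first assignment of updatePostalAddress
        String seenByReader = address.read();  // reader: reads street, city, and phone number
        address.city = "Springfield";          // writer: second assignment of updatePostalAddress
        return seenByReader;
    }

    public static void main(String[] args) {
        // prints street=Evergreen Terrace,city=South Park,phoneNumber=456 77 99
        System.out.println(replay());
    }
}
```

The volatile fields make each single write visible, but nothing ties the two writes together, so a reader can run between them.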
CopyOnWrite
To solve this bug, I use the CopyOnWrite technique. The idea is to create a new copy of the object when writing, change the values in the newly created copy, and then publish the copy. Since every write creates a new object anyway, I can make the object immutable. Using the CopyOnWrite technique, the address then consists of the following two classes:
First, the immutable class is used to represent the current address:
public class AddressValue {
    private final String street;
    private final String city;
    private final String phoneNumber;
    public AddressValue(String street, String city,
            String phoneNumber) {
        super();
        this.street = street;
        this.city = city;
        this.phoneNumber = phoneNumber;
    }
    public String getStreet() {
        return street;
    }
    public String getCity() {
        return city;
    }
    public String getPhoneNumber() {
        return phoneNumber;
    }
}
Second, the mutable class is used to implement the CopyOnWrite technique:
public class AddressUsingCopyOnWrite {
    private volatile AddressValue addressValue;
    private final Object LOCK = new Object();
    @Override
    public String toString() {
        AddressValue local = addressValue;
        return "street=" + local.getStreet() +
            ",city=" + local.getCity() +
            ",phoneNumber=" + local.getPhoneNumber();
    }
    public AddressUsingCopyOnWrite(String street, String city, String phone) {
        this.addressValue = new AddressValue( street, city, phone);
    }
    public void updatePostalAddress(String street, String city) {
        synchronized(LOCK){
            addressValue = new AddressValue(
                street, city, addressValue.getPhoneNumber() );
        }
    }
    public void updatePhoneNumber( String phoneNumber) {
        synchronized(LOCK){
            addressValue = new AddressValue(
                addressValue.getStreet(), addressValue.getCity(), phoneNumber );
        }
    }
}
An update now consists of creating a new copy of AddressValue: lines 16 and 17 for updating the postal address, and lines 22 and 23 for updating the phone number.
With these two classes, the test succeeds: a reader sees either the complete old address or the complete new one.
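The JDK applies the same idea in java.util.concurrent.CopyOnWriteArrayList: every write copies the internal array, and an iterator keeps reading the array that was current when it was created. The sketch below only illustrates that behavior. The class CopyOnWriteListDemo and its method are hypothetical names, and the list is not part of the address example.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class: shows that a reader keeps its snapshot
// while a writer publishes a new copy.
class CopyOnWriteListDemo {
    // Returns the elements seen by an iterator that was created before a write.
    static List<String> snapshotSeenByIterator() {
        CopyOnWriteArrayList<String> cities = new CopyOnWriteArrayList<>();
        cities.add("South Park");
        Iterator<String> reader = cities.iterator(); // bound to the current array
        cities.add("Springfield");                   // copies the array and publishes the copy
        List<String> seen = new ArrayList<>();
        while (reader.hasNext()) {
            seen.add(reader.next());
        }
        return seen; // contains only "South Park"
    }

    public static void main(String[] args) {
        System.out.println(snapshotSeenByIterator()); // prints [South Park]
    }
}
```

The iterator plays the role of the local variable in toString: it holds on to one published version and is not affected by later writes.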
Using a Local Variable When Reading
As you see in the toString method, I store the addressValue variable in the local variable local, line 6. Why?
Let us see what happens when we directly access the variable addressValue instead of using a local variable:
public String toStringNotThreadSafe() {
    return "street=" + addressValue.getStreet() +
        ",city=" + addressValue.getCity() +
        ",phoneNumber=" + addressValue.getPhoneNumber();
}
Running the test, we see the following error:
java.lang.AssertionError: readAddress:
street=E. Bonanza St.,city=Springfield,phoneNumber=456 77 99
So we, again, read an inconsistent address. The vmlens report shows what went wrong:
The reading thread (thread id 14) first reads the variable addressValue to get the street. Then, the writing thread (thread id 13) updates the variable addressValue. Now, the reading thread reads the variable addressValue again to get the city and the phone number. So, the reading thread reads part of the initial address and part of the updated address.
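One way to make the single read hard to forget is to hand the immutable object itself to callers, so that every field they read comes from the same snapshot. This is a minimal sketch under assumed names: AddressSnapshot, AddressWithSnapshot, and the snapshot method are hypothetical and do not appear in the article's sources.

```java
// Hypothetical immutable value, playing the role of AddressValue.
final class AddressSnapshot {
    final String street;
    final String city;
    final String phoneNumber;

    AddressSnapshot(String street, String city, String phoneNumber) {
        this.street = street;
        this.city = city;
        this.phoneNumber = phoneNumber;
    }
}

// Hypothetical holder class that exposes the current snapshot to readers.
class AddressWithSnapshot {
    private volatile AddressSnapshot current;
    private final Object lock = new Object();

    AddressWithSnapshot(String street, String city, String phoneNumber) {
        this.current = new AddressSnapshot(street, city, phoneNumber);
    }

    // Exactly one volatile read; the caller then works on an object that never changes.
    AddressSnapshot snapshot() {
        return current;
    }

    void updatePostalAddress(String street, String city) {
        synchronized (lock) {
            // Copy the unchanged phone number into the new object, then publish it.
            current = new AddressSnapshot(street, city, current.phoneNumber);
        }
    }
}
```

A caller that stores the result of snapshot() in a local variable keeps a consistent address, even if another thread publishes an update in the meantime.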
Using a Synchronized Block When Writing
The second part of making the CopyOnWrite technique thread-safe is a synchronized block when we write to the variable addressValue. Why?
Let us see what happens when we remove the synchronized block:
public void updatePostalAddress(String street, String city) {
    addressValue = new AddressValue( street, city,
        addressValue.getPhoneNumber() );
}
public void updatePhoneNumber( String phone) {
    addressValue = new AddressValue( addressValue.getStreet(),
        addressValue.getCity(), phone );
}
While running the test, we see the following:
[INFO] BUILD SUCCESS
No error. The test still succeeds.
To see why we need the synchronized block, we need a different test. We need to test what happens when we update different parts of our address from different threads. So, we use the following test:
public class ConcurrencyTestTwoWrites {
    private final AddressUsingCopyOnWriteWithoutSynchronized address =
        new AddressUsingCopyOnWriteWithoutSynchronized("E. Bonanza St."
            , "South Park" , "456 77 99");
    @Interleave(ConcurrencyTestTwoWrites.class)
    private void updatePostalAddress() {
        address.updatePostalAddress("Evergreen Terrace" , "Springfield");
    }
    @Interleave(ConcurrencyTestTwoWrites.class)
    private void updatePhoneNumber() {
        address.updatePhoneNumber("99 55 2222");
    }
    @Test
    public void test() throws InterruptedException {
        Thread first = new Thread( () -> { updatePostalAddress();} ) ;
        Thread second = new Thread( () -> { updatePhoneNumber(); } ) ;
        first.start();
        second.start();
        first.join();
        second.join();
        assertEquals( "street=Evergreen Terrace," +
            "city=Springfield,phoneNumber=99 55 2222" ,
            address.toString() );
    }
}
In this test, the first thread updates the postal address, line 15, and the second thread updates the phone number, line 16. After both threads are stopped, I check to see if the read address contains the new phone number and postal address, lines 21 through 23.
If we run this test, we see the following error:
org.junit.ComparisonFailure:
expected:<...ngfield,phoneNumber=[99 55 2222]>
but was:<...ngfield,phoneNumber=[456 77 99]>
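The problem is that, without synchronization, one thread can overwrite the update of another thread. Each update reads addressValue, builds a copy from it, and writes the copy back. If the phone number update is published between the read and the write of the postal address update, the postal address update writes back a copy that still contains the old phone number. By surrounding every write to the variable addressValue with a synchronized block, we avoid this race, and this test also succeeds.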
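As an alternative to the synchronized block, and not the approach used in this article, the read-copy-write step can be made atomic with java.util.concurrent.atomic.AtomicReference and its updateAndGet method, which re-applies the update function when another thread has replaced the value in the meantime. The following is a sketch under assumed names: AtomicAddress, its nested Value class, and runTwoWriters are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical variant of the address class: lock-free writes via compare-and-set.
class AtomicAddress {
    // Immutable value, playing the role of AddressValue.
    private static final class Value {
        final String street;
        final String city;
        final String phoneNumber;

        Value(String street, String city, String phoneNumber) {
            this.street = street;
            this.city = city;
            this.phoneNumber = phoneNumber;
        }
    }

    private final AtomicReference<Value> ref;

    AtomicAddress(String street, String city, String phone) {
        this.ref = new AtomicReference<>(new Value(street, city, phone));
    }

    void updatePostalAddress(String street, String city) {
        // The function may run more than once under contention, so it must be side-effect free.
        ref.updateAndGet(old -> new Value(street, city, old.phoneNumber));
    }

    void updatePhoneNumber(String phone) {
        ref.updateAndGet(old -> new Value(old.street, old.city, phone));
    }

    @Override
    public String toString() {
        Value local = ref.get(); // single read, as with the local variable in the article
        return "street=" + local.street + ",city=" + local.city
            + ",phoneNumber=" + local.phoneNumber;
    }

    // Mirrors the two-writer test with plain threads instead of vmlens.
    static String runTwoWriters() {
        AtomicAddress address = new AtomicAddress("E. Bonanza St.", "South Park", "456 77 99");
        Thread first = new Thread(() -> address.updatePostalAddress("Evergreen Terrace", "Springfield"));
        Thread second = new Thread(() -> address.updatePhoneNumber("99 55 2222"));
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return address.toString();
    }
}
```

With this variant, writers retry instead of blocking each other. The synchronized block is simpler to reason about when an update does more than build a new object.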
Comparison to Read-Write Locks
Using CopyOnWrite, only writing threads get blocked by other writing threads. All other combinations are non-blocking: reading threads never get blocked, and writing threads are not blocked by reading threads.
Compare this to read-write locks, where reading threads get blocked by writing threads, and writing threads get blocked not only by other writing threads but also by reading threads.
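For comparison, here is a sketch of the same address guarded by a java.util.concurrent.locks.ReentrantReadWriteLock. The class name AddressUsingReadWriteLock is hypothetical, and the class is only meant to show where readers and writers wait.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical read-write lock variant of the address class, for comparison only.
class AddressUsingReadWriteLock {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Plain fields: all access happens while holding the read or the write lock.
    private String street;
    private String city;
    private String phoneNumber;

    AddressUsingReadWriteLock(String street, String city, String phoneNumber) {
        this.street = street;
        this.city = city;
        this.phoneNumber = phoneNumber;
    }

    void updatePostalAddress(String street, String city) {
        lock.writeLock().lock(); // waits until no reader and no other writer holds the lock
        try {
            this.street = street;
            this.city = city;
        } finally {
            lock.writeLock().unlock();
        }
    }

    @Override
    public String toString() {
        lock.readLock().lock(); // waits while a writer holds the lock
        try {
            return "street=" + street + ",city=" + city + ",phoneNumber=" + phoneNumber;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Here the object is updated in place and no copy is created, but toString has to acquire a lock, which the CopyOnWrite version avoids.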
Conclusion
CopyOnWrite lets us update a class in a thread-safe way. The main advantage of this technique is that reading threads never block and writing threads only get blocked by other writing threads.
When you use this technique, make sure you always use a local variable when reading and a synchronized block when writing.
Published at DZone with permission of Thomas Krieger, DZone MVB. See the original article here.