Java String intern(): Interesting Q and A
The intern() function eliminates duplicate string objects from the application and has the potential to reduce the overall application memory consumption.
intern() is an interesting function in the java.lang.String class. It eliminates duplicate string objects from the application and has the potential to reduce the overall memory consumption of your application. In this post, let's learn more about the intern() function.
1. How Does the String intern() Function Work?
In Java heap memory, a pool of string objects is maintained. When you invoke the intern() function on a string object, the JVM checks whether a string with the same content already exists in the pool. If it does, that pooled object is returned to the invoker. If it does not, the string object is added to the pool, and a reference to the newly added object is returned to the invoker.
It is often easier to learn through examples and pictures, so consider the code snippet below:
1: String s1 = new String("yCrash").intern();
2: String s2 = new String("yCrash").intern();
JVM heap memory when launched initially
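All of the objects that your application creates are stored in the JVM's heap memory, which also contains the string intern pool. For this walkthrough, assume that the pool does not yet hold a yCrash string when the program is launched. (Strictly speaking, a string literal such as "yCrash" in the source is itself interned by the JVM; the walkthrough ignores this to keep the model simple.)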
JVM heap memory when 'String s1 = new String("yCrash").intern();' is executed
When the first statement, String s1 = new String("yCrash").intern();, is executed, the JVM checks whether a yCrash string object is present in the intern string pool. Since it is not, the yCrash string is added to the intern string pool, and a reference to this newly added string object is returned to s1.
JVM heap memory when 'String s2 = new String("yCrash").intern();' is executed
When the second statement, String s2 = new String("yCrash").intern();, is executed, the JVM once again checks whether a yCrash string object is present in the intern string pool. This time it is, because it was added when statement #1 was executed, so a reference to that existing object is returned to s2. Both s1 and s2 now point to the same yCrash string object. The duplicate yCrash object created by new String in statement #2 is no longer referenced, so it can be garbage collected.
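The check-then-add behavior described above can be pictured with a toy model. The sketch below uses a plain HashMap as a stand-in for the pool; the class name ToyInternPool and its methods are hypothetical, and this is not how the JVM implements its pool internally. It only mirrors the "return the existing entry, otherwise add and return" logic.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of an intern pool (hypothetical; NOT the JVM's real implementation).
class ToyInternPool {
    private final Map<String, String> pool = new HashMap<>();

    // Returns the pooled instance equal to s, adding s to the pool if absent.
    public String intern(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    public int size() {
        return pool.size();
    }

    public static void main(String[] args) {
        ToyInternPool pool = new ToyInternPool();
        String a = pool.intern(new String("yCrash")); // not in pool yet: added
        String b = pool.intern(new String("yCrash")); // already pooled: reused
        System.out.println(a == b);      // prints true
        System.out.println(pool.size()); // prints 1
    }
}
```

The second new String("yCrash") is never stored anywhere, which is the toy equivalent of the duplicate object becoming unreferenced.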
2. How String Works Without intern() Function
1: String s3 = new String("yCrash");
2: String s4 = new String("yCrash");
JVM heap memory when 'String s3 = new String("yCrash");' is executed
When the first statement, String s3 = new String("yCrash");, is executed, the JVM adds the yCrash string object to the heap memory, but not to the intern string pool.
JVM heap memory when 'String s4 = new String("yCrash");' is executed
When the second statement, String s4 = new String("yCrash");, is executed, the JVM creates another yCrash string object in the heap memory, so a duplicate yCrash now exists in memory. If your application creates n yCrash strings this way without invoking intern(), n separate yCrash string objects will exist in memory, which can waste a considerable amount of memory.
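To make the "n objects" point concrete, here is a minimal sketch that counts distinct String objects by identity rather than by content. The class name DuplicateCounter and the method distinctObjects are hypothetical names chosen for illustration.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

class DuplicateCounter {
    // Counts distinct String *objects* (compared by identity, not by equals()).
    static int distinctObjects(int n, boolean useIntern) {
        Set<String> identities = Collections.newSetFromMap(new IdentityHashMap<>());
        for (int i = 0; i < n; i++) {
            String s = new String("yCrash"); // always allocates a new object
            identities.add(useIntern ? s.intern() : s);
        }
        return identities.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctObjects(1000, false)); // prints 1000
        System.out.println(distinctObjects(1000, true));  // prints 1
    }
}
```

An identity-based set is used deliberately: a regular HashSet would compare contents with equals() and report 1 in both cases, hiding the duplication.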
3. How intern() and == Work
Since s1 and s2 point to the same yCrash string object, comparing s1 and s2 with the == operator, as shown below, yields true.
// true will be printed
System.out.println(s1 == s2);
Since s3 and s4 point to two different yCrash string objects, comparing s3 and s4 with the == operator, as shown below, yields false.
// false will be printed
System.out.println(s3 == s4);
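The snippets from this section can be combined into one runnable sketch. The class and method names here are hypothetical. An equals() comparison is included as well, to show that content equality holds even where == does not.

```java
class InternEqualityDemo {
    // Two interned strings with the same content share one object.
    static boolean internedAreSame() {
        String s1 = new String("yCrash").intern();
        String s2 = new String("yCrash").intern();
        return s1 == s2;
    }

    // Two strings created with new are always distinct objects.
    static boolean plainAreSame() {
        String s3 = new String("yCrash");
        String s4 = new String("yCrash");
        return s3 == s4;
    }

    public static void main(String[] args) {
        System.out.println(internedAreSame()); // prints true
        System.out.println(plainAreSame());    // prints false
        // equals() compares content, so it is true regardless of interning
        System.out.println(new String("yCrash").equals(new String("yCrash"))); // prints true
    }
}
```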
4. In Which JVM Memory Region Is the String intern Pool Stored?
JVM memory has the following regions:
- Heap region (i.e., Young Generation + Old Generation)
- Metaspace
- Other regions
Learn about these JVM memory regions. In earlier versions of Java, from 1 to 6, the string intern pool was stored in the Perm Generation. Starting from Java 7, the string intern pool is stored in the JVM's heap memory. To confirm it, we conducted this simple experiment.
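The sketch below lists the memory regions your own JVM exposes, using the standard java.lang.management API. It does not reproduce the experiment mentioned above and does not by itself show where the intern pool lives. It only prints the heap and non-heap pools that the running JVM reports, and the names vary by JVM version and garbage collector. The class name MemoryPoolLister is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryPoolLister {
    // Returns one line per memory pool: its type (heap or non-heap) and its name.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            lines.add(pool.getType() + " : " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Output depends on the JVM and the garbage collector in use.
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```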
5. Is It Better To Use intern() or -XX:+UseStringDeduplication?
When you pass the -XX:+UseStringDeduplication JVM argument during application startup, the JVM tries to eliminate duplicate strings as part of the garbage collection process. During garbage collection, the JVM inspects the objects in memory, identifies duplicate strings among them, and tries to eliminate the duplication. However, this argument has limitations. For example, it was introduced for the G1 GC algorithm (support in other collectors depends on the JDK version), and it eliminates duplicates only among long-lived string objects (learn more about this argument). Here is an interesting case study of a major application that tried to use the -XX:+UseStringDeduplication JVM argument.
On the other hand, the intern() function can be used with any GC algorithm and on both short-lived and long-lived objects. However, the intern() function might impact application response time more than -XX:+UseStringDeduplication. For more details, refer to this post, "Java String intern: Performance impact."
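If you want to check from inside an application whether it was started with this argument, a minimal sketch is to inspect the JVM input arguments through the standard RuntimeMXBean. The class name DedupFlagCheck and the helper requestsStringDedup are hypothetical. The check only tells you whether the flag was passed, not whether the active collector honors it.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class DedupFlagCheck {
    // True if the given JVM arguments contain the string deduplication flag.
    static boolean requestsStringDedup(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:+UseStringDeduplication");
    }

    public static void main(String[] args) {
        // Arguments the running JVM was started with (not the program's own args).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("String deduplication requested: " + requestsStringDedup(jvmArgs));
    }
}
```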
6. What Is the Performance Impact of intern() Function?
As this post has shown, invoking the intern() function on string objects has the potential to eliminate duplicate strings from memory, thus reducing overall memory utilization. However, it can take a toll on response time and CPU utilization. To understand the performance impact of using the intern() function, refer back to the post "Java String intern: Performance impact" linked in the previous section.
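For a rough feel of the cost on your own machine, the sketch below times a loop that builds many duplicate strings, with and without intern(). It is not a rigorous benchmark: there is no warm-up and no statistical treatment, and results vary between runs and JVMs, so a harness such as JMH would be more appropriate for real measurements. The class name, the method timeNanos, and the "key-" prefix are all hypothetical.

```java
class InternTimingSketch {
    // Builds n strings drawn from 1000 distinct values and returns elapsed nanoseconds.
    // n must be greater than zero.
    static long timeNanos(int n, boolean useIntern) {
        String[] out = new String[n];
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            String s = "key-" + (i % 1000); // built at run time, so many duplicates
            out[i] = useIntern ? s.intern() : s;
        }
        long elapsed = System.nanoTime() - start;
        // Touch the result so the loop cannot be optimized away entirely.
        if (out[n - 1] == null) {
            throw new IllegalStateException("unexpected null entry");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("without intern(): " + timeNanos(n, false) + " ns");
        System.out.println("with intern():    " + timeNanos(n, true) + " ns");
    }
}
```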
Published at DZone with permission of Ram Lakshmanan, DZone MVB. See the original article here.