Java Records: A Closer Look
Check out this deep look into Java Records for declaring classes and see how it compares to Kotlin's Data Class and Scala's Case Class features.
Join the DZone community and get the full member experience.
Join For FreeRecords is a new preview feature in Java 14, providing a nice compact syntax to declare classes that are supposed to be dumb data holders. In this article, we’re going to see what Records looks like under the hood. So buckle up!
Class Representation
Let’s start with a very simple example:
How about compiling this code using javac
:
Then, it’s possible to take a peek at the generated bytecode using javap
:
This will print the following:
Interestingly, similar to Enums, Records are normal Java classes with a few fundamental properties:
- They are declared as
final
classes, so we can’t inherit from them. - They’re already inheriting from another class named
java.lang.Record
. Therefore, Records can’t extend any other class, as Java does not allow multiple-inheritance. - Records can implement other interfaces.
- For each component, there is an accessor method, e.g.
max
andmin
. - There are auto-generated implementations for
toString
,equals
andhashCode
based on all components. - Finally, there is an auto-generated constructor that accepts all components as its arguments.
Also, the java.lang.Record
is just an abstract class with a protected no-arg constructor and a few other basic abstract methods:
Nothing special about this class!
You may also enjoy: Introducing Java Record
The Curious Case of Data Classes
Coming from a Kotlin or Scala background, one may spot some similarities between Records in Java, Data Classes in Kotlin and Case Classes in Scala. On the surface, they all share one very fundamental goal: to facilitate writing data holders.
Despite this fundamental similarity, things are very different at the bytecode level.
Kotlin’s Data Class
For the sake of comparison, let’s see a Kotlin data class equivalent of Range
:
Similar to Records, Kotlin compiler generates accessor methods, default toString
, equals
, and hashCode
implementations and a few more functions based on this simple one-liner.
Let’s see how the Kotlin compiler generates the code for, say, toString
:
We issued the javap -c -v Range
to generate this output. Also, here we’re using the simple class names for the sake of brevity.
Anyway, Kotlin is using the StringBuilder
to generate the string representation instead of multiple string concatenations (like any decent Java developer!). That is:
- At first, it creates a new instance of
StringBuilder
(index 0, 3, 4). - Then it appends the literal
Range(min=
string (index 7, 9). - Then it appends the actual min value (index 12, 13, 16).
- Then it appends the literal
, max=
(index 19, 21). - Then it appends the actual max value (index 24, 25, 28).
- Then it closes the parentheses by appending the
)
literal (index 31, 33). - Finally, it builds the
StringBuilder
instance and returns it (index 36, 39).
Basically, the more we have properties in our data class, the lengthier the bytecode and consequently longer startup time.
Scala’s Case Class
Let’s write the case class
equivalent in Scala:
At first glance, Scala seems to generate a much simpler toString
implementation:
However, the toString
calls the scala.runtime.ScalaRunTime._toString
static method. That, in turn, calls the productIterator
method to iterate through this Product Type. This iterator calls the productElement
method, which looks like:
This basically switches over all properties of the case class
. For instance, if the productIterator
wants the first property, it returns the min
. Also, when the productIterator
wants the second element, it will return the max
value. Otherwise, it will throw an instance of IndexOutOfBoundsException
to signal an out of bound request.
Again, the more we have properties in a case class
, we would have more of those switch arms. Therefore, the bytecode length is proportional to the number of properties. In other words, the same problem as Kotlin’s data class
.
Invoke Dynamic
Let’s take an even closer look to the bytecode generated for the Java Records:
Regardless of the number of record components, this will be the bytecode. A simple, polished and elegant solution. But how does this invokedynamic
thing work?
Introducing Indy
Invoke Dynamic (also known as Indy) was part of JSR 292 intending to enhance the JVM support for dynamic languages. After its first release in Java 7, the invokedynamic
opcode along with its java.lang.invoke
luggage is used quite extensively by dynamic JVM-based languages like JRuby.
Although Indy was specifically designed to enhance the dynamic language support, it offers much more than that. As a matter of fact, it’s suitable to use wherever a language designer needs any form of dynamicity, from dynamic type acrobatics to dynamic strategies! For instance, the Java 8 Lambda Expressions are actually implemented using invokedynamic
, even though Java is a statically typed language!
User-Definable Bytecode
For quite some time JVM did support four method invocation types: invokestatic
to call static methods, invokeinterface
to call interface methods, invokespecial
to call constructors, super()
or private
methods and invokevirtual
to call instance methods.
Despite their differences, these invocation types share one common trait: we can’t enrich them with our own logic. On the contrary, invokedynamic
enables us to Bootstrap the invocation process in any way we want. Then the JVM takes care of calling the bootstrapped method directly.
How Does Indy Work?
The first time JVM sees an invokedynamic
instruction, it calls a special static method called Bootstrap Method. The bootstrap method is a piece of Java code that we’ve written to prepare the actual to-be-invoked logic:
Then the bootstrap method returns an instance of java.invoke.CallSite
. This CallSite
holds a reference to the actual method, i.e. MethodHandle
. From now on, every time JVM sees this invokedynamic
instruction again, it skips the Slow Path and directly calls the underlying executable. The JVM continues to skip the slow path unless something changes.
Why Indy?
As opposed to the Reflection APIs, the java.lang.invoke
API is quite efficient since the JVM can completely see through all invocations. Therefore, JVM may apply all sorts of optimizations as long as we avoid the slow path as much as possible!
In addition to the efficiency argument, the invokedynamic
approach is more reliable and less brittle because of its simplicity.
Moreover, the generated bytecode for Java Records is independent of the number of properties. So, less bytecode and faster startup time.
Finally, let’s suppose a new version of Java includes a new and more efficient bootstrap method implementation. With invokedynamic
, our app can take advantage of this improvement without recompilation. This way we have some sort of Forward Binary Compatibility. Also, That’s the dynamic strategy we were talking about!
The Object Methods
Now that we are familiar enough with Indy, let’s make sense of the invokedynamic
in Records bytecode:
Look what I found in the Bootstrap Method Table:
So the bootstrap method for Records is called bootstrap
which resides in the java.lang.runtime.ObjectMethods
class. As you can see, this bootstrap method expects the following parameters:
- An instance of
MethodHandles.Lookup
representing the lookup context (TheLjava/lang/invoke/MethodHandles$Lookup
part). - The method name (i.e.
toString
,equals
,hashCode
, etc.) the bootstrap is going to link. For example, when the value istoString
, bootstrap will return aConstantCallSite
(aCallSite
that never changes) that points to the actualtoString
implementation for this particular Record. - The
TypeDescriptor
for the method (Ljava/lang/invoke/TypeDescriptor
part). - A type token, i.e.
Class<?>
, representing the Record class type. It’sClass<Range>
in this case. - A semi-colon separated list of all component names, i.e.
min;max
. - One
MethodHandle
per component. This way the bootstrap method can create aMethodHandle
based on the components for this particular method implementation.
The invokedynamic
instruction passes all those arguments to the bootstrap method. Bootstrap method, in turn, returns an instance of ConstantCallSite
. This ConstantCallSite
is holding a reference to requested method implementation, e.g. toString
.
Reflecting on Records
The java.lang.Class
API has been retrofitted to support Records. For example, given a Class<?>
, we can check whether it’s a Record or not using the new isRecord
method:
It obviously returns false
for non-record types:
There is, also, a getRecordComponents
method that returns an array of RecordComponent
in the same order they defined in the original record. Each java.lang.reflect.RecordComponent
is representing a record component or variable of the current record type. For example, the RecordComponent.getName
returns the component name:
In the same way the getType
method returns the type-token for each component:
It’s even possible to get a handle to accessor methods via getAccessor
:
Annotating Records
Java allows you to annotate Records, as long as the annotation is applicable to a record or its members. Additionally, there would be a new annotation ElementType
called RECORD_COMPONENT
. Annotations with this target can only be used on record components:
Serialization
Any new Java feature without a nasty relationship with Serialization would be incomplete. This time around, however, the relationship does not sound as unappealing as we’re used to.
Although Records are not by default serializable, it’s possible to make them so just by implementing the java.io.Serializable
marker interface.
Serializable records are serialized and deserialized differently than ordinary serializable objects. The updated javadoc for ObjectInputStream
states that:
- The serialized form of a record object is a sequence of values derived from the record components.
- The process by which record objects are serialized or externalized cannot be customized; any class-specific
writeObject
,readObject
,readObjectNoData
,writeExternal
, andreadExternal
methods defined by record classes are ignored during serialization and deserialization. - The
serialVersionUID
of a record class is0L
unless explicitly declared.
Conclusion
Java Records are going to provide a new way to encapsulate data holders. Even though currently they’re limited in terms of functionality (compared to what Kotlin or Scala are offering), the implementation is solid.
The first preview of Records would be available in March 2020. In this article, we’ve used the openjdk 14-ea 2020-03-17
build, since the Java 14 is yet to be released!
Further Reading
Published at DZone with permission of Ali Dehghani. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments