Java Optional Objects
In this post I present several examples of the new Optional objects in Java 8 and I make comparisons with similar approaches in other programming languages, particularly the functional programming language SML and the JVM-based programming language Ceylon, this latter currently under development by Red Hat. I think it is important to highlight that the introduction of optional objects has been a matter of debate. In this article I try to present my perspective of the problem and I do an effort to show arguments in favor and against the use of optional objects. It is my contention that in certain scenarios the use of optional objects is valuable, but ultimately everyone is entitled to an opinion and I just hope this article helps the readers to make an informed one just as writing it helped me understand this problem much better. About the Type of Null In Java we use a reference type to gain access to an object, and when we don't have a specific object to make our reference point to, then we set such reference to null to imply the absence of a value. In Java null is actually a type, a special one: it has no name, we cannot declare variables of its type, or cast any variables to it, in fact there is a single value that can be associated with it (i.e. the literal null), and unlike any other types in Java, a null reference can be safely assigned to any other reference types (See JLS 3.10.7 and 4.1). The use of null is so common that we rarely meditate on it: field members of objects are automatically initialized to null and programmers typically initialize reference types to null when they don't have an initial value to give them and, in general, null is used everywhere to imply that, at certain point, we don't know or we don't have a value to give to a reference. About the Null Pointer Reference Problem Now, the major problem with the null reference is that if we try to dereference it then we get the ominous and well known NullPointerException. When we work with a reference obtained from a different context than our code (i.e. as the result of a method invocation or when we receive a reference as an argument in a method we are working on), we all would like to avoid this error that has the potential to make our application crash, but often the problem is not noticed early enough and it finds its way into production code where it waits for the right moment to fail (which is typically a Friday at the end of the month, around 5 p.m. and just when you are about to leave the office to go to the movies with your family or drink some beers with your friends). To make things worse, the place where your code fails is rarely the place where the problem originated, since your reference could have been set to null far away from the place in your code where you intended to dereference it. So, you better cancel those plans for the Friday night... It's worth mentioning that this concept of null references was first introduced by Tony Hoare, the creator of ALGOL, back in 1965. The consequences were not so evident in those days, but he later regretted his design and he called it "a billion dollars mistake", precisely referring to the uncountable amount of hours that many of us have spent, since then, fixing this kind null dereferencing problems. Wouldn't it be great if the type system could tell the difference between a reference that, in a specific context, could be potentially null from one that couldn't? This would help a lot in terms of type safety because the compiler could then enforce that the programmer do some verification for references that could be null at the same time that it allows a direct use of the others. We see here an opportunity for improvement in the type system. This could be particularly useful when writing the public interface of APIs because it would increase the expressive power of the language, giving us a tool, besides documentation, to tell our users that a given method may or may not return a value. Now, before we delve any further, I must clarify that this is an ideal that modern languages will probably pursue (we'll talk about Ceylon and Kotlin later), but it is not an easy task to try to fix this hole in a programming language like Java when we intend to do it as an afterthought. So, in the coming paragraphs I present some scenarios in which I believe the use of optional objects could arguably alleviate some of this burden. Even so, the evil is done, and nothing will get rid of null references any time soon, so we better learn to deal with them. Understanding the problem is one step and it is my opinion that these new optional objects are just another way to deal with it, particularly in certain specific scenarios in which we would like to express the absence of a value. Finding Elements There is a set of idioms in which the use of null references is potentially problematic. One of those common cases is when we look for something that we cannot ultimately find. Consider now the following simple piece of code used to find the first fruit in a list of fruits that has a certain name: public static Fruit find(String name, List fruits) { for(Fruit fruit : fruits) { if(fruit.getName().equals(name)) { return fruit; } } return null; } As we can see, the creator of this code is using a null reference to indicate the absence of a value that satisfies the search criteria (7). It is unfortunate, though, that it is not evident in the method signature that this method may not return a value, but a null reference.. Now consider the following code snippet, written by a programmer expecting to use the result of the method shown above: List fruits = asList(new Fruit("apple"), new Fruit("grape"), new Fruit("orange")); Fruit found = find("lemon", fruits); //some code in between and much later on (or possibly somewhere else)... String name = found.getName(); //uh oh! Such simple piece of code has an error that cannot be detected by the compiler, not even by simple observation by the programmer (who may not have access to the source code of the find method). The programmer, in this case, has naively failed to recognize the scenario in which the find method above could return a null reference to indicate the absence of a value that satisfies his predicate. This code is waiting to be executed to simply fail and no amount of documentation is going to prevent this mistake from happening and the compiler will not even notice that there is a potential problem here. Also notice that the line where the reference is set to null (5) is different from the problematic line (7). In this case they were close enough, in other cases this may not be so evident. In order to avoid the problem what we typically do is that we check if a given reference is null before we try to dereference it. In fact, this verification is quite common and in certain cases this check could be repeated so many times on a given reference that Martin Fowler (renown for hist book on refactoring principles) suggested that for these particular scenarios such verification could be avoided with the use of what he called a Null Object. In our example above, instead of returning null, we could have returned a NullFruit object reference which is an object of type Fruit that is hollowed inside and which, unlike a null reference, is capable of properly responding to the same public interface of a Fruit. Minimum and Maximum Another place where this could be potentially problematic is when reducing a collection to a value, for instance to a maximum or minimum value. Consider the following piece of code that can be used to determine which is the longest string in a collection. public static String longest(Collection items) { if(items.isEmpty()){ return null; } Iterator iter = items.iterator(); String result = iter.next(); while(iter.hasNext()) { String item = iter.next(); if(item.length() > result.length()){ result = item; } } return result; } In this case the question is what should be returned when the list provided is empty? In this particular case a null value is returned, once again, opening the door for a potential null dereferencing problem. The Functional World Strategy It's interesting that in the functional programming paradigm, the statically-typed programming languages evolved in a different direction. In languages like SML or Haskell there is no such thing as a null value that causes exceptions when dereferenced. These languages provide a special data type capable of holding an optional value and so it can be conveniently used to also express the possible absence of a value. The following piece of code shows the definition of the SML option type: datatype 'a option = NONE | SOME of 'a As you can see, option is a data type with two constructors, one of them stores nothing (i.e. NONE) whereas the other is capable of storing a polymorphic value of some value type 'a (where 'a is just a placeholder for the actual type). Under this model, the piece of code we wrote before in Java, to find a fruit by its name, could be rewritten in SML as follows: fun find(name, fruits) = case fruits of [] => NONE | (Fruit s)::fs => if s = name then SOME (Fruit s) else find(name,fs) There are several ways to achieve this in SML, this example just shows one way to do it. The important point here is that there is no such thing as null, instead a value NONE is returned when nothing is found (3), and a value SOME fruit is returned otherwise (5). When a programmer uses this find method, he knows that it returns an option type value and therefore the programmer is forced to check the nature of the value obtained to see if it is either NONE (6) or SOME fruit (7), somewhat like this: let val fruits = [Fruit "apple", Fruit "grape", Fruit "orange"] val found = find("grape", fruits) in case found of NONE => print("Nothing found") | SOME(Fruit f) => print("Found fruit: " ^ f) end Having to check for the true nature of the returned option makes it impossible to misinterpret the result. Java Optional Types It's a joy that finally in Java 8 we'll have a new class called Optional that allows us to implement a similar idiom as that from the functional world. As in the case of of SML, the Optional type is polymorphic and may contain a value or be empty. So, we could rewrite our previous code snippet as follows: public static Optional find(String name, List fruits) { for(Fruit fruit : fruits) { if(fruit.getName().equals(name)) { return Optional.of(fruit); } } return Optional.empty(); } As you can see, the method now returns an Optional reference (1), if something is found, the Optional object is constructed with a value (4), otherwise is constructed empty (7). And the programmer using this code would do something as follows: List fruits = asList(new Fruit("apple"), new Fruit("grape"), new Fruit("orange")); Optional found = find("lemon", fruits); if(found.isPresent()) { Fruit fruit = found.get(); String name = fruit.getName(); } Now it is made evident in the type of the find method that it returns an optional value (5), and the user of this method has to program his code accordingly (6-7). So we see that the adoption of this functional idiom is likely to make our code safer, less prompt to null dereferencing problems and as a result more robust and less error prone. Of course, it is not a perfect solution because, after all, Optional references can also be erroneously set to null references, but I would expect that programmers stick to the convention of not passing null references where an optional object is expected, pretty much as we today consider a good practice not to pass a null reference where a collection or an array is expected, in these cases the correct is to pass an empty array or collection. The point here is that now we have a mechanism in the API that we can use to make explicit that for a given reference we may not have a value to assign it and the user is forced, by the API, to verify that. Quoting an article I reference later about the use of optional objects in the Guava Collections framework: "Besides the increase in readability that comes from giving null a name, the biggest advantage of Optional is its idiot-proof-ness. It forces you to actively think about the absent case if you want your program to compile at all, since you have to actively unwrap the Optional and address that case". Other Convenient Methods As of the today, besides the static methods of and empty explained above, the Optional class contains the following convenient instance methods: ifPresent() Which returns true if a value is present in the optional. get() Which returns a reference to the item contained in the optional object, if present, otherwise throws a NoSuchElementException. ifPresent(Consumer consumer) Which passess the optional value, if present, to the provided Consumer (which could be implemented through a lambda expression or method reference). orElse(T other) Which returns the value, if present, otherwise returns the value in other. orElseGet(Supplier other) Which returns the value if present, otherwise returns the value provided by the Supplier (which could be implemented with a lambda expression or method reference). orElseThrow(Supplier exceptionSupplier) Which returns the value if present, otherwise throws the exception provided by the Supplier (which could be implemented with a lambda expression or method reference). Avoiding Boilerplate Presence Checks We can use some of the convenient methods mentioned above to avoid the need of having to check if a value is present in the optional object. For instance, we may want to use a default fruit value if nothing is found, let's say that we would like to use a "Kiwi". So we could rewrite our previous code like this: Optional found = find("lemon", fruits); String name = found.orElse(new Fruit("Kiwi")).getName(); In this other example, the code prints the fruit name to the main output, if the fruit is present. In this case, we implement the Consumer with a lambda expression. Optional found = find("lemon", fruits); found.ifPresent(f -> { System.out.println(f.getName()); }); This other piece of code uses a lambda expression to provide a Supplier which can ultimately provide a default answer if the optional object is empty: Optional found = find("lemon", fruits); Fruit fruit = found.orElseGet(() -> new Fruit("Lemon")); Clearly, we can see that these convenient methods simplify a lot having to work with the optional objects. So What's Wrong with Optional? The question we face is: will Optional get rid of null references? And the answer is an emphatic no! So, detractors immediately question its value asking: then what is it good for that we couldn't do by other means already? Unlike functional languages like SML o Haskell which never had the concept of null references, in Java we cannot simply get rid of the null references that have historically existed. This will continue to exist, and they arguably have their proper uses (just to mention an example: three-valued logic). I doubt that the intention with the Optional class is to replace every single nullable reference, but to help in the creation of more robust APIs in which just by reading the signature of a method we could tell if we can expect an optional value or not and force the programmer to use this value accordingly. But ultimately, Optional will be just another reference and subject to same weaknesses of every other reference in the language. It is quite evident that Optional is not going to save the day. How these optional objects are supposed to be used or whether they are valuable or not in Java has been the matter of a heated debate in the project lambda mailing list. From the detractors we hear interesting arguments like: The fact that other alternatives exist ( i.e. the Eclipse IDE supports a set of proprietary annotations for static analysis of nullability, the JSR-305 with annotations like @Nullable and @NonNull). Some would like it to be usable as in the functional world, which is not entirely possible in Java since the language lacks many features existing in functional programming languages like SML or Haskell (i.e. pattern matching). Others argue about how it is impossible to retrofit preexisting code to use this idiom (i.e. List.get(Object)which will continue to return null). And some complain about the fact that the lack of language support for optional values creates a potential scenario in which Optional could be used inconsistently in the APIs, by this creating incompatibilities, pretty much like the ones we will have with the rest of the Java API which cannot be retrofitted to use the new Optional class. A compelling argument is that if the programmer invokes the get method in an optional object, if it is empty, it will raise a NoSuchElementException, which is pretty much the same problem that we have with nulls, just with a different exception. So, it would appear that the benefits of Optional are really questionable and are probably constrained to improving readability and enforcing public interface contracts. Optional Objects in the Stream API Irrespective of the debate, the optional objects are here to stay and they are already being used in the new Stream API in methods like findFirst, findAny, max and min. It could be worth mentioning that a very similar class has been in used in the successful Guava Collections Framework. For instance, consider the following example where we extract from a stream the last fruit name in alphabetical order: Stream fruits = asList(new Fruit("apple"), new Fruit("grape")).stream(); Optional max = fruits.max(comparing(Fruit::getName)); if(max.isPresent()) { String fruitName = max.get().getName(); //grape } Or this another one in which we obtain the first fruit in a stream Stream fruits = asList(new Fruit("apple"), new Fruit("grape")).stream(); Optional first = fruits.findFirst(); if(first.isPresent()) { String fruitName = first.get().getName(); //apple } Ceylon Programming Language and Optional Types Recently I started to play a bit with the Ceylon programming language since I was doing a research for another post that I am planning to publish soon in this blog. I must say I am not a big fan of Ceylon, but still I found particularly interesting that in Ceylon this concept of optional values is taken a bit further, and the language itself offers some syntactic sugar for this idiom. In this language we can mark any type with a ? (question mark) in order to indicate that its type is an optional type. For instance, this find function would be very similar to our original Java version, but this time returning an optional Fruit? reference (1). Also notice that a null value is compatible with the optional Fruit? reference (7). Fruit? find(String name, List fruits){ for(Fruit fruit in fruits) { if(fruit.name == name) { return fruit; } } return null; } And we could use it with this Ceylon code, similar to our last Java snippet in which we used an optional value: List fruits = [Fruit("apple"),Fruit("grape"),Fruit("orange")]; Fruit? fruit = find("lemon", fruits); print((fruit else Fruit("Kiwi")).name); Notice the use of the else keyword here is pretty similar to the method orElse in the Java 8 Optional class. Also notice that the syntax is similar to the declaration of C# nullable types, but it means something totally different in Ceylon. It may be worth mentioning that Kotlin, the programming language under development by Jetbrains, has a similar feature related to null safety (so maybe we are before a trend in programming languages). An alternative way of doing this would have been like this: List fruits = [Fruit("apple"),Fruit("grape"),Fruit("orange")]; Fruit? fruit = find("apple", fruits); if(exists fruit){ String fruitName = fruit.name; print("The found fruit is: " + fruitName); } //else... Notice the use of the exists keyword here (3) serves the same purpose as the isPresent method invocation in the Java Optional class. The great advantage of Ceylon over Java is that they can use this optional type in the APIs since the beginning, within the realm of their language they won't have to deal with incompatibilities, and it can be fully supported everywhere (perhaps their problem will be in their integration with the rest of the Java APIs, but I have not studied this yet). Hopefully, in future releases of Java, this same syntactic sugar from Ceylon and Kotlin will also be made available in the Java programming language, perhaps using, under the hood, this new Optional class introduced in Java 8. Further Reading Java Language Specification The Billion Dollars Mistake Refactoring Catalog Ceylon Programming Language Kotlin Programming Language C# Nullable Types Avoid Using Null (Guava Framework) More Discussion on Java's Optional Java Infinite Streams Java Streams API Preview Java Streams Preview vs .Net High-Order Programming with LINQ
April 17, 2013
by Edwin Dalorzo
·
84,697 Views
·
16 Likes