What's Wrong in Java 8, Part II: Functions & Primitives
Tony Hoare called the invention of the null reference his “billion-dollar mistake”. Maybe the use of primitives in Java could be called the million-dollar mistake. Primitives were created for one reason: performance. Primitives have no place in an object language. The introduction of autoboxing and unboxing was a good thing, but much more should have been done. It probably will be done (it is sometimes said to be on the Java 10 road map). In the meantime, we have to deal with primitives, and this is a hassle, especially when using functions.

Functions in Java 5/6/7

Before Java 8, one could create functions like this:

    public interface Function<T, U> {
      U apply(T t);
    }

    Function<Integer, Integer> addTax = new Function<Integer, Integer>() {
      @Override
      public Integer apply(Integer x) {
        return x / 100 * (100 + 10);
      }
    };

    System.out.println(addTax.apply(100));

This code produces the following result:

    110

What Java 8 gives us is the Function interface and the lambda syntax. We no longer need to define our own functional interface, and we may use the following syntax:

    Function<Integer, Integer> addTax = x -> x / 100 * (100 + 10);
    System.out.println(addTax.apply(100));

Note that in the first example, we used an anonymous class to create a named function. In the second example, using the lambda syntax does not change anything about this. Conceptually, there is still an anonymous implementation of the interface, and a named function.

One interesting question is “What is the type of x?” The type was manifest in the first example. Here, it is inferred from the type of the function. Java knows the function argument is an Integer because the type of the function is explicitly Function<Integer, Integer>. The first Integer is the type of the argument, and the second Integer is the return type. Boxing is automatically used to convert int to Integer and back as needed. More on this later.

Could we use an anonymous function? Yes, but we would have a problem with the type.
This does not work:

    System.out.println((x -> x / 100 * (100 + 10)).apply(100));

which means we can't substitute the identifier addTax with its value (the addTax function). We have to restore the type information that is now missing, because Java 8 is simply not able to infer the type in this case. The most visible thing that has no explicit type here is the identifier x. So we might try:

    System.out.println(((Integer x) -> x / 100 * (100 + 10)).apply(100));

After all, in the first example, we could have written:

    Function<Integer, Integer> addTax = (Integer x) -> x / 100 * (100 + 10);

so it should be enough for Java to infer the type. But this does not work. What we have to do is specify the type of the function. Specifying the type of its argument is not enough, even if the return type may be inferred. And there is a serious reason for this: Java 8 does not know anything about functions. Functions are ordinary objects with ordinary methods that we may call. Nothing more. So we have to specify the type like this:

    System.out.println(((Function<Integer, Integer>) x -> x / 100 * (100 + 10)).apply(100));

Otherwise, it could translate to:

    System.out.println(((Whatever<Integer, Integer>) x -> x / 100 * (100 + 10)).whatever(100));

So the lambda is only syntactic sugar that simplifies implementing the Function (or Whatever) interface, much as an anonymous class would. It has in fact absolutely nothing to do with functions.

Had Java only the Function interface with its apply method, this would not be a big deal.

But what about primitives?

The Function interface would be fine if Java were an object language. But it is not. It is only vaguely oriented toward the use of objects (hence the name Object Oriented). The most important types in Java are the primitives. And primitives do not fit well in OOP.

Autoboxing was introduced in Java 5 to help us deal with this problem, but autoboxing has severe limitations in terms of performance, and this is related to how things are evaluated in Java.
Java is a strict language, so eager evaluation is the rule. The consequence is that each time we have a primitive and need an object, the primitive has to be boxed. And each time we have an object and need a primitive, it has to be unboxed. If we rely upon automatic boxing and unboxing, we may end up with a lot of overhead from repeated boxing and unboxing.

Other languages have solved this problem differently, allowing only objects and dealing with conversion in the background. They may have “value classes”, which are objects that are backed by primitives. With this functionality, programmers only use objects and the compiler only uses primitives (this is oversimplified, but it gives an idea of the principle). By allowing programmers to explicitly manipulate primitives, Java makes things much more difficult and much less safe, because programmers are encouraged to use primitives as business types, which is total nonsense in both OOP and FP. (I will come back to this in another article.)

Let's say it bluntly: we should not care about the overhead of boxing and unboxing. If Java programs using this feature are too slow, the language should be fixed. We should not use bad programming techniques to work around language weaknesses. By using primitives, we make the language work against us, and not for us. If this problem is not solved by fixing the language, we should just use another language. But we probably can't, for a lot of bad reasons, the most important being that we are paid to program in Java and not in any other language. The result is that instead of solving business problems, we find ourselves solving Java problems. And using primitives is a Java problem, and a big one.

Let's rewrite our example using primitives instead of objects. Our function takes an argument of type Integer and returns an Integer. To replace this, Java has the type IntUnaryOperator. Wow, this smells!
And guess what, it is defined as:

    public interface IntUnaryOperator {
      int applyAsInt(int operand);
      ...
    }

It would probably have been too simple to call the method apply. So, our example using primitives may be rewritten as:

    IntUnaryOperator addTax = x -> x / 100 * (100 + 10);
    System.out.println(addTax.applyAsInt(100));

or, using an anonymous function:

    System.out.println(((IntUnaryOperator) x -> x / 100 * (100 + 10)).applyAsInt(100));

If there were only functions of int returning int, this would be simple. But it is much more complex. Java 8 has 43 (functional) interfaces in the java.util.function package. In reality, they do not all represent functions. They can be grouped as follows:

- 21 are one-argument functions, among which 2 are functions of object returning object and 19 are various cases of object to primitive and primitive to object functions. One of the two object to object functions is for the specific case when both argument and return value are of the same type.
- 9 are two-argument functions, among which 2 are functions of (object, object) to object, and 7 are various cases of (object, object) to primitive or (primitive, primitive) to primitive.
- 7 are effects, and not functions, since they do not return any value and are supposed to be used only for their side effect. (It's somewhat strange to call these “functional interfaces”.)
- 5 are “suppliers”, which means functions that do not take an argument but return a value. These could be functions. In the functional world, these are special functions called nullary functions (to indicate that their arity, or number of arguments, is zero). As functions, their return value may never change, so they allow treating constants as functions. In Java 8, their role is to depend upon mutable context to return variable values. So, they are not functions.

What a mess! And furthermore, the methods of these interfaces have different names.
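As a rough illustration of the naming differences just mentioned, here is a minimal sketch that calls one interface from several of these groups. The class and constant names (MethodNamesDemo, ADD_TAX_BOXED, and so on) are hypothetical and chosen only for this example; the interfaces and their methods are the ones found in java.util.function.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Function;
import java.util.function.IntPredicate;
import java.util.function.IntSupplier;
import java.util.function.IntUnaryOperator;
import java.util.function.Supplier;

// Hypothetical demo class: only the java.util.function types are JDK API.
class MethodNamesDemo {

    // Object to object: the method is named apply.
    static final Function<Integer, Integer> ADD_TAX_BOXED = x -> x / 100 * (100 + 10);

    // int to int: the method is named applyAsInt.
    static final IntUnaryOperator ADD_TAX = x -> x / 100 * (100 + 10);

    // int to boolean: the method is named test.
    static final IntPredicate OVER_LIMIT = x -> x > 500;

    // Suppliers: get, getAsInt and getAsBoolean.
    static final Supplier<Integer> BOXED_RATE = () -> 10;
    static final IntSupplier RATE = () -> 10;
    static final BooleanSupplier TAXABLE = () -> true;

    public static void main(String[] args) {
        System.out.println(ADD_TAX_BOXED.apply(100));   // prints 110
        System.out.println(ADD_TAX.applyAsInt(100));    // prints 110
        System.out.println(OVER_LIMIT.test(110));       // prints false
        System.out.println(BOXED_RATE.get());           // prints 10
        System.out.println(RATE.getAsInt());            // prints 10
        System.out.println(TAXABLE.getAsBoolean());     // prints true
    }
}
```

Six lambdas of comparable shape end up being called through six different method names, which is exactly the irregularity discussed here.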
Object functions have a method named apply, whereas functions returning numeric primitives have a method named applyAsInt, applyAsLong, or applyAsDouble. Functions returning boolean have a method called test, and suppliers have methods called get, or getAsInt, getAsLong, getAsDouble, or getAsBoolean. (They did not dare call BooleanSupplier “Predicate” with a test method taking no argument. I really wonder why!)

One thing to note is that there are no functions for byte, char, short and float. Nor are there functions of arity greater than two.

Needless to say, this is totally ridiculous. But we have to stick with it. As long as Java can infer the type, we may think we have no problem. However, if you want to manipulate functions in a functional way, you will soon face the problem of Java being unable to infer a type. Worse, Java will sometimes infer a type and stay silent while using a type which is not the one you intended.

How to help discovering the right type

Let's say we want to use a three-argument function. As there is no such functional interface in Java 8, you are left with a choice: create your own functional interface, or use currying, as we have seen in a previous article (What's wrong with Java 8 part I). Creating a functional interface taking three object arguments and returning an object is straightforward:

    interface Function<T, U, V, R> {
      R apply(T t, U u, V v);
    }

However, we may face two problems. The first one is that we may need to process primitives. Parametric types will not help us with this. You may create special versions of the function using primitives instead of objects. After all, with eight types of primitives plus Object, three arguments and one return value, there are only 6,561 different versions of this function (9 x 9 x 9 x 9). Why do you think Oracle did not put TriFunction in Java 8?
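The two options just described, a custom three-argument interface and currying, can be put side by side in a minimal sketch. The names Function3 and Function3Demo are hypothetical (the JDK provides no such interface), and the x + y * z calculation is only an illustration.

```java
import java.util.function.Function;

// Hypothetical demo class comparing a custom three-argument interface
// with the curried form built from the standard Function interface.
class Function3Demo {

    // Hypothetical interface: not part of java.util.function.
    @FunctionalInterface
    interface Function3<T, U, V, R> {
        R apply(T t, U u, V v);
    }

    // Three object arguments taken at once.
    static final Function3<Integer, Integer, Integer, Integer> CALC =
            (x, y, z) -> x + y * z;

    // The same calculation, curried: one argument per application.
    static final Function<Integer, Function<Integer, Function<Integer, Integer>>> CURRIED =
            x -> y -> z -> x + y * z;

    public static void main(String[] args) {
        System.out.println(CALC.apply(2, 3, 4));                 // prints 14
        System.out.println(CURRIED.apply(2).apply(3).apply(4));  // prints 14
    }
}
```

Both versions rely on autoboxing. A primitive variant of Function3 would need one interface per combination of argument and return types, which is where the combinatorial explosion comes from.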
(To be precise, Oracle only included a very limited number of BiFunction variants, where arguments are Object and the return type is int, long or double, or where argument and return types are of the same type int, long, double or Object, leading to a total of 9 out of 729 possible.)

A much better solution is to use autoboxing. Just use Integer, Long, Boolean and so on and let Java handle this. Doing anything else would be the root of all evil, i.e. premature optimization (see http://c2.com/cgi/wiki?PrematureOptimization).

Another way to go (besides creating a three-argument functional interface) is to use currying. This is mandatory if the arguments may not be evaluated at the same time. Furthermore, it allows using only functions of one argument, which limits the number of possible functions to 81. If we restrict ourselves to boolean, int, long and double, the number falls to 25 (four primitive types plus Object in two places equals 5 x 5).

The problem is that it may be somewhat difficult to use currying with functions returning primitives or taking primitives as their argument. As an example, here is the same example used in our previous article (What's wrong with Java 8 part I), but using primitives:

    IntFunction<IntFunction<IntUnaryOperator>> intToIntCalculation = x -> y -> z -> x + y * z;

    private IntStream calculate(IntStream stream, int a) {
      return stream.map(intToIntCalculation.apply(b).apply(a));
    }

    IntStream stream = IntStream.of(1, 2, 3, 4, 5);
    IntStream newStream = calculate(stream, 3);

(Here b is assumed to be an int field of the enclosing class; the result shown below corresponds to b = 2.)

Note that the result is not “a stream containing the values 5, 8, 11, 14 and 17”, any more than the initial stream contained the values 1, 2, 3, 4 and 5. newStream is not evaluated at this stage, so it does not contain values. (We'll talk about this in a later article.)

To see the result, we have to evaluate the stream, which may be forced by binding it to a terminal operation. This may be done through a call to the collect method.
But before doing this, we will bind the result to one more non-terminal function using the method boxed. The boxed method binds to the stream a function converting primitives to the corresponding objects. This will simplify evaluation:

    System.out.println(newStream.boxed().collect(toList()));

This prints:

    [5, 8, 11, 14, 17]

We could as well use an anonymous function. However, Java is not able to infer the type, so we must help it:

    private IntStream calculate(IntStream stream, int a) {
      return stream.map(((IntFunction<IntFunction<IntUnaryOperator>>) x -> y -> z -> x + y * z).apply(b).apply(a));
    }

    IntStream stream = IntStream.of(1, 2, 3, 4, 5);
    IntStream newStream = calculate(stream, 3);

Currying in itself is very easy. Just remember, as I said in a previous article, that:

    (x, y, z) -> w

translates to

    x -> y -> z -> w

Finding the right type is slightly more complicated. You have to remember that each time you apply an argument, you are returning a function, so you need a function from the type of the argument to an object type (because functions are objects). Here, each argument is of type int, so we need to use IntFunction parameterized with the type of the returned function.
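To make “each application returns a function” concrete, here is a minimal sketch that names the intermediate function after each apply. The class CurriedIntDemo and the variables withB and withBandA are hypothetical, and b is passed as a parameter here (an assumption made to keep the example self-contained) rather than read from a field.

```java
import java.util.List;
import java.util.function.IntFunction;
import java.util.function.IntUnaryOperator;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical demo class showing the type obtained after each application.
class CurriedIntDemo {

    // Two explicitly applied int arguments, then the operator handed to map.
    static final IntFunction<IntFunction<IntUnaryOperator>> CALC =
            x -> y -> z -> x + y * z;

    static List<Integer> calculate(IntStream stream, int b, int a) {
        // Applying the first argument returns a function of the second one.
        IntFunction<IntUnaryOperator> withB = CALC.apply(b);
        // Applying the second argument returns the int -> int operator.
        IntUnaryOperator withBandA = withB.apply(a);
        // boxed() converts to Stream<Integer> so the result can be collected.
        return stream.map(withBandA).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // With b = 2 and a = 3, each element z becomes 2 + 3 * z.
        System.out.println(calculate(IntStream.of(1, 2, 3, 4, 5), 2, 3)); // prints [5, 8, 11, 14, 17]
    }
}
```

Naming the intermediate values is only a reading aid: the chained form CALC.apply(b).apply(a) produces the same IntUnaryOperator.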
As the final type is IntUnaryOperator (as required by the map method of the IntStream class), three explicitly applied int arguments would give:

    IntFunction<IntFunction<IntFunction<IntUnaryOperator>>>

Here, we are explicitly applying only two of the three parameters (the third one is the operand supplied by map), and all parameters are of type int, so the type is:

    IntFunction<IntFunction<IntUnaryOperator>>

This may be compared to the version using autoboxing:

    Function<Integer, Function<Integer, Function<Integer, Integer>>>

If you have problems determining the right type, start with the version using autoboxing, just replacing the final type with the one you know you need (since it is the type of the argument of map):

    Function<Integer, Function<Integer, IntUnaryOperator>>

Note that you may perfectly well use this type in your program:

    private IntStream calculate(IntStream stream, int a) {
      return stream.map(((Function<Integer, Function<Integer, IntUnaryOperator>>) x -> y -> z -> x + y * z).apply(b).apply(a));
    }

    IntStream stream = IntStream.of(1, 2, 3, 4, 5);
    IntStream newStream = calculate(stream, 3);

You may then replace each Function<Integer, ...> with the corresponding IntFunction<...>, one at a time, changing it for example to:

    private IntStream calculate(IntStream stream, int a) {
      return stream.map(((Function<Integer, IntFunction<IntUnaryOperator>>) x -> y -> z -> x + y * z).apply(b).apply(a));
    }

and then to:

    private IntStream calculate(IntStream stream, int a) {
      return stream.map(((IntFunction<IntFunction<IntUnaryOperator>>) x -> y -> z -> x + y * z).apply(b).apply(a));
    }

Note that all three versions compile and run. The only difference is whether autoboxing is used or not.

When to be anonymous

So, as we saw in the examples above, lambdas are very good at simplifying anonymous class creation, but there is rarely a good reason not to name the instance that is created. Naming functions allows:

- function reuse
- function testing
- function replacement
- program maintenance
- program documentation

Naming functions plus currying will make your functions completely independent from the environment (“referential transparency”), making your programs safer and more modular.

There is however a difficulty. Using primitives makes it difficult to figure out the type of a curried function. And worse, primitives are not the right business types to use, so the compiler will not be able to help you in this area.
To see why, look at this example:

    double tax = 10.24;
    double limit = 500.0;
    double delivery = 35.50;
    DoubleStream stream3 = DoubleStream.of(234.23, 567.45, 344.12, 765.00);
    DoubleStream stream4 = stream3.map(x -> {
      double total = x / 100 * (100 + tax);
      if (total > limit) {
        total = total + delivery;
      }
      return total;
    });

To replace the anonymous “capturing” function with a named curried one, determining the correct type is not so difficult. There will be four arguments and it will return a DoubleUnaryOperator, so the type will be DoubleFunction<DoubleFunction<DoubleFunction<DoubleUnaryOperator>>>. However, it is very easy to misplace the arguments:

    DoubleFunction<DoubleFunction<DoubleFunction<DoubleUnaryOperator>>> computeTotal = x -> y -> z -> w -> {
      double total = w / 100 * (100 + x);
      if (total > y) {
        total = total + z;
      }
      return total;
    };

    DoubleStream stream2 = stream.map(computeTotal.apply(tax).apply(limit).apply(delivery));

How can you be sure what x, y, z and w are? There is in fact a simple rule: the arguments that are evaluated through the explicit use of the apply method come first, in the order they are applied, i.e. tax, limit, delivery, corresponding to x, y and z. The argument coming from the stream is applied last, so it corresponds to w.

However, we still have a problem: once the function is tested, we know that it is correct, but there is no way to be sure it will be used correctly. For example, if we apply the parameters in the wrong order:

    DoubleStream stream2 = stream.map(computeTotal.apply(limit).apply(tax).apply(delivery));

we get

    [1440.8799999999999, 3440.2000000000003, 2100.2200000000003, 4625.5]

instead of:

    [258.215152, 661.05688, 379.357888, 878.836]

This means we have to test not only the function, but each use of it. Wouldn't it be nice if we could be sure that using the parameters in the wrong order would not compile?

This is what using the right type system is about. Using primitives for business types is not good. It never has been. But now, with functions, we have one more reason not to do this. This will be the subject of another article.
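The argument-order pitfall described in this section can be reproduced with a small self-contained sketch. The class ComputeTotalDemo and the helper totals are hypothetical names introduced for the example; the curried function and the input values are those used above. Exact printed digits depend on floating-point rounding, so none are asserted in the comments.

```java
import java.util.Arrays;
import java.util.function.DoubleFunction;
import java.util.function.DoubleUnaryOperator;
import java.util.stream.DoubleStream;

// Hypothetical demo class: both call orders compile, only one is right.
class ComputeTotalDemo {

    // x = tax rate, y = limit, z = delivery fee, w = price from the stream.
    static final DoubleFunction<DoubleFunction<DoubleFunction<DoubleUnaryOperator>>> COMPUTE_TOTAL =
            x -> y -> z -> w -> {
                double total = w / 100 * (100 + x);
                if (total > y) {
                    total = total + z;
                }
                return total;
            };

    // Hypothetical helper: applies the three arguments in the order given.
    static double[] totals(double first, double second, double third, double... prices) {
        return DoubleStream.of(prices)
                .map(COMPUTE_TOTAL.apply(first).apply(second).apply(third))
                .toArray();
    }

    public static void main(String[] args) {
        double tax = 10.24;
        double limit = 500.0;
        double delivery = 35.50;
        double[] prices = {234.23, 567.45, 344.12, 765.00};

        // Intended order: tax, limit, delivery.
        System.out.println(Arrays.toString(totals(tax, limit, delivery, prices)));
        // Swapped order: still compiles, because every argument is a double.
        System.out.println(Arrays.toString(totals(limit, tax, delivery, prices)));
    }
}
```

Since all four parameters share the type double, the compiler has no way to tell the two calls apart, so only a test of each call site can catch the swap.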
What's next

We have seen how using primitives is somewhat more complicated than using objects. Functions using primitives are a real mess in Java 8. But the worst is yet to come. In the next article, we will talk about using primitives with streams.
May 5, 2014
by Pierre-Yves Saumont