What Are Reactive Streams in Java?
In this post we take an in-depth look at Reactive programming and streams in Java. With all the hype surrounding reactive systems, this post is worth the read.
If you’re following the Java community, you may be hearing about Reactive Streams in Java. It seems like at all the major tech conferences, you’re seeing presentations on Reactive Programming. Last year, the buzz was all about functional programming; this year, the buzz is about Reactive Programming.
So, is the attention span of the Java community that short-lived? Have we Java developers forgotten about functional programming and moved on to Reactive programming?
Not exactly. Actually, the Functional Programming paradigm complements the Reactive Programming paradigm very nicely.
You don’t need to use the Functional Programming paradigm to follow Reactive Programming. You could, at least in principle, use the good old imperative programming paradigm Java developers have traditionally used. You’d be creating a lot of headaches for yourself if you did. (Just because you can do something does not mean you should!)
Functional programming is important to Reactive Programming, but I’m not diving into functional programming in this post.
In this post, I want to look at the overall Reactive landscape in Java.
Reactive Programming vs. Reactive Streams
With these new buzz words, it’s very easy to get confused about their meaning.
Reactive programming is a programming paradigm, but I wouldn’t call it new. It’s actually been around for a while.
Just like object-oriented programming, functional programming, or procedural programming, reactive programming is just another programming paradigm.
Reactive Streams, on the other hand, is a specification. For Java programmers, Reactive Streams is an API. Reactive Streams gives us a common API for Reactive Programming in Java.
The Reactive Streams API is the product of a collaboration between engineers from Kaazing, Netflix, Pivotal, Red Hat, Twitter, Typesafe, and many others.
Reactive Streams is much like JPA or JDBC. Both are API specifications, and for both you need to use an implementation of the API specification.
For example, from the JDBC specification, you have the Java DataSource interface. The Oracle JDBC implementation will provide you an implementation of the DataSource interface, just as Microsoft’s SQL Server JDBC implementation will also provide an implementation of the DataSource interface.
Now your higher-level programs can accept the DataSource object and should be able to work with the data source, and not need to worry if it was provided by Oracle or provided by Microsoft.
Just like JPA or JDBC, Reactive Streams gives us an API interface we can code to without needing to worry about the underlying implementation.
Reactive Programming
There are plenty of opinions around what Reactive programming is. There is plenty of hype around Reactive programming, too!
The best place to start learning about the Reactive Programming paradigm is the Reactive Manifesto. The Reactive Manifesto is a prescription for building modern, cloud-scale architectures.
Reactive Manifesto
The Reactive Manifesto describes four key attributes of reactive systems:
Responsive
The system responds in a timely manner if at all possible. Responsiveness is the cornerstone of usability and utility, but more than that, responsiveness means that problems may be detected quickly and dealt with effectively. Responsive systems focus on providing rapid and consistent response times, establishing reliable upper bounds so they deliver a consistent quality of service. This consistent behaviour in turn simplifies error handling, builds end user confidence, and encourages further interaction.
Resilient
The system stays responsive in the face of failure. This applies not only to highly-available, mission critical systems — any system that is not resilient will be unresponsive after a failure. Resilience is achieved by replication, containment, isolation and delegation. Failures are contained within each component, isolating components from each other and thereby ensuring that parts of the system can fail and recover without compromising the system as a whole. Recovery of each component is delegated to another (external) component and high-availability is ensured by replication where necessary. The client of a component is not burdened with handling its failures.
Elastic
The system stays responsive under varying workload. Reactive Systems can react to changes in the input rate by increasing or decreasing the resources allocated to service these inputs. This implies designs that have no contention points or central bottlenecks, resulting in the ability to shard or replicate components and distribute inputs among them. Reactive Systems support predictive, as well as Reactive, scaling algorithms by providing relevant live performance measures. They achieve elasticity in a cost-effective way on commodity hardware and software platforms.
Message Driven
Reactive Systems rely on asynchronous message-passing to establish a boundary between components that ensures loose coupling, isolation and location transparency. This boundary also provides the means to delegate failures as messages. Employing explicit message-passing enables load management, elasticity, and flow control by shaping and monitoring the message queues in the system and applying back-pressure when necessary. Location transparent messaging as a means of communication makes it possible for the management of failure to work with the same constructs and semantics across a cluster or within a single host. Non-blocking communication allows recipients to only consume resources while active, leading to less system overhead.
The first three attributes (Responsive, Resilient, Elastic) are more related to your architecture choices. It’s easy to see why technologies such as microservices, Docker, and Kubernetes are important aspects of Reactive systems. Running a LAMP stack on a single server clearly does not meet the objectives of the Reactive Manifesto.
Message Driven and Reactive Programming
As Java developers, it’s the last attribute, the Message Driven attribute, that interests us most.
Message-driven architectures are certainly nothing revolutionary. If you need a primer on message driven systems, I’d like to suggest reading Enterprise Integration Patterns — a truly iconic computer science book. The concepts in this book laid the foundations for Spring Integration and Apache Camel.
A few aspects of the Reactive Manifesto that do interest us Java developers are failures as messages, back-pressure, and non-blocking. These are subtle but important aspects of Reactive programming in Java.
Failures as Messages
Often in Reactive programming, you will be processing a stream of messages. What is undesirable is to throw an exception and end the processing of the stream of messages.
The preferred approach is to gracefully handle the failure.
Maybe you needed to call a web service and it was down. Maybe there is a backup service you can use? Or maybe you can retry in 10 ms?
I’m not going to solve every edge case here. The key takeaway is you do not want to loudly fail with a runtime exception. Ideally, you want to note the failure, and have some type of retry or recovery logic in place.
Often, failures are handled with callbacks. JavaScript developers are well accustomed to using callbacks.
But callbacks can get ugly to use. JavaScript developers refer to this as callback hell.
In Reactive Streams, exceptions are first-class citizens. Exceptions are not rudely thrown. Error handling is built right into the Reactive Streams API specification.
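To make the idea concrete, here is a minimal sketch of a failure travelling downstream as a signal. It assumes Java 9 or later and uses the JDK's java.util.concurrent.Flow interfaces, which mirror the Reactive Streams ones. ParsingPublisher and FailureAsMessageDemo are hypothetical names made up for this illustration, and the publisher is deliberately simplified rather than fully specification-compliant.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

class FailureAsMessageDemo {

    /** Hypothetical publisher: parses each input and reports a bad one via onError. */
    static class ParsingPublisher implements Flow.Publisher<Integer> {
        private final List<String> inputs;

        ParsingPublisher(List<String> inputs) {
            this.inputs = inputs;
        }

        @Override
        public void subscribe(Flow.Subscriber<? super Integer> subscriber) {
            subscriber.onSubscribe(new Flow.Subscription() {
                private int index = 0;
                private boolean done = false;

                @Override
                public void request(long n) {
                    // Simplified: no validation of n, no guard against deep re-entrancy.
                    for (long i = 0; i < n && !done; i++) {
                        if (index == inputs.size()) {
                            done = true;
                            subscriber.onComplete();
                            return;
                        }
                        int value;
                        try {
                            value = Integer.parseInt(inputs.get(index++));
                        } catch (NumberFormatException e) {
                            done = true;
                            subscriber.onError(e); // the failure is a signal, not a thrown exception
                            return;
                        }
                        subscriber.onNext(value);
                    }
                }

                @Override
                public void cancel() {
                    done = true;
                }
            });
        }
    }

    /** Subscribes to the publisher and returns the signals that were received, in order. */
    static List<String> run(List<String> inputs) {
        List<String> events = new ArrayList<>();
        new ParsingPublisher(inputs).subscribe(new Flow.Subscriber<Integer>() {
            @Override public void onSubscribe(Flow.Subscription s) { s.request(Long.MAX_VALUE); }
            @Override public void onNext(Integer item) { events.add("next:" + item); }
            @Override public void onError(Throwable t) {
                // Retry or fallback logic would go here instead of just noting the failure.
                events.add("error:" + t.getClass().getSimpleName());
            }
            @Override public void onComplete() { events.add("complete"); }
        });
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("1", "2", "oops", "4")));
    }
}
```

The caller of subscribe never sees an exception. The bad input shows up in onError, which is where retry or recovery logic would naturally live.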
Back Pressure
Have you ever heard of the phrase “Drinking from the firehose”?
Back pressure is a very important concept in Reactive programming. It gives downstream clients a way to say, "I’d like some more, please."
Imagine if you’re making a query of a database and the result set returns back 10 million rows. Traditionally, the database will vomit out all 10 million rows as fast as the client will accept them.
When the client can’t accept any more, it blocks. And the database anxiously awaits. Blocked. The threads in the chain patiently wait to be unblocked.
In a Reactive world, we want our clients empowered to say, "Give me the first 1,000." Then we can give them 1,000 and continue about our business until the client comes back and asks for another set of records.
This is a sharp contrast to traditional systems where the client has no say. Throttling is done by blocking threads, not programmatically.
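Here is a rough sketch of that conversation in code, assuming Java 9 or later and the JDK's java.util.concurrent.Flow interfaces. RangePublisher and BatchSubscriber are hypothetical classes written for this post, a batch of 2 stands in for the 1,000 rows above, and the publisher skips checks a production implementation would need (such as rejecting non-positive requests).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Flow;

class BackPressureDemo {

    /** Hypothetical publisher emitting 1..count, but never more than was requested. */
    static class RangePublisher implements Flow.Publisher<Integer> {
        private final int count;
        private final List<String> log;

        RangePublisher(int count, List<String> log) {
            this.count = count;
            this.log = log;
        }

        @Override
        public void subscribe(Flow.Subscriber<? super Integer> subscriber) {
            subscriber.onSubscribe(new Flow.Subscription() {
                private int next = 1;
                private long demand = 0;
                private boolean emitting = false;
                private boolean done = false;

                @Override
                public void request(long n) {
                    if (done) return;
                    log.add("request(" + n + ")");
                    demand += n;
                    if (emitting) return; // re-entrant call: the outer loop picks up the new demand
                    emitting = true;
                    while (demand > 0 && !done) {
                        if (next > count) {
                            done = true;
                            subscriber.onComplete();
                            break;
                        }
                        demand--;
                        subscriber.onNext(next++);
                    }
                    emitting = false;
                }

                @Override
                public void cancel() {
                    done = true;
                }
            });
        }
    }

    /** Asks for one batch, then asks again only after the whole batch has been consumed. */
    static class BatchSubscriber implements Flow.Subscriber<Integer> {
        private final int batchSize;
        private final List<String> log;
        private Flow.Subscription subscription;
        private int receivedInBatch = 0;

        BatchSubscriber(int batchSize, List<String> log) {
            this.batchSize = batchSize;
            this.log = log;
        }

        @Override public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(batchSize);
        }
        @Override public void onNext(Integer item) {
            log.add("onNext(" + item + ")");
            if (++receivedInBatch == batchSize) {
                receivedInBatch = 0;
                subscription.request(batchSize);
            }
        }
        @Override public void onError(Throwable t) { log.add("onError"); }
        @Override public void onComplete() { log.add("onComplete"); }
    }

    static List<String> run(int count, int batchSize) {
        List<String> log = new ArrayList<>();
        new RangePublisher(count, log).subscribe(new BatchSubscriber(batchSize, log));
        return log;
    }

    public static void main(String[] args) {
        run(5, 2).forEach(System.out::println);
    }
}
```

In the log, every pair of onNext lines is preceded by a request(2) line. The subscriber sets the pace, and no thread is blocked to slow the publisher down.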
Non-Blocking
The final, and perhaps most important, aspect of Reactive architectures important to us Java developers is non-blocking.
Until Reactive came along, being non-blocking didn’t seem like that big of a deal.
As Java developers, we’ve been taught to take advantage of the powerful modern hardware by using threads. More and more cores meant we could use more and more threads. Thus, if we needed to wait on the database or a web service to return, a different thread could utilize the CPU. This seemed to make sense to us. While our blocked thread waited on some type of I/O, a different thread could use the CPU.
So, blocking is no big deal. Right?
Well, not so much. Each thread in the system will consume resources. Each time a thread is blocked, resources are consumed. While the CPU is very efficient at servicing different threads, there is still a cost involved.
We Java developers can be an arrogant bunch.
We’ve always looked down upon JavaScript. Kind of a nasty little language, preferred by script kiddies. Just the fact JavaScript shared the word ‘Java’ always made us Java programmers feel a bit dirty.
If you’re a Java developer, how many times have you felt annoyed when you have to point out that Java and JavaScript are two different languages?
Then Node.js came along.
And Node.js put up crazy benchmarks in throughput.
And then the Java community took notice.
Yep, the script kiddies had grown up and were encroaching on our turf.
It wasn’t that JavaScript running in Google’s V8 JavaScript engine was some blazing fast godsend to programming. Java used to have its warts in terms of performance, but it’s pretty efficient, even compared to modern native languages.
The secret sauce of Node.js’s performance was non-blocking.
Node.js uses an event loop with a limited number of threads. While blocking in the Java world is often viewed as no big deal, in the Node.js world, it would be the kiss of death to performance.
These graphics can help you visualize the difference.
In Node.js, there is a non-blocking event loop. Requests are processed in a non-blocking manner. Threads do not get stuck waiting for other processes.
Contrast the Node.js model to the typical multi-threaded server used in Java. Concurrency is achieved through the use of multiple threads, an approach that is generally accepted due to the growth of multi-core processors.
I personally envision the difference between the two approaches as the difference between a super highway and lots of city streets with lights.
With a single-threaded event loop, your process is cruising quickly along on a super highway. In a multi-threaded server, your process is stuck on city streets in stop-and-go traffic.
Both can move a lot of traffic. But, I’d rather be cruising at highway speeds!
When you move to a non-blocking paradigm, your code stays on the CPU longer. There is less switching of threads. You’re removing the overhead not only of managing many threads, but also of context switching between threads.
You will see more headroom in the system capacity for your program to utilize.
Non-blocking is not a performance holy grail. You’re not going to see things run a ton faster.
Yes, there is a cost to managing blocking. But all things considered, it is relatively efficient.
In fact, on a moderately utilized system, I’m not sure how measurable the difference would be.
But what you can expect to see, as your system load increases, is that you will have additional capacity to service more requests. You will achieve greater concurrency.
How much?
Good question. Use cases are very specific. As with all benchmarks, your mileage will vary.
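To give a feel for the non-blocking style in plain Java, here is a small sketch. It is not a benchmark. It assumes a single-threaded ScheduledExecutorService is an acceptable stand-in for an event loop and that a timer delay is an acceptable stand-in for slow I/O. NonBlockingDemo and fetchAsync are hypothetical names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class NonBlockingDemo {

    /** Simulates a slow call: the result arrives later, and no thread is parked while waiting. */
    static CompletableFuture<String> fetchAsync(ScheduledExecutorService loop, int id, long delayMillis) {
        CompletableFuture<String> response = new CompletableFuture<>();
        loop.schedule(() -> { response.complete("response-" + id); }, delayMillis, TimeUnit.MILLISECONDS);
        return response;
    }

    /** Starts all requests on ONE worker thread and collects the responses. */
    static List<String> run(int requests, long delayMillis) {
        ScheduledExecutorService loop = Executors.newSingleThreadScheduledExecutor();
        try {
            List<CompletableFuture<String>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                pending.add(fetchAsync(loop, i, delayMillis));
            }
            // Only this demo driver waits here; the worker thread itself never blocks on a request.
            CompletableFuture.allOf(pending.toArray(new CompletableFuture<?>[0])).join();
            List<String> responses = new ArrayList<>();
            for (CompletableFuture<String> f : pending) {
                responses.add(f.join());
            }
            return responses;
        } finally {
            loop.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(50, 100).size() + " responses handled by one worker thread");
    }
}
```

Fifty simulated waits overlap on a single worker thread. A thread-per-request design would have fifty threads parked for the same work.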
The Reactive Streams API
Let’s take a look at the Reactive Streams API for Java. The Reactive Streams API consists of just four interfaces.
Publisher
A publisher is a provider of a potentially unbounded number of sequenced elements, publishing them according to the demand received from its Subscribers.
public interface Publisher<T> {
    public void subscribe(Subscriber<? super T> s);
}
Subscriber
A Subscriber will receive a call to Subscriber.onSubscribe(Subscription) once after an instance of Subscriber is passed to Publisher.subscribe(Subscriber).
public interface Subscriber<T> {
    public void onSubscribe(Subscription s);
    public void onNext(T t);
    public void onError(Throwable t);
    public void onComplete();
}
Subscription
A Subscription represents a one-to-one lifecycle of a Subscriber subscribing to a Publisher.
public interface Subscription {
    public void request(long n);
    public void cancel();
}
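As a quick illustration of the two Subscription methods, the sketch below has a subscriber pull one item at a time with request(1) and walk away with cancel() once it has enough. It assumes Java 9 or later, where the same interfaces live under java.util.concurrent.Flow and the JDK ships SubmissionPublisher as a ready-made publisher. SubscriptionDemo and TakeSubscriber are hypothetical names for this example.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

class SubscriptionDemo {

    /** Hypothetical subscriber that takes at most 'limit' items, then cancels. */
    static class TakeSubscriber implements Flow.Subscriber<Integer> {
        private final int limit;
        final List<Integer> received = new CopyOnWriteArrayList<>();
        final CountDownLatch finished = new CountDownLatch(1);
        private Flow.Subscription subscription;

        TakeSubscriber(int limit) {
            this.limit = limit;
        }

        @Override public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1); // ask for the first item only
        }
        @Override public void onNext(Integer item) {
            received.add(item);
            if (received.size() == limit) {
                subscription.cancel(); // no further signals are wanted
                finished.countDown();
            } else {
                subscription.request(1); // ready for one more
            }
        }
        @Override public void onError(Throwable t) { finished.countDown(); }
        @Override public void onComplete() { finished.countDown(); }
    }

    static List<Integer> run(int limit) {
        TakeSubscriber subscriber = new TakeSubscriber(limit);
        SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>();
        publisher.subscribe(subscriber);
        for (int i = 1; i <= 10; i++) {
            publisher.submit(i);
        }
        publisher.close();
        try {
            subscriber.finished.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return subscriber.received;
    }

    public static void main(String[] args) {
        System.out.println(run(3));
    }
}
```

Because the subscriber never has more than one item of outstanding demand, cancelling after the third item means the remaining seven are never delivered to it.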
Processor
A Processor represents a processing stage — which is both a Subscriber and a Publisher and obeys the contracts of both.
public interface Processor<T, R> extends Subscriber<T>, Publisher<R> {
}
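To show how the four interfaces fit together, here is a sketch of a simple mapping Processor. It follows the pattern used in the JDK's own SubmissionPublisher documentation and assumes Java 9 or later with the java.util.concurrent.Flow versions of the interfaces. MapProcessor, Collector, and ProcessorDemo are hypothetical names for this post, not part of any library.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

class ProcessorDemo {

    /** Subscriber on one side, publisher on the other: applies a function to each item. */
    static class MapProcessor<T, R> extends SubmissionPublisher<R> implements Flow.Processor<T, R> {
        private final Function<? super T, ? extends R> mapper;
        private Flow.Subscription upstream;

        MapProcessor(Function<? super T, ? extends R> mapper) {
            this.mapper = mapper;
        }

        @Override public void onSubscribe(Flow.Subscription s) {
            upstream = s;
            s.request(1);
        }
        @Override public void onNext(T item) {
            submit(mapper.apply(item)); // republish the transformed item downstream
            upstream.request(1);
        }
        @Override public void onError(Throwable t) { closeExceptionally(t); }
        @Override public void onComplete() { close(); }
    }

    /** Terminal subscriber that simply gathers what it receives. */
    static class Collector<T> implements Flow.Subscriber<T> {
        final List<T> items = new CopyOnWriteArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);

        @Override public void onSubscribe(Flow.Subscription s) { s.request(Long.MAX_VALUE); }
        @Override public void onNext(T item) { items.add(item); }
        @Override public void onError(Throwable t) { done.countDown(); }
        @Override public void onComplete() { done.countDown(); }
    }

    static List<Integer> run() {
        Collector<Integer> collector = new Collector<>();
        MapProcessor<Integer, Integer> timesTen = new MapProcessor<>(x -> x * 10);
        SubmissionPublisher<Integer> source = new SubmissionPublisher<>();

        // Wire the pipeline from the end backwards: source -> timesTen -> collector.
        timesTen.subscribe(collector);
        source.subscribe(timesTen);

        for (int i = 1; i <= 3; i++) {
            source.submit(i);
        }
        source.close(); // onComplete flows through the processor to the collector

        try {
            collector.done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return collector.items;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In practice you would rarely write a processor like this by hand. Libraries such as RxJava and Reactor provide map-style operators that do the same job.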
Reactive Streams Implementations for Java
The reactive landscape in Java is evolving and maturing. David Karnok has a great blog post on Advanced Reactive Java, in which he breaks down the various reactive projects into generations. I’ll note the generation of each below (which may change at any time with a new release).
RxJava
RxJava is the Java implementation out of the ReactiveX project. At the time of writing, the ReactiveX project had implementations for Java, JavaScript, .NET (C#), Scala, Clojure, C++, Ruby, Python, PHP, Swift, and several others.
ReactiveX provides a reactive twist on the GoF Observer pattern, which is a nice approach. ReactiveX calls their approach ‘Observer Pattern Done Right’.
ReactiveX is a combination of the best ideas from the Observer pattern, the Iterator pattern, and functional programming.
RxJava predates the Reactive Streams specification. While RxJava 2.0+ does implement the Reactive Streams API specification, you’ll notice a slight difference in terminology.
David Karnok, who is a key committer on RxJava, considers RxJava a 3rd-generation reactive library.
Reactor
Reactor is a Reactive Streams-compliant implementation from Pivotal. As of Reactor 3.0, Java 8 or above is a requirement.
The reactive functionality found in Spring Framework 5 is built upon Reactor 3.0.
Reactor is a 4th-generation reactive library. (David Karnok is also a committer on project Reactor.)
Akka Streams
Akka Streams also fully implements the Reactive Streams specification. Akka uses Actors to deal with streaming data. While Akka Streams is compliant with the Reactive Streams API specification, the Akka Streams API is completely decoupled from the Reactive Streams interfaces.
Akka Streams is considered a 3rd-generation reactive library.
Ratpack
Ratpack is a set of Java libraries for building modern high-performance HTTP applications. Ratpack uses Java 8, Netty, and Reactive principles. Ratpack provides a basic implementation of the Reactive Streams API but is not designed to be a fully-featured reactive toolkit.
Optionally, you can use RxJava or Reactor with Ratpack.
Vert.x
Vert.x is an Eclipse Foundation project. It is a polyglot event-driven application framework for the JVM. Reactive support in Vert.x is similar to Ratpack. Vert.x allows you to use RxJava or their native implementation of the Reactive Streams API.
Reactive Streams and JVM Releases
Reactive Streams for Java 1.8
With Java 1.8, you will find robust support for the Reactive Streams specification.
In Java 1.8, Reactive Streams is not part of the Java API. However, it is available as a separate jar.
Reactive Streams Maven Dependency
<dependency>
    <groupId>org.reactivestreams</groupId>
    <artifactId>reactive-streams</artifactId>
    <version>1.0.0</version>
</dependency>
While you can include this dependency directly, whatever implementation of Reactive Streams you are using should include it automatically as a dependency.
Reactive Streams for Java 9
Things change a little bit when you move to Java 9. Reactive Streams has become part of the official Java 9 API.
You’ll notice that the Reactive Streams interfaces move under the java.util.concurrent.Flow class in Java 9. But other than that, the API is the same as Reactive Streams 1.0 in Java 1.8.
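As a small sketch of what coding to the interface looks like with the JDK's Flow types, the subscriber below knows nothing about where its items come from. It is attached first to the JDK's SubmissionPublisher and then to a toy hand-written publisher. FlowApiDemo, SumSubscriber, and sumOf are hypothetical names for this example, and the hand-written publisher is intentionally naive (it assumes the subscriber asked for at least three items).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

class FlowApiDemo {

    /** Depends only on the Flow interfaces, never on a concrete publisher class. */
    static class SumSubscriber implements Flow.Subscriber<Integer> {
        private final CompletableFuture<Integer> result = new CompletableFuture<>();
        private int sum = 0;

        @Override public void onSubscribe(Flow.Subscription s) { s.request(Long.MAX_VALUE); }
        @Override public void onNext(Integer item) { sum += item; }
        @Override public void onError(Throwable t) { result.completeExceptionally(t); }
        @Override public void onComplete() { result.complete(sum); }

        int await() {
            return result.join();
        }
    }

    /** Attaches a SumSubscriber to any Flow.Publisher implementation. */
    static SumSubscriber sumOf(Flow.Publisher<Integer> publisher) {
        SumSubscriber subscriber = new SumSubscriber();
        publisher.subscribe(subscriber);
        return subscriber;
    }

    static int sumWithJdkPublisher() {
        SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>();
        SumSubscriber subscriber = sumOf(publisher);
        publisher.submit(1);
        publisher.submit(2);
        publisher.submit(3);
        publisher.close(); // onComplete is signalled after the buffered items are delivered
        return subscriber.await();
    }

    static int sumWithHandWrittenPublisher() {
        // Flow.Publisher is a functional interface, so a lambda is enough for a toy publisher.
        Flow.Publisher<Integer> publisher = s -> s.onSubscribe(new Flow.Subscription() {
            private boolean done;

            @Override public void request(long n) {
                if (done) return;
                done = true;
                // Toy behaviour: assumes n >= 3, which holds for SumSubscriber.
                s.onNext(1);
                s.onNext(2);
                s.onNext(3);
                s.onComplete();
            }

            @Override public void cancel() {
                done = true;
            }
        });
        return sumOf(publisher).await();
    }

    public static void main(String[] args) {
        System.out.println(sumWithJdkPublisher());
        System.out.println(sumWithHandWrittenPublisher());
    }
}
```

The sumOf method is the JDBC DataSource idea in miniature. It accepts the interface, and whichever implementation sits behind it is somebody else's concern.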
Conclusion
At the time of writing, Java 9 is right around the corner. In Java 9, Reactive Streams is officially part of the Java API.
In researching this article, it became clear that the various reactive libraries have been evolving and maturing (see David Karnok’s generations classification).
Before Reactive Streams, the various reactive libraries had no way to interoperate. They could not talk to each other. Early versions of RxJava were not compatible with early versions of project Reactor.
But on the eve of the release of Java 9, the major reactive libraries have adopted the Reactive Streams specification. The different libraries are now interoperable.
Interoperability is an important domino to fall. For example, MongoDB has implemented a Reactive Streams driver. Now, in our applications, we can use Reactor or RxJava to consume data from MongoDB.
We’re still early in the adoption of Reactive Streams. But over the next year or so, we can expect more and more open source projects to offer Reactive Streams compatibility.
I expect we are going to see a lot more of Reactive Streams in the near future.
It’s a fun time to be a Java developer!
Published at DZone with permission of John Thompson, DZone MVB. See the original article here.