Single vs. Multiple Filters in the Java Stream API
It might be tempting to run multiple filters in your streams, but be careful—it might come with a cost. Use your filters judiciously.
One of the key features of Java 8 is the stream. It is frequently used in conjunction with lambdas, and one of its most common operations is the filter.
Let's consider the following example:
long count = doubles
    .stream()
    .filter(d -> d < Math.PI)
    .filter(d -> d > Math.E)
    .filter(d -> d != 3.10040970053377777)
    .filter(d -> d != 2.96240970053377777)
    .count();
It doesn't do anything fancy, and perhaps it has no practical use case. For now, though, let's consider how the filter works. Each filter() call returns a new stream, so there are in effect four extra streams.
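To make the "new stream per filter" point concrete, here is a minimal sketch. The class and method names (FilterStages, stagesAreDistinct, countBetweenEAndPi) are made up for illustration, and the identity check reflects how OpenJDK behaves rather than something the Stream API documentation promises.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical helper class, not part of the original benchmark.
public class FilterStages {

    // Builds the pipeline one stage at a time and reports whether each
    // filter() call handed back a different Stream object than its receiver.
    // Assumption: this holds on OpenJDK; the API contract does not spell it out.
    static boolean stagesAreDistinct(List<Double> doubles) {
        Stream<Double> source = doubles.stream();
        Stream<Double> belowPi = source.filter(d -> d < Math.PI);
        Stream<Double> aboveE = belowPi.filter(d -> d > Math.E);
        return source != belowPi && belowPi != aboveE;
    }

    // Same two filters written as one chain; nothing runs until count() is called.
    static long countBetweenEAndPi(List<Double> doubles) {
        return doubles.stream()
                .filter(d -> d < Math.PI)
                .filter(d -> d > Math.E)
                .count();
    }

    public static void main(String[] args) {
        List<Double> doubles = Arrays.asList(1.5, 2.8, 3.0, 3.5);
        System.out.println(stagesAreDistinct(doubles));
        System.out.println(countBetweenEAndPi(doubles)); // 2 (only 2.8 and 3.0 pass both filters)
    }
}
```

Every element that survives one stage is handed on to the next stage, so a longer chain means more hand-offs per element.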
However, the four filters can also be written as a single filter, which has slightly less overhead. Let's compare the two approaches and see how much benefit we can derive.
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import java.util.List;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
public class MyBenchmark {

    // All four conditions combined into one predicate, so the pipeline has a single filter stage.
    @Benchmark
    @BenchmarkMode(Mode.All)
    @OutputTimeUnit(TimeUnit.SECONDS)
    public long testStreamWithSingleFilter() {
        // 1,000 random doubles in the range [1, 4)
        List<Double> doubles = new Random().doubles(1_000, 1, 4).boxed().collect(Collectors.toList());
        long count = doubles
            .stream()
            .filter(d -> d < Math.PI
                && d > Math.E
                && d != 3.10040970053377777
                && d != 2.96240970053377777)
            .count();
        return count;
    }

    // The same four conditions, each in its own filter stage.
    @Benchmark
    @BenchmarkMode(Mode.All)
    @OutputTimeUnit(TimeUnit.SECONDS)
    public long testStreamWithMultipleFilter() {
        // 1,000 random doubles in the range [1, 4)
        List<Double> doubles = new Random().doubles(1_000, 1, 4).boxed().collect(Collectors.toList());
        long count = doubles
            .stream()
            .filter(d -> d > Math.E)
            .filter(d -> d < Math.PI)
            .filter(d -> d != 3.10040970053377777)
            .filter(d -> d != 2.96240970053377777)
            .count();
        return count;
    }
}
Output:
# Run complete. Total time: 00:40:19
Benchmark Mode Cnt Score Error Units
MyBenchmark.testStreamWithMultipleFilter thrpt 200 24367.016 ± 169.686 ops/s
MyBenchmark.testStreamWithSingleFilter thrpt 200 32779.157 ± 127.938 ops/s
MyBenchmark.testStreamWithMultipleFilter avgt 200 ≈ 10⁻⁴ s/op
MyBenchmark.testStreamWithSingleFilter avgt 200 ≈ 10⁻⁵ s/op
MyBenchmark.testStreamWithMultipleFilter sample 2581418 ≈ 10⁻⁴ s/op
MyBenchmark.testStreamWithMultipleFilter:testStreamWithMultipleFilter·p0.00 sample ≈ 10⁻⁴ s/op
MyBenchmark.testStreamWithMultipleFilter:testStreamWithMultipleFilter·p0.50 sample ≈ 10⁻⁴ s/op
MyBenchmark.testStreamWithMultipleFilter:testStreamWithMultipleFilter·p0.90 sample ≈ 10⁻⁴ s/op
MyBenchmark.testStreamWithMultipleFilter:testStreamWithMultipleFilter·p0.95 sample ≈ 10⁻⁴ s/op
MyBenchmark.testStreamWithMultipleFilter:testStreamWithMultipleFilter·p0.99 sample ≈ 10⁻⁴ s/op
MyBenchmark.testStreamWithMultipleFilter:testStreamWithMultipleFilter·p0.999 sample 0.001 s/op
MyBenchmark.testStreamWithMultipleFilter:testStreamWithMultipleFilter·p0.9999 sample 0.001 s/op
MyBenchmark.testStreamWithMultipleFilter:testStreamWithMultipleFilter·p1.00 sample 0.006 s/op
MyBenchmark.testStreamWithSingleFilter sample 3292270 ≈ 10⁻⁵ s/op
MyBenchmark.testStreamWithSingleFilter:testStreamWithSingleFilter·p0.00 sample ≈ 10⁻⁵ s/op
MyBenchmark.testStreamWithSingleFilter:testStreamWithSingleFilter·p0.50 sample ≈ 10⁻⁵ s/op
MyBenchmark.testStreamWithSingleFilter:testStreamWithSingleFilter·p0.90 sample ≈ 10⁻⁵ s/op
MyBenchmark.testStreamWithSingleFilter:testStreamWithSingleFilter·p0.95 sample ≈ 10⁻⁵ s/op
MyBenchmark.testStreamWithSingleFilter:testStreamWithSingleFilter·p0.99 sample ≈ 10⁻⁴ s/op
MyBenchmark.testStreamWithSingleFilter:testStreamWithSingleFilter·p0.999 sample ≈ 10⁻⁴ s/op
MyBenchmark.testStreamWithSingleFilter:testStreamWithSingleFilter·p0.9999 sample 0.001 s/op
MyBenchmark.testStreamWithSingleFilter:testStreamWithSingleFilter·p1.00 sample 0.011 s/op
MyBenchmark.testStreamWithMultipleFilter ss 10 0.010 ± 0.001 s/op
MyBenchmark.testStreamWithSingleFilter ss 10 0.009 ± 0.001 s/op
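As you can see, the single filter achieved higher throughput than the chain of four: roughly 32,779 ops/s versus 24,367 ops/s. Note that both methods also build the 1,000-element list inside the measured code, so the scores include that setup as well as the filtering.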
Source code: here
The takeaway: Multiple filters have some overhead; make sure to write good filters.
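If a single lambda with several && conditions feels hard to read, one option is to name the conditions and combine them before passing them to one filter() call. The sketch below is only an illustration: the class CombinedFilter and its constant and method names are hypothetical, and Predicate.and adds a small wrapper of its own, so whether it keeps the full benefit seen above would need to be measured.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper class, not part of the original benchmark.
public class CombinedFilter {

    // The four conditions from the example, given illustrative names.
    static final Predicate<Double> BELOW_PI = d -> d < Math.PI;
    static final Predicate<Double> ABOVE_E = d -> d > Math.E;
    static final Predicate<Double> NOT_FIRST_VALUE = d -> d != 3.10040970053377777;
    static final Predicate<Double> NOT_SECOND_VALUE = d -> d != 2.96240970053377777;

    // Four separate filter stages.
    static long countChained(List<Double> doubles) {
        return doubles.stream()
                .filter(ABOVE_E)
                .filter(BELOW_PI)
                .filter(NOT_FIRST_VALUE)
                .filter(NOT_SECOND_VALUE)
                .count();
    }

    // One filter stage with the conditions joined by &&.
    static long countSingle(List<Double> doubles) {
        return doubles.stream()
                .filter(d -> d > Math.E
                        && d < Math.PI
                        && d != 3.10040970053377777
                        && d != 2.96240970053377777)
                .count();
    }

    // One filter stage built from the named predicates with Predicate.and.
    static long countComposed(List<Double> doubles) {
        Predicate<Double> all = ABOVE_E.and(BELOW_PI).and(NOT_FIRST_VALUE).and(NOT_SECOND_VALUE);
        return doubles.stream().filter(all).count();
    }

    public static void main(String[] args) {
        List<Double> doubles = Arrays.asList(1.5, 2.8, 3.0, 3.10040970053377777, 3.5);
        System.out.println(countChained(doubles));
        System.out.println(countSingle(doubles));
        System.out.println(countComposed(doubles));
    }
}
```

All three methods select the same elements; they differ only in how many stages the stream pipeline contains.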