Solving Java Multithreading Challenges in My Google Photos Clone [Video]
Find out what challenges we face as we turn our Google Photos clone from single-threaded to multithreaded to generate thumbnails much faster than before.
Join the DZone community and get the full member experience.
Join For FreeWe want to turn our Google Photos clone from single-threaded to multithreaded, to generate thumbnails much faster than before - but there are a couple of challenges along the way. We'll learn that using Java's Parallel File Streams seems to be a buggy endeavor, so we'll fall back on good, old ExecutorService
s.
But how many threads should we use to generate thumbnails? How can threads get in conflict with each other? How do we make our program fail-safe for threading issues? Find out in this episode!
What’s in the Video
00:00 Intro
We start off with a quick recap. In the previous episodes, we built a tiny application that can take a folder full of images, and turn those images into thumbnails - by spawning external ImageMagick processes. We did that sequentially, spawning the next process, as soon as a thumbnail conversion process has finished.
But we surely can make this faster and also utilize our system resources (CPU/IO) more by doing the thumbnail conversion multithreaded, spawning multiple ImageMagick processes at the very same time. We'll try and figure out how to do that in this episode.
00:24 Java’s Parallel Streams
The first idea would be to use Java's built-in parallel streams feature, as we are reading in the files as a stream anyway. Interestingly enough the API lets you do this just fine, and it even works flawlessly on my machine, but as soon as we deploy our application to a different server, it stops working. Why is that? We'll need to do a bit of benchmarking and fumbling around, to notice that parallel file streams, in JDKs < 19, aren't really supported. So, depending on the Java version, you'll get different behavior. Hence, we cannot work with parallel streams for now.
03:32 Java’s ExecutorService
Given that parallel streams are not an option, we will resort back to using a good old ExecutorService
. An ExecutorService
lets us define how many threads we want to open, and then work off n-tasks in parallel. Figuring out the API is not that difficult, but the real question is: How many threads specifically should we open up simultaneously? We'll cover that question in detail during this segment.
06:12 Performance Benchmarking
After having implemented multithreading, we also need to make sure to benchmark our changes. Will we get a 2x/3x speed improvement? Or maybe even a speed reduction? During this segment, we'll run and time our application locally, as well as on my NAS, and see how different hardware configurations might affect the final result.
08:10 File Storage and Hashing
Last but not least, we'll have to figure out how to store our thumbnails. So far, we created thumbnails with the same filename as the original image and put all the files into the same directory. That doesn't work for a huge amount of files, with potential file clashes and multithreading conflicts. Hence, we will start hashing our files with the BLAKE3 algorithm, store the files under that hash, and also use a directory layout similar to what Git uses internally to store its files.
16:52 Up Next
We did a ton of multithreading work in this episode. Up next it is time to add a database to our application and store the information about all converted thumbnails there. Stay tuned!
Opinions expressed by DZone contributors are their own.
Comments