How to Convert Files to Thumbnail Images in Java

This article explores the benefits of and solutions for programmatically generating thumbnail images for documents passing through our file processing systems.

Brian O'Neill

Sep. 25, 24 · Tutorial

In this article, we’ll discuss the benefits of programmatically generating thumbnail images for a wide range of file types. We’ll take a high-level look at handling this workflow with open-source Java libraries, and we’ll learn how to use a web API to get the job done.

Why Convert Documents to Thumbnail Images?

"Converting" documents to thumbnail images — i.e., programmatically generating thumbnail images from document contents — creates lightweight content previews that can be viewed quickly by document recipients with minimal effort.

I’ve written here about the benefits of converting files to image arrays (e.g., PNG and JPG) in the past, and there’s some crossover between those concepts and the ideas behind generating thumbnail images — along with some key differences.

At their core, both image conversion processes aim to generate lightweight image results, and both attempt to represent content in a consistent, predictable way across multiple systems and platforms. The difference lies in intent. PNG/JPG document image arrays are typically generated to replace the original document format (e.g., DOCX, PDF, or XLSX), whereas thumbnail images are intended to accompany the original content permanently, giving those with document access a chance to peek at its contents and quickly gauge its relevance. Because thumbnails accompany files rather than replace them, they’re typically much smaller and less focused on delivering crisp image quality, which makes image resizing an essential part of the process.

Why Automate This Process?

In general, it’s a good idea to automate any workflows we think will eventually take place downstream from our file processing system — especially if there's a readily available, efficient way for us to do that. We can proactively increase the value of our system by mercifully replacing cumbersome manual processes with consistent, reliable programmatic outputs.

It’s not uncommon, for example, for file processing systems to create thumbnail images for PDF documents and PowerPoint presentations before those files are sent to recipients in external networks via automated emails. PDFs and PowerPoints tend to be large and slow to open in email preview tools (especially when they contain multiple pages/slides and multimedia). Like most files, they also tend to raise flags in email applications when arriving from "untrusted" external networks, since they can technically carry malicious code that exploits vulnerabilities in document viewers. By accompanying those files with thumbnails, we can help recipients confirm at a glance that the content they’ve received matches their expectations before they open it.

It’s also worth noting that automated thumbnail generation is built into many popular video streaming platforms. Typically, one or two frames are automatically extracted from the video file, and video uploaders have the option to represent their content with those frames in the video platform’s gallery. The same goes for image-hosting platforms; we’ll often notice small preview versions of images accompanying larger PNG, JPG, or vector graphics files in a gallery.

Generating Thumbnail Images in Java

Generating thumbnail images requires two basic steps:

1. Converting at least one page (or frame; typically, the first of either) of the original content to a lightweight image format (e.g., JPG or PNG)
2. Resizing the resulting image to a contextually suitable set of width and height dimensions

There are a few noteworthy open-source solutions we could use to generate document thumbnails in Java.

If we’re envisioning a workflow that deals only with PDF documents, for example, we could use Apache PDFBox, a mature and well-supported library. We can use it to create, manipulate, and extract content from PDF documents, and that includes rendering PDF pages directly as images, which we can then resize.

Similarly, if we were exclusively focused on generating thumbnails for DOCX files (another exceedingly popular file type), we could use Apache POI, a library I’ve suggested in several past articles. It’s a great library for working with Microsoft Office documents of all kinds, and we can use it in conjunction with image libraries like Apache Batik to extract, render as an image, and resize the first page of DOCX files. POI doesn’t ship a ready-made page renderer for Word documents, though, so this route generally takes more custom code than the PDF one.

If we’re generating thumbnails from image formats like JPG, PNG, or GIF, we could use Java’s built-in ImageIO and BufferedImage classes, both of which are part of Java SE. They’re standard APIs for image manipulation, and they’re good solutions for limited-scale workflows.
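
For the image-to-thumbnail case, the resizing step can be sketched with Java SE alone. The example below is a minimal illustration rather than a production implementation: the class and method names (ImageThumbnailSketch, fitWithin, writeThumbnail) are hypothetical, and the choice to never upscale small images is an assumption made for the sketch.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper class; the names are illustrative and not part of any library.
final class ImageThumbnailSketch {

    /**
     * Scales the source image to fit inside maxWidth x maxHeight while keeping
     * its aspect ratio. Images that already fit are copied at their original size.
     */
    static BufferedImage fitWithin(BufferedImage source, int maxWidth, int maxHeight) {
        if (maxWidth < 1 || maxHeight < 1) {
            throw new IllegalArgumentException("Maximum dimensions must be positive");
        }
        // Use the smaller of the two ratios so both sides stay within bounds;
        // cap at 1.0 so small images are not enlarged (an assumption of this sketch).
        double scale = Math.min(1.0, Math.min(
                (double) maxWidth / source.getWidth(),
                (double) maxHeight / source.getHeight()));
        int width = Math.max(1, (int) Math.round(source.getWidth() * scale));
        int height = Math.max(1, (int) Math.round(source.getHeight() * scale));

        BufferedImage thumbnail = new BufferedImage(width, height, BufferedImage.TYPE_INT_ARGB);
        Graphics2D graphics = thumbnail.createGraphics();
        try {
            graphics.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            graphics.drawImage(source, 0, 0, width, height, null);
        } finally {
            graphics.dispose();
        }
        return thumbnail;
    }

    /** Reads an image file, resizes it, and writes the result as a PNG. */
    static void writeThumbnail(File input, File output, int maxWidth, int maxHeight)
            throws IOException {
        BufferedImage source = ImageIO.read(input);
        if (source == null) {
            // ImageIO.read returns null when no registered reader understands the file
            throw new IOException("Unsupported or unreadable image: " + input);
        }
        ImageIO.write(fitWithin(source, maxWidth, maxHeight), "png", output);
    }
}
```

Calling writeThumbnail with an input file, an output file, and a 128 by 128 bounding box would produce a PNG no larger than 128 pixels on either side. Bilinear interpolation is used here for simplicity; larger reductions may look better with a multi-step downscale.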

Demonstration

Below, we’ll learn how to take advantage of a specialized web API that generates PNG thumbnail images from a wide range of input document types, including PDF, Office documents, and dozens of image formats. We’ll be able to use this solution with a free API key.

By using a web API, we’ll offload the memory-intensive conversion and PNG resizing work to an external server, and we’ll consolidate both of our necessary steps into a single operation, reducing the amount of code we need overall. When the conversion completes, the API call returns a byte[] containing our thumbnail’s raw PNG image data.

We’ll now walk through structuring our API call with Java code examples.

To get started, we’ll add the client to our Maven project. We’ll first add a reference to the repository in pom.xml:

    <repositories>
        <repository>
            <id>jitpack.io</id>
            <url>https://jitpack.io</url>
        </repository>
    </repositories>

Then we’ll add a reference to the dependency in pom.xml:

    <dependencies>
        <dependency>
            <groupId>com.github.Cloudmersive</groupId>
            <artifactId>Cloudmersive.APIClient.Java</artifactId>
            <version>v4.25</version>
        </dependency>
    </dependencies>

Next, we’ll import the necessary classes for API client setup and document conversion:

    // Import classes:
    import com.cloudmersive.client.invoker.ApiClient;
    import com.cloudmersive.client.invoker.ApiException;
    import com.cloudmersive.client.invoker.Configuration;
    import com.cloudmersive.client.invoker.auth.*;
    import com.cloudmersive.client.ConvertDocumentApi;
    import java.io.File; // needed for the input file passed to the API call

Now we’ll configure the ApiClient with our API key by setting the default client and specifying the key (we can optionally set a prefix for the key if needed):

    ApiClient defaultClient = Configuration.getDefaultApiClient();

    // Configure API key authorization: Apikey
    ApiKeyAuth apiKeyAuth = (ApiKeyAuth) defaultClient.getAuthentication("Apikey");
    apiKeyAuth.setApiKey("YOUR API KEY");
    // Uncomment the following line to set a prefix for the API key, e.g. "Token" (defaults to null)
    //apiKeyAuth.setApiKeyPrefix("Token");
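
In practice, we may prefer not to hard-code the key in source. The sketch below shows one possible approach, reading it from an environment variable. The class name and the variable name CLOUDMERSIVE_API_KEY are hypothetical choices made for this example; the client library doesn’t require either.

```java
import java.util.Map;

// Hypothetical helper; neither the class nor the variable name comes from the client library.
final class ApiKeySketch {

    // Assumed environment variable name, chosen for this example only
    static final String ENV_NAME = "CLOUDMERSIVE_API_KEY";

    /**
     * Looks up the API key in the supplied environment map and fails fast
     * with a readable message when it is missing or blank.
     */
    static String resolveApiKey(Map<String, String> environment) {
        String key = environment.get(ENV_NAME);
        if (key == null || key.trim().isEmpty()) {
            throw new IllegalStateException("Set the " + ENV_NAME + " environment variable");
        }
        return key.trim();
    }
}
```

The string returned by ApiKeySketch.resolveApiKey(System.getenv()) can then be passed to setApiKey in place of the "YOUR API KEY" placeholder. Taking the environment as a Map parameter keeps the helper easy to exercise in tests.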

Finally, we’ll create an instance of the ConvertDocumentApi, specify the input file, and (optionally) define the maximum dimensions and input file extension. The default width and height values are 128 pixels each, and the maximum value for both is 2,048 pixels:

    ConvertDocumentApi apiInstance = new ConvertDocumentApi();
    File inputFile = new File("/path/to/inputfile"); // File | Input file to perform the operation on.
    Integer maxWidth = 56; // Integer | Optional; maximum width of the output thumbnail. The final image will be as large as possible while less than or equal to this width. Default is 128.
    Integer maxHeight = 56; // Integer | Optional; maximum height of the output thumbnail. The final image will be as large as possible while less than or equal to this height. Default is 128.
    String extension = "extension_example"; // String | Optional; the file extension of the inputFile. This will improve the response time in most cases. Also allows unsupported files without extensions to still return a corresponding generic icon.
    try {
        byte[] result = apiInstance.convertDocumentAutodetectToThumbnail(inputFile, maxWidth, maxHeight, extension);
        // Printing the array itself would only show an object reference, so report its size instead
        System.out.println("Received " + result.length + " bytes of PNG data");
    } catch (ApiException e) {
        System.err.println("Exception when calling ConvertDocumentApi#convertDocumentAutodetectToThumbnail");
        e.printStackTrace();
    }
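
Since the width and height limits have a stated default (128) and maximum (2,048), we might normalize user-supplied values before making the call. The sketch below is a client-side precaution only. The class and method names are hypothetical, and how the API itself treats out-of-range values is not covered here, so the clamping behavior is this example’s own choice.

```java
// Hypothetical client-side guard; the API's own handling of out-of-range values is not assumed.
final class ThumbnailDimensionsSketch {

    static final int DEFAULT_SIZE = 128;  // default stated for maxWidth and maxHeight
    static final int MAX_SIZE = 2048;     // maximum stated for maxWidth and maxHeight

    /**
     * Returns the default when no value is given, rejects non-positive values,
     * and clamps anything above the stated maximum.
     */
    static int normalize(Integer requested) {
        if (requested == null) {
            return DEFAULT_SIZE;
        }
        if (requested < 1) {
            throw new IllegalArgumentException("Dimension must be positive: " + requested);
        }
        return Math.min(requested, MAX_SIZE);
    }
}
```

The normalized values can be passed as the maxWidth and maxHeight arguments, so a request for 5,000 pixels would be sent as 2,048 and an omitted value as 128.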
  

We can now implement our own code to handle the byte[] response in various ways, and we can diagnose any exceptions by logging errors and stack traces.
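
As one example of handling the response, the sketch below checks that the bytes begin with the standard PNG file signature and then writes them to disk. The class and method names are hypothetical, and the signature check is an optional sanity test added for illustration, not something the API requires.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for persisting the thumbnail bytes returned by the API call.
final class ThumbnailResultSketch {

    // The fixed eight-byte signature that starts every PNG file
    private static final byte[] PNG_SIGNATURE = {
            (byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A
    };

    /** Returns true when the data begins with the PNG file signature. */
    static boolean looksLikePng(byte[] data) {
        if (data == null || data.length < PNG_SIGNATURE.length) {
            return false;
        }
        for (int i = 0; i < PNG_SIGNATURE.length; i++) {
            if (data[i] != PNG_SIGNATURE[i]) {
                return false;
            }
        }
        return true;
    }

    /** Writes the thumbnail bytes to the target path after a basic sanity check. */
    static Path saveThumbnail(byte[] data, Path target) throws IOException {
        if (!looksLikePng(data)) {
            throw new IOException("Response does not start with a PNG signature");
        }
        return Files.write(target, data);
    }
}
```

With the byte[] from the API call in hand, saveThumbnail(result, Paths.get("thumbnail.png")) would store it next to the original file. The same bytes could instead be streamed into an HTTP response or attached to an email.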

Conclusion

In this article, we discussed the idea behind creating thumbnail images and reviewed the benefits of generating thumbnails automatically in a file processing system. We then went over a few open-source options for creating thumbnail images in Java and subsequently learned how to consolidate the process into one efficient step using a specialized web API.
