Using Spring To Download a Zip File, Extract It, and Upload It to Cloud Storage Without Storing Files Locally in the Container

The article explains how to use Spring to download a zip file, extract it, and upload it to cloud storage without storing files locally in the container.

Amol Gote

Jul. 03, 23 · Code Snippet

As part of integration work, you may need to download a zip file from a partner, extract its contents, and move the extracted files to cloud storage. We had a similar need: downloading the uploaded ID (driver's license) images, front and back, from one of the leading ID verification service providers and persisting them in the organization's cloud storage. The challenge lies in downloading the zip file, extracting its contents, and uploading them to cloud storage, all in a transient manner without creating any temporary files on the container of your microservice.

Downloading the Zipped File Content From Third-Party Service

Below is the code snippet to get the zipped file from the partner service.

Java

    RestTemplate restTemplate = new RestTemplate();
    HttpHeaders headers = new HttpHeaders();
    headers.set("Content-Type", "application/zip");
    headers.set("Authorization", "Bearer " + accessToken);
    // No request body is sent with this GET, so the entity carries headers only.
    HttpEntity<Void> request = new HttpEntity<>(headers);
    String url = this.baseUrl + "/documents/" + documentUUID + "?imagequality=original";
    ResponseEntity<Resource> response = restTemplate.exchange(url, HttpMethod.GET, request, Resource.class);
  

Here, the URL is the partner service endpoint from which the zipped images can be downloaded. This is typical RestTemplate code, common across the Spring ecosystem. Two things are worth noting:

Content-Type: set to application/zip because the response is a zip file. Strictly speaking, Content-Type on a GET describes the request body, and Accept is the header normally used to ask for a response format, so check which one your partner's API expects.
RestTemplate response type: Resource, a Spring interface that represents external resources. With this response type, RestTemplate reads the body into memory rather than onto disk.
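
To make the header choice concrete, here is a minimal sketch of the same GET request expressed with the JDK's built-in java.net.http API instead of RestTemplate, so it can be compiled without Spring. It asks for the zip through the Accept header. The class name, the example host, the UUID, and the token are placeholders for illustration only; the path and query string are taken from the snippet above.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class ZipDownloadRequestSketch {

    /** Builds (but does not send) a GET request that asks the partner service for a zip response. */
    static HttpRequest build(String baseUrl, String documentUuid, String accessToken) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/documents/" + documentUuid + "?imagequality=original"))
                // Accept tells the server which response format the client wants
                .header("Accept", "application/zip")
                .header("Authorization", "Bearer " + accessToken)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Placeholder values; a real call would use the partner's base URL and a valid token
        HttpRequest request = build("https://partner.example.com", "1234", "token");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().map().keySet());
    }
}
```

Whether the partner service honors Accept, requires Content-Type, or ignores both is something only its API documentation can settle.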

The resource object is then obtained from the response body:

Java

    Resource zipFileContent = response.getBody();

Extracting Zip File Content

Once you have the resource object, the real work of extracting the files and moving them begins. The snippet below extracts the zip entries.

Java

    Map<String, byte[]> map = new HashMap<>();
    if (zipFileContent != null) {
        // try-with-resources closes the stream even if reading fails
        try (ZipInputStream zipInputStream = new ZipInputStream(zipFileContent.getInputStream())) {
            ZipEntry zipEntry;
            byte[] buff = new byte[4096];
            while ((zipEntry = zipInputStream.getNextEntry()) != null) {
                if (zipEntry.isDirectory()) {
                    continue; // directory entries have no content to upload
                }
                ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
                // read() returns -1 at the end of the current entry
                int byteLength;
                while ((byteLength = zipInputStream.read(buff)) != -1) {
                    byteArrayOutputStream.write(buff, 0, byteLength);
                }
                map.put(zipEntry.getName(), byteArrayOutputStream.toByteArray());
            }
        }
    }
  

ZipInputStream is an input stream filter for reading the entries of a ZIP file. Calling getNextEntry() repeatedly moves through the entries one at a time, and each entry's content is read into a byte array via ByteArrayOutputStream. This gives a transient, in-memory representation of every file in the archive. Each byte array is stored in a HashMap, keyed by entry name, for the subsequent upload to cloud storage.
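
To try the extraction step in isolation, the sketch below uses only the JDK: it builds a small archive in memory and then reads it back into a map. The class name, the helper sampleZip(), and the entry names and contents are made up for illustration. It also uses readAllBytes() (Java 9+) in place of the manual buffer loop, which is a substitution on my part and not the article's code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class InMemoryZipSketch {

    /** Reads every file entry of a ZIP stream into memory, keyed by entry name. */
    static Map<String, byte[]> extract(InputStream zipStream) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directory entries carry no data
                }
                // On a ZipInputStream this reads up to the end of the current entry only
                files.put(entry.getName(), zip.readAllBytes());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return files;
    }

    /** Builds a tiny two-entry archive in memory so the sketch needs no file or network. */
    static byte[] sampleZip() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.putNextEntry(new ZipEntry("front.jpg"));
            zip.write("front-bytes".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("back.jpg"));
            zip.write("back-bytes".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        Map<String, byte[]> files = extract(new ByteArrayInputStream(sampleZip()));
        files.forEach((name, bytes) -> System.out.println(name + ": " + bytes.length + " bytes"));
    }
}
```

A LinkedHashMap is used here only so that entries come back in archive order; a plain HashMap works equally well when order does not matter.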

Now you can iterate through the map and upload each file to cloud storage.

Java

    for (Map.Entry<String, byte[]> file : map.entrySet()) {
        ByteArrayResource byteArrayResource = this.getDocumentByteArray(file.getValue(), file.getKey());
    }
  

Java

    private ByteArrayResource getDocumentByteArray(byte[] bytes, String fileName) {
        try {
            // ByteArrayResource has no file name by default; the multipart upload
            // uses the name returned here for the file part.
            return new ByteArrayResource(bytes) {
                @Override
                public String getFilename() {
                    return fileName;
                }
            };
        } catch (Exception ex) {
            logger.error("Exception - getDocumentByteArray - Error while getting the uploaded images byte array content, detail error : ", ex);
        }
        return null;
    }
  

You can then take the Resource object (ByteArrayResource) and post it to your internal document upload API, which in turn stores it in your cloud storage, whether AWS or Azure.

Java

    public DocumentsResponse uploadDocV1(String accessToken, Resource file) {
        RestTemplate restTemplate = new RestTemplate();
        HttpHeaders headers = new HttpHeaders();
        headers.setContentType(MediaType.MULTIPART_FORM_DATA);
        headers.set("Authorization", "Bearer " + accessToken);

        MultiValueMap<String, Object> body = new LinkedMultiValueMap<>();
        body.add("file", file);
        HttpEntity<MultiValueMap<String, Object>> entity = new HttpEntity<>(body, headers);

        String apiEndPoint = this.baseUrl + "/api/documents/v1/upload";
        DocumentsResponse result = null;
        try {
            URI uri = new URI(apiEndPoint);
            result = restTemplate.postForObject(uri, entity, DocumentsResponse.class);
        } catch (Exception ex) {
            logger.error("An error occurred while uploading, detail error : ", ex);
        }
        return result;
    }
  

"/api/documents/v1/upload" is an internal API (microservice) responsible for uploading documents to cloud storage.

Document Upload to Cloud Storage

The document upload API accepts the file as a MultipartFile and then puts that file into an AWS S3 bucket.

Java

    private String uploadDocument(String s3BucketPath, MultipartFile multipartFile) throws Exception {
        try (InputStream documentStream = multipartFile.getInputStream()) {

            ObjectMetadata metadata = new ObjectMetadata();
            metadata.setContentType(multipartFile.getContentType());
            // Declaring the length up front lets the SDK stream the upload
            // instead of buffering the whole stream to work out its size.
            metadata.setContentLength(multipartFile.getSize());

            Map<String, String> attributes = new HashMap<>();
            attributes.put("document-content-size", String.valueOf(multipartFile.getSize()));

            metadata.setUserMetadata(attributes);
            this.awsS3Client.putObject(new PutObjectRequest(this.s3bucket,
                    s3BucketPath, documentStream, metadata));

            logger.info("Saved successfully to S3 bucket with keyName={}", s3BucketPath);

            return s3BucketPath;
        } catch (AmazonS3Exception ex) {
            logger.warn("s3Bucket={}. Key={}", s3bucket, s3BucketPath);
            if (ex.getErrorCode().equalsIgnoreCase("NoSuchBucket")) {
                String msg = String.format("No bucket found with name %s", s3bucket);
                logger.error(msg, ex);
                throw new DocumentException(true, msg);
            } else if (ex.getErrorCode().equalsIgnoreCase("AccessDenied")) {
                String msg = String.format("Access denied to S3 bucket %s", s3bucket);
                logger.error(msg, ex);
                throw new DocumentException(true, msg);
            }

            logger.error(String.format("Error saving file %s to AWS S3 bucket %s", s3BucketPath, s3bucket), ex);
            throw ex;
        } catch (IOException ex) {
            logger.warn("s3Bucket={}. Key={}", s3bucket, s3BucketPath);
            logger.error(String.format("Error saving file %s to AWS S3 bucket %s", s3BucketPath, s3bucket), ex);
            throw ex;
        }
    }
  

In conclusion, this article has demonstrated how to efficiently handle ZIP files in a Spring-based application without the need for temporary storage. We've walked through the process of downloading a ZIP file from a partner service, extracting the contents, and uploading the files to cloud storage, all without creating temporary files on the microservice container.

Remember, the code provided here is a base from which you can build and adapt to suit your needs. It is important to tailor this to your specific use case, ensuring that it is secure, efficient, and reliable.
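
As one example of such tailoring: because every extracted file is held in memory, an unexpectedly large or highly compressed archive could exhaust the container's heap. The sketch below is a possible hardening of the extraction step, not part of the original code. It caps the number of entries and the total uncompressed bytes; the class name, the helper zipOf(), and the limit values are assumptions to adjust for your own workload.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class BoundedZipExtractor {

    /** Extracts file entries into memory, failing fast once either limit is exceeded. */
    static Map<String, byte[]> extract(InputStream zipStream, int maxEntries, long maxTotalBytes) {
        Map<String, byte[]> files = new LinkedHashMap<>();
        long totalBytes = 0;
        byte[] buffer = new byte[4096];
        try (ZipInputStream zip = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                if (files.size() >= maxEntries) {
                    throw new IllegalStateException("ZIP contains more than " + maxEntries + " files");
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int read;
                while ((read = zip.read(buffer)) != -1) {
                    // Count uncompressed bytes as they arrive; header sizes may be absent or wrong
                    totalBytes += read;
                    if (totalBytes > maxTotalBytes) {
                        throw new IllegalStateException("ZIP expands beyond " + maxTotalBytes + " bytes");
                    }
                    out.write(buffer, 0, read);
                }
                files.put(entry.getName(), out.toByteArray());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return files;
    }

    /** Hypothetical helper: builds a single-entry archive in memory for the demo. */
    static byte[] zipOf(String name, byte[] content) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            zip.putNextEntry(new ZipEntry(name));
            zip.write(content);
            zip.closeEntry();
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) {
        byte[] zip = zipOf("front.jpg", new byte[100]);
        // Illustrative limits: at most 5 files and 1,000 uncompressed bytes
        Map<String, byte[]> files = extract(new ByteArrayInputStream(zip), 5, 1_000);
        System.out.println(files.keySet());
    }
}
```

For ID images, a handful of entries and a few tens of megabytes would probably be a generous ceiling, but the right values depend on what the partner actually sends.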

For those looking to explore more about Spring or AWS, I would recommend visiting their official documentation. If you're interested in file handling in Java, the Java I/O streams tutorial could be a good starting point.

By understanding and implementing these techniques, you can streamline your data processing tasks and make your applications more efficient and resilient. Happy coding!
