Running Spring Batch Applications in PCF
In this post, we cover how to use Spring Batch to create microservices and deploy them to a cloud environment.
1. Overview
Most developers these days are creating microservices and deploying them to cloud platforms. Pivotal Cloud Foundry (PCF) is one of the best-known cloud platforms. The applications deployed on PCF are mostly long-running processes that never end, such as web applications, SPAs, or REST-based services. PCF monitors all these long-running instances, and if one goes down, it spins up a new instance to replace the failed one. This works fine when the process is expected to run continuously, but for a batch process it's overkill: the container would be running all the time with no CPU usage, and the cost would add up. Many developers assume that PCF cannot run a batch application that is initiated only on request, but that is not correct.
Spring Batch enables us to create batch applications and provides many out-of-the-box features to reduce boilerplate code. Recently, Spring Cloud Task has been added to the list of Spring projects for creating short-running processes. With both of these options, we can create microservices, deploy them on PCF, and then stop them so that PCF doesn't try to self-heal them. And with the help of PCF Scheduler, we can schedule the tasks to run at a certain time of day. Let's see in this article how we can do that in very few steps.
2. Pre-Requisite
- JDK 1.8
- Spring Boot knowledge
- Gradle
- IDE (Eclipse, VSC, etc.)
- PCF instance
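Before we start, the project needs the Spring Batch (and, for Section 7, Spring Cloud Task) dependencies on the classpath. A minimal build.gradle dependency block for such a project might look like the sketch below; the exact coordinates and the use of H2 as the in-memory job repository database are assumptions, not taken from the original project:

```groovy
dependencies {
    // Spring Batch support (JobBuilderFactory, StepBuilderFactory, etc.)
    implementation 'org.springframework.boot:spring-boot-starter-batch'
    // Spring Cloud Task support, used in Section 7 (@EnableTask)
    implementation 'org.springframework.cloud:spring-cloud-starter-task'
    // In-memory database for the default JobRepository/TaskRepository
    runtimeOnly 'com.h2database:h2'
}
```

Versions are left to the Spring Boot dependency-management plugin.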
3. Develop the Spring Batch Application
Let's develop a small Spring Batch application (spring-batch-master) which will read a file with employee data and then add the department name to each of the employee's records.
3.1 BatchConfiguration
Let's start with the BatchConfiguration file. I have added two jobs here, and both show a different way of implementing the batch process. The first one uses Spring Batch chunks and is configured to set up the job flow with a few steps. Each step has a reader, processor, and writer configured. The second way of implementing a batch process is using a Tasklet:
@Configuration
public class BatchConfiguration {

    private static final Log logger = LogFactory.getLog(BatchConfiguration.class);

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Bean
    public JobLauncher jobLauncher(JobRepository jobRepo) {
        SimpleJobLauncher simpleJobLauncher = new SimpleJobLauncher();
        simpleJobLauncher.setJobRepository(jobRepo);
        return simpleJobLauncher;
    }

    @Bean
    public Job departmentProcessingJob() {
        return jobBuilderFactory.get("departmentProcessingJob")
                .flow(step1())
                .end()
                .build();
    }

    @Bean
    public Step step1() {
        return stepBuilderFactory.get("step1")
                .<Employee, Employee>chunk(1)
                .reader(reader())
                .processor(processor())
                .writer(writer())
                .build();
    }

    @Bean
    public FlatFileItemReader<Employee> reader() {
        return new DepartmentReader().reader();
    }

    @Bean
    public DepartmentProcessor processor() {
        return new DepartmentProcessor();
    }

    @Bean
    public DepartmentWriter writer() {
        return new DepartmentWriter();
    }

    @Bean
    public Job job2() {
        return this.jobBuilderFactory.get("job2")
                .start(this.stepBuilderFactory.get("job2step1")
                        .tasklet(new Tasklet() {
                            @Override
                            public RepeatStatus execute(StepContribution contribution,
                                    ChunkContext chunkContext) throws Exception {
                                logger.info("Job2 was run");
                                return RepeatStatus.FINISHED;
                            }
                        })
                        .build())
                .build();
    }
}
Let's discuss the first job, departmentProcessingJob, in a little more detail. This job has the step1 step, which reads the file, processes it, and then prints the result.
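The Employee domain object is referenced throughout but never shown in the post. Below is a minimal sketch of what it might look like; the field names (id, employeeNumber, salary, department) are assumptions inferred from the reader's column names and the processor's getters and setters:

```java
// Hypothetical Employee POJO; fields inferred from the reader and processor.
public class Employee {
    private String id;
    private String employeeNumber;
    private String salary;
    private String department;

    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
    public String getEmployeeNumber() { return employeeNumber; }
    public void setEmployeeNumber(String employeeNumber) { this.employeeNumber = employeeNumber; }
    public String getSalary() { return salary; }
    public void setSalary(String salary) { this.salary = salary; }
    public String getDepartment() { return department; }
    public void setDepartment(String department) { this.department = department; }

    @Override
    public String toString() {
        return "Employee [id=" + id + ", employeeNumber=" + employeeNumber
                + ", salary=" + salary + ", department=" + department + "]";
    }
}
```

A plain String-based POJO keeps the example simple; a real project might use numeric types for salary.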
3.2 DepartmentReader
This code has logic to read the employee data from a file as part of the first step:
public class DepartmentReader {
public FlatFileItemReader<Employee> reader() {
FlatFileItemReader<Employee> reader = new FlatFileItemReader<Employee>();
reader.setResource(new ClassPathResource("employee_data.txt"));
reader.setLineMapper(new DefaultLineMapper<Employee>() {{
setLineTokenizer(new DelimitedLineTokenizer() {{
setNames(new String[]{"id", "employeeNumber", "salary"});
}});
setFieldSetMapper(new BeanWrapperFieldSetMapper<Employee>() {{
setTargetType(Employee.class);
}});
}});
return reader;
}
}
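For reference, each line of employee_data.txt needs to be comma-delimited in the column order the tokenizer declares (id, employee number, salary). The snippet below illustrates that layout with a plain split that mimics the default comma delimiter; the sample record is made up for illustration, since the real data file is not shown:

```java
// Demonstrates the record layout DelimitedLineTokenizer expects:
// id,employeeNumber,salary — one comma-delimited record per line.
public class RecordLayoutDemo {

    // A plain split mimics the tokenizer's default comma delimiter
    // for this simple, unquoted layout.
    public static String[] tokenize(String line) {
        return line.split(",");
    }

    public static void main(String[] args) {
        // Hypothetical sample record, not taken from the original file.
        String[] fields = tokenize("1,1001,50000");
        System.out.println("id=" + fields[0]
                + " employeeNumber=" + fields[1]
                + " salary=" + fields[2]);
    }
}
```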
3.3 DepartmentProcessor
This code has logic to add the department name to each employee record, based on some condition:
public class DepartmentProcessor implements ItemProcessor<Employee, Employee> {
@Override
public Employee process(Employee item) throws Exception {
if ("1001".equalsIgnoreCase(item.getEmployeeNumber())) {
item.setDepartment("Sales");
} else if ("1002".equalsIgnoreCase(item.getEmployeeNumber())) {
item.setDepartment("IT");
} else {
item.setDepartment("Staff");
}
System.out.println("Employee Details --> " + item.toString());
return item;
}
}
3.4 DepartmentWriter
This code has logic to print the employee records with the department name appended to it:
public class DepartmentWriter implements ItemWriter<Employee> {
@Override
public void write(List<? extends Employee> items) throws Exception {
List<String> employeeList = new ArrayList<>();
items.forEach(item -> {
String enrichedTxn = String.join(",", item.getId(),
item.getEmployeeNumber(), item.getSalary(),
item.getDepartment());
employeeList.add(enrichedTxn);
});
employeeList.forEach(System.out::println);
}
}
3.5 BatchCommandLineRunner
Now, let's bootstrap the application logic to run as a batch process. Spring provides CommandLineRunner to enable this:
@Component
public class BatchCommandLineRunner implements CommandLineRunner {

    @Autowired
    JobLauncher jobLauncher;

    @Autowired
    Job departmentProcessingJob;

    @Autowired
    Job job2;

    @Override
    public void run(String... args) throws Exception {
        JobParameters param = new JobParametersBuilder()
                .addString("JobID", String.valueOf(System.currentTimeMillis()))
                .toJobParameters();
        jobLauncher.run(departmentProcessingJob, param);
        jobLauncher.run(job2, param);
    }
}
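One caveat: Spring Boot's batch auto-configuration also launches every Job bean automatically at startup, which would make these jobs run a second time alongside this runner. Assuming you want only the CommandLineRunner to trigger them, the standard Boot property below disables the automatic launch:

```properties
# Prevent Spring Boot from auto-running all Job beans at startup,
# so only BatchCommandLineRunner launches them.
spring.batch.job.enabled=false
```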
4. Deploy the Application on PCF
So far, we have created a Spring Batch job. Now let's deploy it to PCF. You just need to package the code and cf push it to PCF with a manifest file.
manifest.yaml:
---
applications:
- name: batch-example
memory: 1G
random-route: true
path: build/libs/spring-batch-master-0.0.1-SNAPSHOT.jar
no-hostname: true
no-route: true
health-check-type: none
We will see that the application starts, executes the logic, and then exits. The application will then display as 'crashed' in the PCF Apps Manager.
In PCF, all applications run with process type: web. PCF expects the application to be running all the time on some web port. However, for a Spring Batch application, this is not the case. So let's see how to handle that:
- Stop the application manually.
- Run the job either manually with the cf run-task command or on a schedule using PCF Scheduler.
5. Start Our Spring Batch Application With the CF CLI
To run the Spring Batch (Boot) application on PCF, we need to run the following command:
cf run-task <APP Name> ".java-buildpack/open_jdk_jre/bin/java org.springframework.boot.loader.JarLauncher"
Now, you can use this command in a Bamboo/Jenkins pipeline to trigger the application with a cron job.
6. Schedule Batch Job With PCF Scheduler
We can go to the application in Apps Manager -> Tasks and click on Enable Scheduling to bind the application with the PCF Scheduler. Now you can create a job as shown in the below picture. For more details on how to use PCF Scheduler, you can read this blog.
Now, if we run this code, it should execute both the jobs and log their results.
7. Batch Applications With Spring Cloud Task
Spring has come up with a new project called Spring Cloud Task (SCT). Its purpose is to create short-lived microservices on cloud platforms. We just need to add the @EnableTask annotation. This registers a TaskRepository and creates a TaskExecution, which picks up the jobs defined and executes them one by one.
TaskRepository uses an in-memory database by default; however, it supports most persistent databases, such as Oracle, MySQL, and PostgreSQL.
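Switching to a persistent database is just a matter of configuring the application's primary DataSource, which Spring Cloud Task picks up automatically. The connection values below are placeholders for illustration, not from the original project:

```properties
# Hypothetical MySQL DataSource; Spring Cloud Task stores task
# executions here instead of the default in-memory database.
spring.datasource.url=jdbc:mysql://localhost:3306/taskdb
spring.datasource.username=task_user
spring.datasource.password=change-me
spring.datasource.driver-class-name=com.mysql.cj.jdbc.Driver
```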
So let's develop one more small application (sct-batch-job) with SCT. Put @EnableTask in the Spring Boot main class:
@SpringBootApplication
@EnableTask
@EnableBatchProcessing
public class EmployeeProcessingBatch {
public static void main(String[] args) {
SpringApplication.run(EmployeeProcessingBatch.class, args);
}
}
Add two Jobs in the SCTJobConfiguration
file:
@Configuration
public class SCTJobConfiguration {

    private static final Log logger = LogFactory.getLog(SCTJobConfiguration.class);

    @Autowired
    public JobBuilderFactory jobBuilderFactory;

    @Autowired
    public StepBuilderFactory stepBuilderFactory;

    @Bean
    public Job job1() {
        return this.jobBuilderFactory.get("job1")
                .start(this.stepBuilderFactory.get("job1step1")
                        .tasklet(new Tasklet() {
                            @Override
                            public RepeatStatus execute(StepContribution contribution,
                                    ChunkContext chunkContext) throws Exception {
                                logger.info("Job1 ran successfully");
                                return RepeatStatus.FINISHED;
                            }
                        })
                        .build())
                .build();
    }

    @Bean
    public Job job2() {
        return this.jobBuilderFactory.get("job2")
                .start(this.stepBuilderFactory.get("job2step1")
                        .tasklet(new Tasklet() {
                            @Override
                            public RepeatStatus execute(StepContribution contribution,
                                    ChunkContext chunkContext) throws Exception {
                                logger.info("Job2 ran successfully");
                                return RepeatStatus.FINISHED;
                            }
                        })
                        .build())
                .build();
    }
}
That's it. Notice that I have used @EnableBatchProcessing to integrate SCT with Spring Batch; that way, we can run a Spring Batch application as a task. We can now push this application to PCF as we did with the earlier one, and then either run it manually or schedule it with PCF Scheduler.
8. Conclusion
In this article, we have talked about how a Spring Batch application can be run on PCF. We can also use Spring Cloud Task to run short-lived microservices. Spring Cloud Task also integrates with Spring Batch, so you can get the full benefits of both.
As usual, the code can be found over on GitHub: Spring Batch App, Spring Cloud Task App.