1 Theoretical basis before hands-on practice
1.1 what is spring batch
Spring Batch is a lightweight and comprehensive batch processing framework designed to support the development of powerful batch processing applications that are crucial to the daily operation of enterprise systems. At the same time, developers can easily access and utilize more advanced enterprise services when necessary. Spring Batch is not a scheduling framework. It is designed to work with the scheduler, not replace it.
1.2 what can spring batch do
- Automated and complex mass information processing, which can be processed most effectively without user interaction. These operations typically include time-based events (such as month end calculations, notifications, or communications).
- Complex business rules (e.g., insurance benefit determination or rate adjustment) that are applied periodically and repeatedly on very large data sets.
- Integration of information received from internal and external systems into the system of record, which usually needs to be formatted, validated and processed in a transactional manner. Batch processing is used to process billions of transactions for enterprises every day.
Business scenarios:
- Commit batch processes periodically
- Concurrent batch processing: parallel processing of a job
- Staged, enterprise message-driven processing
- Massively parallel batch processing
- Manual or scheduled restart after failure
- Sequential processing of dependent steps (extended to workflow driven batch processing)
- Partial processing: skipping records (for example, when rolling back)
- Whole-batch transactions, for cases with a small batch size or existing stored procedures / scripts
In short, what Spring Batch can do:
- Read a large number of records from a database, file, or queue.
- Process data in some way.
- Write back the data in the modified form.
1.3 infrastructure
1.4 core concepts and abstractions
Core concept: a Job has one to many steps, and each Step has exactly one ItemReader, one ItemProcessor and one ItemWriter. The Job needs to be started (using JobLauncher) and metadata about the currently running process needs to be stored (in JobRepository).
2 Introduction to each component
2.1 Job
A Job is an entity that encapsulates the entire batch process. Like other Spring projects, a Job is linked to an XML configuration file or Java based configuration. This configuration can be referred to as "Job configuration".
Configurable items:
- Simple name of the job.
- Definition and ordering of Step instances.
- Whether the job can be restarted.
2.2 Step
A Step is a domain object that encapsulates an independent, sequential phase of a batch Job. Therefore, each Job consists entirely of one or more steps. A Step contains all the information needed to define and control the actual batch processing.
A StepExecution represents a single attempt to execute a Step. A new StepExecution is created each time a Step runs, similar to JobExecution.
2.3 ExecutionContext
An ExecutionContext represents a collection of key / value pairs that is persisted and controlled by the framework. It gives developers a place to store persistent state that is scoped to a StepExecution object or a JobExecution object.
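To make the idea of scoped key / value state more concrete, here is a minimal plain-Java sketch. It does not use the Spring Batch API: the `ResumableReader` class, the `reader.position` key and the `HashMap` that stands in for a persisted ExecutionContext are all assumptions made for illustration. It only shows why storing a position in such a context lets a restarted step continue where the previous attempt stopped.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reader that saves and restores its position through a key/value context.
class ResumableReader {
    // Illustrative key name, not one defined by Spring Batch
    static final String POSITION_KEY = "reader.position";

    private final List<String> items;
    private int position;

    ResumableReader(List<String> items, Map<String, Object> context) {
        this.items = items;
        // Resume from the saved position, or start at 0 on a fresh run
        this.position = (Integer) context.getOrDefault(POSITION_KEY, 0);
    }

    // Returns the next item, or null when the input is exhausted
    String read() {
        return position < items.size() ? items.get(position++) : null;
    }

    // Stores the current position so that a later run can continue from it
    void update(Map<String, Object> context) {
        context.put(POSITION_KEY, position);
    }
}

class ExecutionContextSketch {
    public static void main(String[] args) {
        // Stands in for an ExecutionContext that the framework would persist
        Map<String, Object> context = new HashMap<>();
        List<String> data = List.of("zs", "ls", "ww");

        ResumableReader firstRun = new ResumableReader(data, context);
        firstRun.read();          // consumes "zs"
        firstRun.update(context); // position 1 is now saved

        ResumableReader restart = new ResumableReader(data, context);
        System.out.println(restart.read()); // prints ls
    }
}
```

In a real job the framework decides when the context is saved and reloaded; the sketch triggers both by hand.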
2.4 JobRepository
JobRepository is the persistence mechanism for all the stereotypes mentioned above. It provides CRUD operations for JobLauncher, Job and Step implementations. When a Job is launched for the first time, a JobExecution is obtained from the repository, and during execution the StepExecution and JobExecution implementations are persisted by passing them to the repository.
When using Java configuration, the @EnableBatchProcessing annotation provides a JobRepository as one of the automatically configured components out of the box.
2.5 JobLauncher
JobLauncher is a simple interface for launching a Job with a given set of JobParameters, as shown in the following example:
public interface JobLauncher {

    public JobExecution run(Job job, JobParameters jobParameters)
            throws JobExecutionAlreadyRunningException,
                   JobRestartException,
                   JobInstanceAlreadyCompleteException,
                   JobParametersInvalidException;
}
Implementations are expected to obtain a valid JobExecution from the JobRepository and execute the Job.
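The JobInstanceAlreadyCompleteException in the signature hints at how launches are told apart: a job instance is identified by the job name together with its identifying JobParameters. The following plain-Java sketch models only that idea. `LauncherSketch` is a hypothetical class, not part of Spring Batch, and it returns false where the real launcher would throw an exception.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical, much simplified launcher: an instance is the job name plus its parameters.
class LauncherSketch {
    private final Set<String> completedInstances = new HashSet<>();

    // Returns true if the job ran, false if this exact instance had already completed.
    boolean run(String jobName, Map<String, ?> parameters, Runnable job) {
        // A TreeMap gives a stable key regardless of the iteration order of the input map
        String instanceKey = jobName + new TreeMap<String, Object>(parameters);
        if (completedInstances.contains(instanceKey)) {
            return false;
        }
        job.run();
        completedInstances.add(instanceKey);
        return true;
    }
}
```

Launching "studentJob" twice with the same parameters runs it only once in this model, while changing a parameter value creates a new instance. This is why adding a changing value such as the current time to the parameters lets the same job be launched repeatedly.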
2.6 Item Reader
ItemReader is an abstraction that represents the retrieval of input for a Step, one item at a time. When the ItemReader has exhausted the items it can provide, it indicates this by returning null.
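The "one item per call, null at the end" contract can be sketched in plain Java as follows. `SimpleItemReader` is a hypothetical stand-in for the real ItemReader interface (which additionally declares checked exceptions), and `ListBackedReader` is an illustrative implementation over an in-memory list.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the ItemReader contract: one item per call, null at the end.
interface SimpleItemReader<T> {
    T read();
}

// Hands out the elements of an in-memory list one at a time.
class ListBackedReader<T> implements SimpleItemReader<T> {
    private final Iterator<T> iterator;

    ListBackedReader(List<T> items) {
        this.iterator = items.iterator();
    }

    @Override
    public T read() {
        // null tells the caller that the input is exhausted
        return iterator.hasNext() ? iterator.next() : null;
    }
}

class ReaderSketch {
    // Drains a reader and counts how many items it produced.
    static <T> int countItems(SimpleItemReader<T> reader) {
        int count = 0;
        while (reader.read() != null) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        SimpleItemReader<String> reader = new ListBackedReader<>(List.of("zs", "ls", "ww"));
        System.out.println(countItems(reader)); // prints 3
    }
}
```

A reader that returns a whole collection from every call, and never returns null, would break this contract: the caller would treat the collection as a single item and would never learn that the input has ended.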
2.7 Item Writer
ItemWriter is an abstraction that represents the output of a Step, one batch or chunk of items at a time. Generally, an ItemWriter has no knowledge of the input it will receive next and only knows the items that are passed to it in the current call.
2.8 Item Processor
ItemProcessor is an abstraction that represents the business processing of an item. While the ItemReader reads one item and the ItemWriter writes items out, the ItemProcessor provides an access point to transform the item or apply other business processing. If the item is found to be invalid while it is being processed, returning null indicates that the item should not be written out.
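How the three abstractions cooperate inside a chunk-oriented step can be sketched with plain Java functional interfaces. This is a simplified model under stated assumptions, not the framework's implementation: `ChunkLoopSketch.runStep` is a hypothetical helper, `Supplier`, `Function` and `Consumer` stand in for ItemReader, ItemProcessor and ItemWriter, and transactions, skipping and retry are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

class ChunkLoopSketch {
    // Runs a simplified chunk-oriented loop and returns how many items were written.
    // reader:    returns the next item, or null when the input is exhausted
    // processor: returns the transformed item, or null to filter the item out
    // writer:    receives one whole chunk per call
    static <I, O> int runStep(int chunkSize,
                              Supplier<I> reader,
                              Function<I, O> processor,
                              Consumer<List<O>> writer) {
        int written = 0;
        boolean exhausted = false;
        while (!exhausted) {
            // 1. Read up to chunkSize items, one at a time
            List<I> inputs = new ArrayList<>();
            while (inputs.size() < chunkSize) {
                I item = reader.get();
                if (item == null) {
                    exhausted = true;
                    break;
                }
                inputs.add(item);
            }
            // 2. Process each item; a null result means "do not write this item"
            List<O> outputs = new ArrayList<>();
            for (I item : inputs) {
                O processed = processor.apply(item);
                if (processed != null) {
                    outputs.add(processed);
                }
            }
            // 3. Write the surviving items of the chunk in a single call
            if (!outputs.isEmpty()) {
                writer.accept(outputs);
                written += outputs.size();
            }
        }
        return written;
    }
}
```

For example, with a chunk size of 2, a reader over the numbers 1 to 5 and a processor that returns null for even numbers, the writer is called three times and three items are written in total.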
3 Spring Batch practice
Next, we will use the theory we have learned to implement the simplest possible Spring Batch project.
3.1 dependencies, project structure and configuration file
Dependencies
<!-- Spring Batch -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-batch</artifactId>
</dependency>
<!-- web dependency -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-web</artifactId>
</dependency>
<!-- lombok -->
<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <version>1.18.20</version>
</dependency>
<!-- mysql -->
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>5.1.47</version>
</dependency>
<!-- mybatis-plus -->
<dependency>
    <groupId>com.baomidou</groupId>
    <artifactId>mybatis-plus-boot-starter</artifactId>
    <version>3.2.0</version>
</dependency>
Project structure
Configuration file
server.port=9000
spring.datasource.url=jdbc:mysql://localhost:3306/test
spring.datasource.username=root
spring.datasource.password=12345
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
3.2 code and database table
Database table
CREATE TABLE `student` (
  `id` int(100) NOT NULL AUTO_INCREMENT,
  `name` varchar(45) DEFAULT NULL,
  `age` int(2) DEFAULT NULL,
  `address` varchar(45) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id_UNIQUE` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=203579 DEFAULT CHARSET=utf8 ROW_FORMAT=REDUNDANT;
Student entity class
/**
 * @desc: Student entity class
 * @author: YanMingXin
 * @create: 2021/10/15-12:17
 **/
@Data
@Accessors(chain = true)
@NoArgsConstructor
@AllArgsConstructor
@ToString
@TableName("student")
public class Student {

    @TableId(value = "id", type = IdType.AUTO)
    private Long sId;

    @TableField("name")
    private String sName;

    @TableField("age")
    private Integer sAge;

    @TableField("address")
    private String sAddress;
}
Mapper layer
/**
 * @desc: Mapper layer
 * @author: YanMingXin
 * @create: 2021/10/15-12:17
 **/
@Mapper
@Repository
public interface StudentDao extends BaseMapper<Student> {
}
Class that simulates reading from a database (or file)
/**
 * @desc: Simulates reading from a database
 * @author: YanMingXin
 * @create: 2021/10/16-10:13
 **/
public class StudentVirtualDao {

    /**
     * Simulate reading from the database
     *
     * @return a fixed list of ten students
     */
    public List<Student> getStudents() {
        ArrayList<Student> students = new ArrayList<>();
        students.add(new Student(1L, "zs", 23, "Beijing"));
        students.add(new Student(2L, "ls", 23, "Beijing"));
        students.add(new Student(3L, "ww", 23, "Beijing"));
        students.add(new Student(4L, "zl", 23, "Beijing"));
        students.add(new Student(5L, "mq", 23, "Beijing"));
        students.add(new Student(6L, "gb", 23, "Beijing"));
        students.add(new Student(7L, "lj", 23, "Beijing"));
        students.add(new Student(8L, "ss", 23, "Beijing"));
        students.add(new Student(9L, "zsdd", 23, "Beijing"));
        students.add(new Student(10L, "zss", 23, "Beijing"));
        return students;
    }
}
Service layer interface
/**
 * @desc: Service layer interface
 * @author: YanMingXin
 * @create: 2021/10/15-12:16
 **/
public interface StudentService {

    List<Student> selectStudentsFromDB();

    void insertStudent(Student student);
}
Service layer implementation class
/**
 * @desc: Service layer implementation class
 * @author: YanMingXin
 * @create: 2021/10/15-12:16
 **/
@Service
public class StudentServiceImpl implements StudentService {

    @Autowired
    private StudentDao studentDao;

    @Override
    public List<Student> selectStudentsFromDB() {
        // A null wrapper means "no filter": select every row
        return studentDao.selectList(null);
    }

    @Override
    public void insertStudent(Student student) {
        studentDao.insert(student);
    }
}
The core configuration class is BatchConfiguration
import java.util.List;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepScope;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemWriter;
import org.springframework.batch.item.support.ListItemReader;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.annotation.EnableScheduling;
import org.springframework.scheduling.annotation.Scheduled;
// Imports for Student, StudentService and StudentVirtualDao depend on your package layout

/**
 * @desc: BatchConfiguration
 * @author: YanMingXin
 * @create: 2021/10/15-12:25
 **/
@Configuration
@EnableBatchProcessing
@EnableScheduling // needed for @Scheduled below; harmless if already enabled elsewhere
public class BatchConfiguration {

    /** Inject JobBuilderFactory */
    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    /** Inject StepBuilderFactory */
    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    /** Inject JobLauncher */
    @Autowired
    private JobLauncher jobLauncher;

    /** Inject custom StudentService */
    @Autowired
    private StudentService studentService;

    /**
     * Writer bean: receives one chunk of students per call and inserts each of them.
     */
    @Bean
    public ItemWriter<Student> writer() {
        return new ItemWriter<Student>() {
            @Override
            public void write(List<? extends Student> items) throws Exception {
                for (Student student : items) {
                    studentService.insertStudent(student);
                }
            }
        };
    }

    /**
     * Reader bean: hands out the simulated students one at a time and returns null
     * after the last one, which ends the step. It is step-scoped so that every job
     * run gets a fresh reader. (Returning the whole list from each read() call would
     * make the framework treat the list as a single item and never finish.)
     */
    @Bean
    @StepScope
    public ListItemReader<Student> reader() {
        // Simulated data acquisition
        return new ListItemReader<>(new StudentVirtualDao().getStudents());
    }

    /**
     * Processor bean: receives one student at a time; here it passes it through unchanged.
     */
    @Bean
    public ItemProcessor<Student, Student> processor() {
        return student -> student;
    }

    /**
     * Custom step
     */
    @Bean
    public Step studentStepOne() {
        return stepBuilderFactory.get("studentStepOne")
                .<Student, Student>chunk(1)
                .reader(reader())       // add reader
                .processor(processor()) // add processor
                .writer(writer())       // add writer
                .build();
    }

    /**
     * Custom job
     */
    @Bean
    public Job studentJob() {
        return jobBuilderFactory.get("studentJob")
                .flow(studentStepOne()) // add step
                .end()
                .build();
    }

    /**
     * Run the job every 5 seconds using Spring scheduling
     */
    @Scheduled(fixedRate = 5000)
    public void printMessage() {
        try {
            // A changing parameter makes every launch a new job instance
            JobParameters jobParameters = new JobParametersBuilder()
                    .addLong("time", System.currentTimeMillis())
                    .toJobParameters();
            jobLauncher.run(studentJob(), jobParameters);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
3.3 testing
1 s after project start
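Looking at the database, there are many tables in addition to the table defined by our entity class. These are the Spring Batch metadata tables (such as BATCH_JOB_INSTANCE, BATCH_JOB_EXECUTION, BATCH_JOB_EXECUTION_PARAMS, BATCH_STEP_EXECUTION and the two *_EXECUTION_CONTEXT tables), which the JobRepository uses to record job and step executions, their parameters and their status. The meaning of the individual fields is left for further study.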
4 Summary after hands-on practice
Spring Batch reads and writes data quickly, but the price is memory and database connection pool usage. If it is configured badly, exceptions can occur, so it needs to be configured correctly. Next, let's explore the source code:
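One setting that directly affects this trade-off is the chunk size: each chunk is handed to the writer in one call and, in a chunk-oriented step, committed as one transaction. The arithmetic can be sketched as follows. `ChunkMath.chunkCount` is a hypothetical helper written for this article, not a Spring Batch API, and it only counts writer calls for non-empty chunks.

```java
class ChunkMath {
    // Number of writer calls needed for itemCount items (ceiling division).
    static int chunkCount(int itemCount, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        return (itemCount + chunkSize - 1) / chunkSize;
    }
}
```

For the ten simulated students, chunk(1) as used in the sample step means ten writer calls, while a chunk size of 5 would need only two. Larger chunks mean fewer transactions but more items held in memory at once.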
4.1 JobBuilderFactory
Jobs are created using the simple factory pattern together with the builder pattern. JobBuilderFactory returns a JobBuilder which, once configured, builds a Job instance. This instance is the top-level component in Spring Batch and contains one or more Steps.
public class JobBuilderFactory {

    private JobRepository jobRepository;

    public JobBuilderFactory(JobRepository jobRepository) {
        this.jobRepository = jobRepository;
    }

    // Returns a JobBuilder that is already wired to the JobRepository
    public JobBuilder get(String name) {
        JobBuilder builder = new JobBuilder(name).repository(jobRepository);
        return builder;
    }
}
JobBuilder class
public class JobBuilder extends JobBuilderHelper<JobBuilder> {

    /**
     * Create a new builder for the job with the specified name.
     */
    public JobBuilder(String name) {
        super(name);
    }

    /**
     * Create a new job builder that will execute a step or sequence of steps.
     */
    public SimpleJobBuilder start(Step step) {
        return new SimpleJobBuilder(this).start(step);
    }

    /**
     * Create a new job builder that will execute the flow.
     */
    public JobFlowBuilder start(Flow flow) {
        return new FlowJobBuilder(this).start(flow);
    }

    /**
     * Create a new job builder that will execute a step or sequence of steps.
     */
    public JobFlowBuilder flow(Step step) {
        return new FlowJobBuilder(this).start(step);
    }
}
4.2 StepBuilderFactory
Let's look directly at the StepBuilder class:
public class StepBuilder extends StepBuilderHelper<StepBuilder> {

    public StepBuilder(String name) {
        super(name);
    }

    /**
     * Build a step with a custom tasklet, which does not necessarily process items.
     */
    public TaskletStepBuilder tasklet(Tasklet tasklet) {
        return new TaskletStepBuilder(this).tasklet(tasklet);
    }

    /**
     * Build a step that processes items in chunks of the size provided. To make this step
     * fault tolerant, call the SimpleStepBuilder's faultTolerant() method on the builder.
     *
     * @param <I> type of input
     * @param <O> type of output
     */
    public <I, O> SimpleStepBuilder<I, O> chunk(int chunkSize) {
        return new SimpleStepBuilder<I, O>(this).chunk(chunkSize);
    }

    public <I, O> SimpleStepBuilder<I, O> chunk(CompletionPolicy completionPolicy) {
        return new SimpleStepBuilder<I, O>(this).chunk(completionPolicy);
    }

    public PartitionStepBuilder partitioner(String stepName, Partitioner partitioner) {
        return new PartitionStepBuilder(this).partitioner(stepName, partitioner);
    }

    public PartitionStepBuilder partitioner(Step step) {
        return new PartitionStepBuilder(this).step(step);
    }

    public JobStepBuilder job(Job job) {
        return new JobStepBuilder(this).job(job);
    }

    /**
     * Create a new step builder that will execute the flow.
     */
    public FlowStepBuilder flow(Flow flow) {
        return new FlowStepBuilder(this).flow(flow);
    }
}
Reference documents:
https://docs.spring.io/spring-batch/docs/4.3.x/reference/html/index.html
https://www.jdon.com/springbatch.html