Using Spring's @Scheduled scheduling annotation (including distributed deployments)

Brief description

Task scheduling means running business logic at a given time or at a fixed frequency, and it is a fairly common requirement.

The JDK offers native solutions such as Timer and ScheduledThreadPoolExecutor. These classes are usually a good fit when the scheduling is embedded in a piece of business logic.
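
As a point of comparison, here is a minimal sketch of the JDK-native approach using ScheduledThreadPoolExecutor. The class and method names (JdkSchedulingDemo, runFixedRate) are made up for this illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class JdkSchedulingDemo {

    /**
     * Runs a task every periodMillis until it has run 'times' times.
     * Returns true if all runs completed within a generous timeout.
     */
    static boolean runFixedRate(int times, long periodMillis) {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        CountDownLatch remaining = new CountDownLatch(times);
        try {
            // First run immediately, then one run per period
            executor.scheduleAtFixedRate(remaining::countDown, 0, periodMillis, TimeUnit.MILLISECONDS);
            return remaining.await(periodMillis * times + 2_000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            // Periodic tasks never finish on their own, so stop the pool explicitly
            executor.shutdownNow();
        }
    }
}
```

Calling JdkSchedulingDemo.runFixedRate(3, 100) schedules three runs 100 ms apart and then shuts the executor down. Note that the pool has to be created and stopped by hand, which is the bookkeeping that @Scheduled takes over.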

This article focuses on the @Scheduled annotation. Like the JDK classes above, it is a single-process solution, but with some adaptation it can also be used in distributed deployments and covers most scheduling needs. The implementation ideas are outlined below.

Configuration

Enabling scheduling

To turn on scheduling, first add the @EnableScheduling annotation to the project (for example on a configuration class); otherwise the @Scheduled annotation has no effect.

Thread pool

Tasks run on threads, and different kinds of tasks may need to run in parallel, so the threads should be managed through a thread pool. The thread pool configuration options are summarized below.

(1) Not configured (default)

If nothing is configured, spring-boot automatically creates a ThreadPoolTaskScheduler bean to manage the threads that run tasks. The default parameter values of this thread pool are defined in the TaskSchedulingProperties class, as follows:

// pool
private int size = 1;

// thread
private String threadNamePrefix = "scheduling-";

The source code shows that this default thread pool delegates to the JDK's ScheduledThreadPoolExecutor class. That class uses an unbounded queue, so the pool never grows beyond its core size, which is 1 by default. A single thread cannot serve time-consuming tasks that need to run in parallel, so these parameters usually have to be adjusted to the business scenario.
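
The effect of the pool size can be reproduced with the JDK class alone. The following is a small sketch, not Spring's own code: the names PoolSizeDemo and shortTaskRunsDuringLongTask are invented here, and the 500 ms duration is arbitrary. It checks whether a short task manages to start while a long task is still occupying a thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolSizeDemo {

    /** Returns true if the short task started while the long task was still running. */
    static boolean shortTaskRunsDuringLongTask(int poolSize) {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(poolSize);
        CountDownLatch shortStarted = new CountDownLatch(1);
        // Long task: holds its thread for up to 500 ms while watching for the short task
        Callable<Boolean> longTask = () -> shortStarted.await(500, TimeUnit.MILLISECONDS);
        // Short task: only signals that it has started
        Runnable shortTask = shortStarted::countDown;
        try {
            Future<Boolean> result = executor.schedule(longTask, 0, TimeUnit.MILLISECONDS);
            executor.schedule(shortTask, 0, TimeUnit.MILLISECONDS);
            return result.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

With a pool size of 1 the short task has to wait in the queue until the long task ends, so the method returns false; with a pool size of 2 it should return true.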

(2) Spring configuration

spring-boot provides the TaskSchedulingAutoConfiguration class, which reads the thread pool configuration properties and builds the ThreadPoolTaskScheduler bean. The supported configuration items are:

spring:
  task:
    scheduling:
      threadNamePrefix: my-scheduler-task-
      pool:
        size: 3

Size the thread pool according to the number of @Scheduled tasks. As a rule of thumb, each kind of task needs its own thread; otherwise tasks interfere with each other, because long-running tasks occupy the threads and short tasks cannot run on time.

(3) Java code configuration

Unlike @Async asynchronous processing, scheduled tasks share a single thread pool, so this configuration method is rarely needed. The following is a simple example.

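import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.scheduling.concurrent.ThreadPoolTaskScheduler;

@Configuration
public class ScheduleConfig {

    private static final String THREAD_NAME_PREFIX = "my-scheduler-task-";

    @Bean("myTaskScheduler")
    public ThreadPoolTaskScheduler getThreadPoolTaskScheduler() {
        ThreadPoolTaskScheduler result = new ThreadPoolTaskScheduler();
        // Prefix for the names of the scheduler's threads
        result.setThreadNamePrefix(THREAD_NAME_PREFIX);
        // Number of threads available to scheduled tasks
        result.setPoolSize(3);
        return result;
    }
}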

Scheduling rules

@Scheduled accepts the following parameters:

cron: Runs the task on a schedule defined by a cron expression, for example every 5 minutes: 0 0/5 * * * ?
fixedDelay: Runs the task with a fixed delay, measured from the end of the previous run to the start of the next, in milliseconds.
fixedRate: Runs the task at a fixed frequency, measured between the starts of successive runs, in milliseconds.
initialDelay: The delay before the first run after the system starts, in milliseconds.

Of the cron, fixedDelay and fixedRate parameters, exactly one must be specified.
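
The difference between fixedRate and fixedDelay is easiest to see from start times. The sketch below only does the arithmetic; it does not call Spring. It assumes a single scheduler thread and a non-@Async task, so runs never overlap, and the names ScheduleTimeline, fixedRateStarts and fixedDelayStarts are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class ScheduleTimeline {

    /** Start times (ms) under fixedRate: planned every periodMs, but never before the previous run ends. */
    static List<Long> fixedRateStarts(long initialDelayMs, long periodMs, long taskMs, int runs) {
        List<Long> starts = new ArrayList<>();
        long previousEnd = 0;
        for (int k = 0; k < runs; k++) {
            long planned = initialDelayMs + k * periodMs;
            long start = Math.max(planned, previousEnd); // a late run starts as soon as the thread is free
            starts.add(start);
            previousEnd = start + taskMs;
        }
        return starts;
    }

    /** Start times (ms) under fixedDelay: each run starts delayMs after the previous run ends. */
    static List<Long> fixedDelayStarts(long initialDelayMs, long delayMs, long taskMs, int runs) {
        List<Long> starts = new ArrayList<>();
        long start = initialDelayMs;
        for (int k = 0; k < runs; k++) {
            starts.add(start);
            start = start + taskMs + delayMs;
        }
        return starts;
    }
}
```

For a task that takes 2000 ms with initialDelay = 1000 and a value of 5000, fixedRateStarts gives 1000, 6000, 11000 while fixedDelayStarts gives 1000, 8000, 15000. If the task takes longer than the period (say 7000 ms), the fixedRate runs simply follow each other back to back.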

Distributed

Most systems are now deployed as multiple instances, so we need to consider how to coordinate task execution across them. Common solutions and some personal thoughts follow.

Third party

Among third-party open source solutions, Quartz is the early classic, although its releases have not been very active in recent years. The newer XXL-JOB is iterated more actively and is currently recommended by many companies. It offers complete task management, monitoring, logging and so on; see its official documentation for details, which I will not repeat here.

Self-processing

Although the third-party open source solutions above are mature and full-featured, they are relatively heavyweight. For a small system with simple scheduling needs, @Scheduled can be adapted with little effort to do the job.

Simple as it is, this approach can be practical and robust. Two ways of doing it with redis follow.

(1) @Scheduled as the primary mechanism, redis as the auxiliary

When tasks annotated with @Scheduled run in a distributed environment, the obvious problem is that the same task may execute on several machines at the same time. The natural way to avoid this is a redis distributed lock. The lock duration can be set to 0.75 of the execution cycle. The following is pseudocode:

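	@Scheduled(fixedDelay = 60000, initialDelay = 1000)
	public void task1() {

		// Lock for 0.75 of the cycle (tryLock is a placeholder name for the RedisLock call)
		boolean isLock = redisLock.tryLock("my-task-1", (long) (60000 * 0.75));
		if (!isLock) return;

		// Task logic
		doSomething();
	}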

This approach leaves a fairly large and coarse error in the task cycle. Its advantage is simple logic, and it suits scenarios with low accuracy requirements.
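
To make the locking idea concrete without a redis server, the sketch below uses an in-memory stand-in for the lock. It is an assumption-laden illustration: TtlLock and LockedTaskDemo are invented names, the clock is passed in as a parameter, and a real implementation would issue an atomic redis command (such as SET with NX and PX) instead of touching a local map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** In-memory stand-in for a redis lock with an expiration time; for illustration only. */
final class TtlLock {
    private final ConcurrentMap<String, Long> expiryByKey = new ConcurrentHashMap<>();

    /** Returns true if the key was free or expired; it is then held until nowMillis + ttlMillis. */
    boolean tryLock(String key, long ttlMillis, long nowMillis) {
        boolean[] acquired = {false};
        // compute() is atomic per key, mirroring the atomicity a redis command would give
        expiryByKey.compute(key, (k, expiry) -> {
            if (expiry == null || expiry <= nowMillis) {
                acquired[0] = true;
                return nowMillis + ttlMillis;
            }
            return expiry;
        });
        return acquired[0];
    }
}

final class LockedTaskDemo {
    /** Counts how many instances get to run the task when each fires at cycleStart + its offset. */
    static int runsInCycle(TtlLock lock, long cycleStartMillis, long[] instanceOffsetsMillis) {
        int runs = 0;
        for (long offset : instanceOffsetsMillis) {
            // Lock for 0.75 of the 60 s cycle, as in task1
            if (lock.tryLock("my-task-1", 45_000, cycleStartMillis + offset)) {
                runs++;
            }
        }
        return runs;
    }
}
```

With three instances firing 0, 200 and 900 ms into a cycle, only the first acquires the lock, and by the next cycle (60 s later) the 45 s lock has expired so one instance can run again. The same sketch also hints at the coarseness: instances whose timers drift more than 45 s apart would both run within one cycle.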

(2) redis as the primary mechanism, @Scheduled as the auxiliary

Because the execution cycle is configured through @Scheduled, its accuracy is hard to guarantee in a distributed environment. Instead, @Scheduled can serve only as a periodic poll that tries to claim the right to execute, while the real execution cycle is governed by the expiration time of the redis key. The cycle accuracy is much better this way. The following is pseudocode:

Run at a fixed frequency:

	/*
	 * redis is the primary mechanism, @Scheduled is the auxiliary (run at a fixed frequency)
	 *
	 * note:
	 * a. The fixedDelay in the @Scheduled annotation only controls how often execution is attempted, and can usually be set fairly small.
	 * b. The real task period is the lock duration passed to redisLock.
	 */
	@Scheduled(fixedDelay = 5000, initialDelay = 1000)
	public void task2() {

		// Lock (tryLock is a placeholder name; realTaskPeriodMillis is the real task period)
		boolean isLock = redisLock.tryLock("my-task-2", realTaskPeriodMillis);
		if (!isLock) return;

		// Task logic
		doSomething();

	}

Run at a fixed interval:

	/*
	 * redis is the primary mechanism, @Scheduled is the auxiliary (run at a fixed interval)
	 *
	 * note:
	 * a. The fixedDelay in the @Scheduled annotation only controls how often execution is attempted, and can usually be set fairly small.
	 * b. The real task interval is the lock duration passed to redisLock.
	 */
	@Scheduled(fixedDelay = 5000, initialDelay = 1000)
	public void task3() {

		// Lock 1: prevent parallel runs (tryLock is a placeholder name)
		boolean isLock = redisLock.tryLock("my-task-3", realTaskIntervalMillis);
		if (!isLock) return;

		// Task logic
		doSomething();

		// Lock 2: restart the lock duration so the interval counts from the end of the task
		// (lock is a placeholder name)
		redisLock.lock("my-task-3", realTaskIntervalMillis);

	}
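
The two variants (fixed frequency and fixed interval) can be compared with a small timeline simulation. This is a sketch under simplifying assumptions: a single polling instance, the redis key modelled as one expiry timestamp, and invented names (RedisDrivenTimeline, startTimes).

```java
import java.util.ArrayList;
import java.util.List;

final class RedisDrivenTimeline {

    /**
     * Simulates one instance polling every pollMs and returns the task start times (ms).
     * fixedInterval = false models the fixed-frequency variant (one lock),
     * fixedInterval = true models the fixed-interval variant (lock again after the task).
     */
    static List<Long> startTimes(long pollMs, long periodMs, long taskMs,
                                 boolean fixedInterval, long horizonMs) {
        List<Long> starts = new ArrayList<>();
        long lockExpiresAt = 0; // the lock key, modelled as its expiry timestamp
        long now = 0;
        while (now < horizonMs) {
            if (now >= lockExpiresAt) {              // the lock attempt succeeds
                lockExpiresAt = now + periodMs;      // lock 1: duration = real period
                starts.add(now);
                now += taskMs;                       // the task logic runs
                if (fixedInterval) {
                    lockExpiresAt = now + periodMs;  // lock 2: restart the duration at task end
                }
            }
            now += pollMs; // fixedDelay: the next attempt comes pollMs after the method returns
        }
        return starts;
    }
}
```

With a 5 s poll, a 60 s period and a 12 s task over 150 s, the fixed-frequency variant starts at 0, 62000 and 124000 ms, and the fixed-interval variant at 0, 72000 and 144000 ms. The 2 s overshoot in the first case comes from the polling grid, which is why a smaller fixedDelay gives better accuracy.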

Run according to a cron expression (the timing accuracy can be tuned through the fixedDelay parameter of @Scheduled):

	/*
	 * redis is the primary mechanism, @Scheduled is the auxiliary (cron expression)
	 *
	 * note:
	 * a. The fixedDelay in the @Scheduled annotation only controls how often execution is attempted, and can usually be set fairly small.
	 * b. The real task period or interval is the lock duration passed to redisLock.
	 * c. CronHelper parses the cron expression and calculates the interval until the next run.
	 */
	@Scheduled(fixedDelay = 5000, initialDelay = 1000)
	public void task4() {

		// Lock until the next cron run (tryLock and nextIntervalMillis are placeholder names)
		boolean isLock = redisLock.tryLock("my-task-4", cronHelper.nextIntervalMillis());
		if (!isLock) return;

		// Task logic
		doSomething();
	}

The above is only pseudocode, but it shows that the adaptation cost is small and the result is flexible enough. For RedisLock, see the earlier article "Distributed Lock-java". As for the CronHelper class, similar code can probably be found online, but you might as well implement it yourself; it should be much more interesting than writing a sorting algorithm.
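
As a starting point for such a do-it-yourself CronHelper, here is a deliberately simplified sketch. It is not the author's class: the method name and signature are assumptions, only the second, minute and hour fields are honoured (the remaining three must be * or ?), and within a field only *, single numbers, comma lists and start/step forms are understood.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;
import java.util.BitSet;

/** Hypothetical, simplified CronHelper covering only the time-of-day fields. */
final class CronHelper {

    private CronHelper() {}

    /** Milliseconds from 'now' until the next matching second, strictly after 'now'. */
    static long nextIntervalMillis(String cron, LocalDateTime now) {
        String[] fields = cron.trim().split("\\s+");
        if (fields.length != 6) {
            throw new IllegalArgumentException("expected 6 fields: " + cron);
        }
        for (int i = 3; i < 6; i++) {
            if (!fields[i].equals("*") && !fields[i].equals("?")) {
                throw new IllegalArgumentException("day and month fields are not supported in this sketch");
            }
        }
        BitSet seconds = parse(fields[0], 60);
        BitSet minutes = parse(fields[1], 60);
        BitSet hours = parse(fields[2], 24);

        // Brute force: check every second of the next 24 hours
        LocalDateTime t = now.truncatedTo(ChronoUnit.SECONDS).plusSeconds(1);
        for (int i = 0; i < 86_400; i++, t = t.plusSeconds(1)) {
            if (seconds.get(t.getSecond()) && minutes.get(t.getMinute()) && hours.get(t.getHour())) {
                return ChronoUnit.MILLIS.between(now, t);
            }
        }
        throw new IllegalArgumentException("no matching time for: " + cron);
    }

    /** Turns one field into the set of allowed values in [0, size). */
    private static BitSet parse(String field, int size) {
        BitSet allowed = new BitSet(size);
        for (String part : field.split(",")) {
            if (part.equals("*")) {
                allowed.set(0, size);
            } else if (part.contains("/")) {
                String[] startAndStep = part.split("/");
                int start = startAndStep[0].equals("*") ? 0 : Integer.parseInt(startAndStep[0]);
                int step = Integer.parseInt(startAndStep[1]);
                if (step <= 0) {
                    throw new IllegalArgumentException("step must be positive: " + part);
                }
                for (int v = start; v < size; v += step) {
                    allowed.set(v);
                }
            } else {
                allowed.set(Integer.parseInt(part));
            }
        }
        return allowed;
    }
}
```

For example, CronHelper.nextIntervalMillis("0/5 * * * * ?", LocalDateTime.of(2024, 1, 1, 12, 0, 3)) returns 2000, because the next multiple of five seconds is 12:00:05. The per-second scan is slow but easy to verify; a production version would compute the next value per field and support the day, month and weekday fields.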

One remaining issue is that running tasks this way cannot guarantee load balancing across instances. If that is really required, it can be achieved with a redis queue, and the logic is not too complicated.

Personally, I believe:

With the help of redis, this self-processing method still offers high availability and good concurrency performance. Its main disadvantage is that the code semantics are not clear enough, and maintenance is easily affected by the timing parameters of the @Scheduled annotation. In real business code, encapsulate the logic as much as possible to improve readability.

Frequently Asked Questions

(1) Give the thread pool roughly as many threads as there are kinds of tasks. Too many threads are a waste; too few let long-running tasks interfere with the others.

(2) If a task must never run in parallel, protect it with a distributed lock.

Summary

The above is my personal experience. I hope it serves as a useful reference, and I appreciate your support.