Redis auto-increment with timed expiry in a distributed environment

Requirements and business scenarios

  Without requirements or a business scenario, talking about technology is building a castle in the air.

Preconditions

● distributed deployment
● multiple instances

Business requirements

● each business type has its own auto-incrementing document number that carries the business ID.
● order number rule: business ID + date + 4-digit auto-incrementing number
● the 4-digit auto-incrementing number is the sequence within the day and is reset at midnight

Design

   Because there are multiple instances, a distributed lock is required when operating on the auto-incrementing number. The number also has to be reset at midnight every day. A natural fit is to cache a key in Redis with an expiration time that runs until midnight. Redis also provides an atomic increment command (INCR). For the distributed lock, consider the lock provided by Redisson.
Another point to consider is concurrency at the moment the key expires at midnight.
● the Redisson distributed lock handles operations from multiple distributed instances
● set an expiration time on the key that runs until midnight
● handle concurrency at the moment the key expires at midnight
● keep the increment an atomic operation

Implementation

Get the number of milliseconds from now until midnight of the next day

    import java.util.Calendar;

    public Long getNowToNextDayMilliseconds() {
        // Get the current time
        Calendar calendar = Calendar.getInstance();
        // Move to the next day
        calendar.add(Calendar.DAY_OF_YEAR, 1);
        // Set hours, minutes, seconds and milliseconds to 0, i.e. midnight
        calendar.set(Calendar.HOUR_OF_DAY, 0);
        calendar.set(Calendar.MINUTE, 0);
        calendar.set(Calendar.SECOND, 0);
        calendar.set(Calendar.MILLISECOND, 0);
        // Subtract the current time to get the difference
        return calendar.getTimeInMillis() - System.currentTimeMillis();
    }
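
The same calculation can also be sketched with java.time, taking the current time as a parameter so that it can be checked against fixed inputs. This is only an illustration, not the original implementation; the class name MidnightTtl and the method millisUntilNextMidnight are hypothetical.

```java
import java.time.Duration;
import java.time.ZoneId;
import java.time.ZonedDateTime;

// Hypothetical helper, shown only as a sketch.
final class MidnightTtl {
    private MidnightTtl() {}

    /** Milliseconds from now until the start of the next day in now's time zone. */
    static long millisUntilNextMidnight(ZonedDateTime now) {
        // atStartOfDay applies the zone's rules, so days with a DST shift are handled.
        ZonedDateTime nextMidnight = now.toLocalDate().plusDays(1).atStartOfDay(now.getZone());
        return Duration.between(now, nextMidnight).toMillis();
    }

    public static void main(String[] args) {
        System.out.println(millisUntilNextMidnight(ZonedDateTime.now(ZoneId.systemDefault())));
    }
}
```

Passing the clock in rather than reading it inside the method makes the boundary cases (one second before midnight, exactly midnight) easy to verify.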

Format string

   The final output format is type + yyyyMMdd + 4-digit auto-incrementing number

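    import java.time.LocalDate;
    import java.time.format.DateTimeFormatter;

    // DateTimeFormatter is immutable and thread-safe; a shared SimpleDateFormat is not
    private static final DateTimeFormatter DATE_FORMAT = DateTimeFormatter.ofPattern("yyyyMMdd");

    private String getCode(String type, String number) {
        String date = LocalDate.now().format(DATE_FORMAT);
        StringBuilder builder = new StringBuilder();
        builder.append(type)
               .append(date);
        // Left-pad the number with zeros to 4 digits
        for (int i = number.length(); i < 4; i++) {
            builder.append('0');
        }
        builder.append(number);
        return builder.toString();
    }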
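
As a compact variant of the formatting step, the zero padding can be left to String.format. The sketch below takes the date and the sequence as parameters so the output can be checked; the class OrderCodeFormat and its format method are hypothetical names, not part of the original code.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, shown only as a sketch.
final class OrderCodeFormat {
    // DateTimeFormatter is immutable, so one shared instance is safe across threads.
    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyyMMdd");

    private OrderCodeFormat() {}

    /** Builds type + yyyyMMdd + sequence, left-padding the sequence with zeros to 4 digits. */
    static String format(String type, LocalDate day, long sequence) {
        return type + day.format(DAY) + String.format("%04d", sequence);
    }

    public static void main(String[] args) {
        System.out.println(format("IS", LocalDate.of(2021, 10, 24), 7)); // IS202110240007
    }
}
```

A sequence above 9999 is not truncated; the number simply becomes wider than 4 digits, which matches what the padding loop does.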

Core logic

    import java.util.concurrent.TimeUnit;
    import org.redisson.api.RLock;

    public String getOrderCode(String key) {
        Object value = redisTemplate.opsForValue().get(key);
        // First if: the value exists, so increment it and return directly
        if (null != value) {
            return getCode(key, redisTemplate.opsForValue().increment(key).toString());
        }
        // The value is missing: the key has probably just expired at midnight and must be reset to 0.
        // Both cross-instance and in-process concurrency have to be handled; the Redisson lock covers the cross-instance case.
        // In-process concurrency could use lock() or tryLock(); this scheme uses tryLock inside while (true).
        // Branch 1, lock acquired: a thread that waited for the lock may enter after an earlier holder has already
        // initialized the key, so check for null again before setting.
        // Branch 2, lock not acquired: check whether the thread holding the lock has already set the value; if so, increment and return.
        RLock lock = redissonClient.getLock(CommonConstant.ORDER_CODE_LOCK_KEY + key);
        try {
            while (true) {
                // With the lock: initialize the value and its expiration time. Without the lock: keep re-reading the key.
                // Second if: the wait is given in microseconds, so this is close to a non-blocking attempt
                if (lock.tryLock(CommonConstant.INTEGER_FIVE, TimeUnit.MICROSECONDS)) {
                    // Third if: only initialize when the key is still missing
                    if (null == redisTemplate.opsForValue().get(key)) {
                        redisTemplate.opsForValue().set(key, "0", getNowToNextDayMilliseconds(), TimeUnit.MILLISECONDS);
                    }
                    return getCode(key, redisTemplate.opsForValue().increment(key).toString());
                } else {
                    value = redisTemplate.opsForValue().get(key);
                    // Fourth if: the lock holder has already initialized the key
                    if (null != value) {
                        return getCode(key, redisTemplate.opsForValue().increment(key).toString());
                    }
                }
            }
        } catch (InterruptedException e) {
            throw new BizException(BasicDataExceptionEnum.ORDER_CODE_CREATE_FAIL);
        } finally {
            // Release the lock only if this thread actually holds it
            if (lock.isLocked() && lock.isHeldByCurrentThread()) {
                lock.unlock();
            }
        }
    }

  To understand this code, it is enough to focus on the four ifs.

First if

   If the value exists, increment it and return directly. The Redis INCR command is atomic, and Redis executes commands on a single thread, which ensures thread safety. It also ensures that the values obtained by multiple processes are unique.

Second if

  When the value does not exist, it has to be set. Checking and then setting is not atomic, and in a distributed deployment the set from instance A may overwrite the value set by instance B, so a distributed lock is required. The Redisson RLock implements the java.util.concurrent.locks.Lock interface, so the distributed lock can be attempted with tryLock. If the lock is acquired, proceed to the next step.

Third if

   Even after the distributed lock is acquired, in-process concurrency still has to be considered, mainly the threads waiting at the critical section. When the first thread to get the lock finishes, it releases the lock. A waiting thread can then acquire the lock and enter the same logic, so a null check is needed before setting the value.

Fourth if

   If the lock is not acquired, there is no need to keep looping just to acquire it, because by this time the thread holding the lock may already have set the initial value. So another null check is done here.
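
The control flow of the four checks can be tried out without Redis. The following is a minimal sketch and not the original implementation: a ConcurrentHashMap stands in for Redis, a ReentrantLock stands in for the Redisson lock (a real RLock would span JVMs), and the class DailyCounterSketch with its next and expire methods is hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical in-memory model of the four checks; not the original code.
final class DailyCounterSketch {
    // Stand-in for Redis: a missing entry models an expired key.
    private final ConcurrentHashMap<String, AtomicLong> store = new ConcurrentHashMap<>();
    // Stand-in for the distributed lock.
    private final ReentrantLock lock = new ReentrantLock();

    long next(String key) {
        AtomicLong counter = store.get(key);
        if (counter != null) {                              // first if: key exists
            return counter.incrementAndGet();
        }
        try {
            while (true) {
                if (lock.tryLock(5, TimeUnit.MILLISECONDS)) {   // second if: lock acquired
                    try {
                        AtomicLong current = store.get(key);
                        if (current == null) {              // third if: still missing, initialize
                            current = new AtomicLong(0);
                            store.put(key, current);
                        }
                        return current.incrementAndGet();
                    } finally {
                        lock.unlock();
                    }
                } else {
                    counter = store.get(key);
                    if (counter != null) {                  // fourth if: holder already initialized
                        return counter.incrementAndGet();
                    }
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the lock", e);
        }
    }

    /** Models the key expiring at midnight. */
    void expire(String key) {
        store.remove(key);
    }
}
```

The sketch only models the branching. It does not model one property of real Redis: INCR on a key that does not exist creates the key without an expiration time, so a key that expires between the read and the increment would come back without a TTL.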

Test

  To verify the code, a concurrent test scenario is needed.

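    import java.util.Set;
    import java.util.concurrent.BrokenBarrierException;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.CountDownLatch;
    import java.util.concurrent.CyclicBarrier;
    import org.junit.Assert;
    import org.junit.Test;

    @Test
    public void get_order_code_multi_thread_test() throws Exception {
        // Delete the key to simulate concurrent calls when the key does not exist
        redisTemplate.opsForValue().getOperations().delete("IS");
        CyclicBarrier barrier = new CyclicBarrier(100);
        CountDownLatch latch = new CountDownLatch(100);
        // 100 threads add to this set, so it must be a concurrent set; a plain HashSet is not thread-safe
        Set<String> result = ConcurrentHashMap.newKeySet(100);
        for (int i = 0; i < 100; i++) {
            new Thread(() -> {
                try {
                    // Wait until all 100 threads are ready, then start together
                    barrier.await();
                    String code = commonBiz.getOrderCode("IS");
                    System.out.println(code);
                    result.add(code);
                } catch (InterruptedException | BrokenBarrierException e) {
                    e.printStackTrace();
                } finally {
                    // Always count down so the main thread cannot wait forever
                    latch.countDown();
                }
            }).start();
        }
        latch.await();
        System.out.println(result.size());
        Assert.assertEquals(100, result.size());
    }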

  100 threads are simulated here. The CyclicBarrier makes all 100 threads start requesting an order number at the same moment, and the CountDownLatch makes the test wait until all 100 threads have finished. Each order number obtained is put into a set. If the final set size is 100, no two threads received the same order number under concurrency, and the test passes.
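
The same idea can be packaged as a small reusable harness. The sketch below is an assumption-laden variant rather than the original test: it releases the threads with a start latch instead of a CyclicBarrier, collects results in a concurrent set, and uses an AtomicLong as a stand-in generator so that it runs without Redis. The class UniquenessHarness and the method collect are hypothetical names.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical harness, shown only as a sketch.
final class UniquenessHarness {
    private UniquenessHarness() {}

    /** Calls generator once on each of the given number of threads, released together; returns the distinct results. */
    static Set<String> collect(int threads, Supplier<String> generator) {
        Set<String> results = ConcurrentHashMap.newKeySet();   // safe for concurrent add
        CountDownLatch start = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(threads);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            for (int i = 0; i < threads; i++) {
                pool.execute(() -> {
                    try {
                        start.await();                          // hold every worker at the gate
                        results.add(generator.get());
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        done.countDown();
                    }
                });
            }
            start.countDown();                                  // open the gate
            if (!done.await(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return results;
    }

    public static void main(String[] args) {
        AtomicLong counter = new AtomicLong();                  // stand-in for the real generator
        System.out.println(collect(100, () -> "IS" + counter.incrementAndGet()).size());
    }
}
```

In a real test the supplier would call the order number method against a running Redis, and the assertion would compare the size of the returned set with the thread count.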

Summary

   The difficulty of this requirement is the moment at midnight when the key expires and multiple processes, as well as multiple threads in the same process, try to set it at the same time. A distributed lock ensures that only one process performs the operation. The set is not atomic here: reading the value with get, finding it empty, and then setting it to 0 are separate steps. Some collections provide atomic methods such as putIfAbsent(), but this get-then-set sequence has no such guarantee, so atomicity can only be ensured with a lock. Here the mutual exclusion of the distributed lock is reused to make the get-then-set atomic. The critical section also needs attention: consider not only the first thread that gets the lock, but also the second thread that gets the lock after the first one releases it.
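
For comparison, this is roughly what an atomic check-then-set looks like with putIfAbsent on an in-memory map. It is only an illustration of the collection method mentioned above and does not talk to Redis; the class PutIfAbsentSketch and its next method are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical in-memory illustration; not the original code.
final class PutIfAbsentSketch {
    private final ConcurrentHashMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    long next(String key) {
        AtomicLong fresh = new AtomicLong(0);
        // putIfAbsent stores fresh only when the key is missing, and does so atomically.
        // It returns the value that was already there, or null if fresh was stored.
        AtomicLong existing = counters.putIfAbsent(key, fresh);
        AtomicLong counter = (existing != null) ? existing : fresh;
        return counter.incrementAndGet();
    }

    public static void main(String[] args) {
        PutIfAbsentSketch sketch = new PutIfAbsentSketch();
        System.out.println(sketch.next("IS"));
        System.out.println(sketch.next("IS"));
    }
}
```

Spring Data Redis also exposes a setIfAbsent method on ValueOperations, including an overload that takes a timeout. Whether it could replace the lock in this scheme is not something the original code tests, so treat that as an open option rather than a recommendation.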

Posted by phence on Sun, 24 Oct 2021 08:30:21 -0700