Redis persistence mechanism

Keywords: Database Redis

Persistence

Redis supports two persistence mechanisms: RDB (Redis DataBase) and AOF (Append Only File). Redis keeps its data in memory, so a crash or process exit would otherwise lose everything. Persistence avoids this by writing the data to disk; on the next restart, Redis recovers the data from the persisted file.

For details, see the official Redis documentation on persistence.

RDB

Official description: RDB writes a point-in-time snapshot of the in-memory dataset to disk at specified time intervals. On recovery, Redis reads the snapshot file directly back into memory.

RDB principle?

To save an RDB file, Redis forks a child process that does the persistence work, so the parent process performs no disk I/O for the snapshot. Fork relies on copy-on-write (COW): right after the child is created, parent and child share the same memory pages. The parent keeps serving reads and writes, and each page it modifies is copied first, so the child still sees the data as it was at fork time.

Disadvantage of RDB: data written after the last snapshot may be lost. A common setup therefore uses RDB for full snapshots and AOF for incremental persistence.

fork: creates a copy of the current process. The new process starts with the same data as the original, but it is a separate process and runs as a child of the original.
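The fork behaviour described above can be tried out in a few lines of Python. This is only a sketch for Unix-like systems, not how Redis itself is implemented: the function name snapshot_with_fork and the written_after_fork key are made up for illustration. The child serializes the data it inherited at fork time, while the parent keeps modifying its own copy.

```python
import json
import os


def snapshot_with_fork(data):
    """Fork a child that serializes `data` as it was at fork time.

    Returns (snapshot_seen_by_child, live_data_in_parent).
    Unix only: os.fork is not available on Windows.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: it owns a copy-on-write view of the parent's memory.
        try:
            os.close(read_fd)
            with os.fdopen(write_fd, "w") as out:
                json.dump(data, out)
        finally:
            os._exit(0)

    # Parent: keep "serving writes" after the fork.
    os.close(write_fd)
    data["written_after_fork"] = True  # hypothetical key, for illustration
    with os.fdopen(read_fd) as src:
        snapshot = json.load(src)
    os.waitpid(pid, 0)
    return snapshot, data
```

The write made by the parent after the fork shows up in the live data but not in the snapshot, which is the same reason an RDB file reflects the dataset at the moment the child was created.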

How to trigger RDB snapshots?

(1) Configuration file (default settings)

################################ SNAPSHOTTING  ################################
# Unless specified otherwise, by default Redis will save the DB:
#   * After 3600 seconds (an hour) if at least 1 key changed
#   * After 300 seconds (5 minutes) if at least 100 keys changed
#   * After 60 seconds if at least 10000 keys changed
#
# You can set these explicitly by uncommenting the three following lines.
# The official defaults are the following three rules (they can be modified):
# 1. Snapshot after 1 hour if at least 1 key changed
# 2. Snapshot after 5 minutes if at least 100 keys changed
# 3. Snapshot after 1 minute if at least 10000 keys changed
save 3600 1
save 300 100
save 60 10000

With these settings, Redis generates a dump.rdb file whenever one of the rules is met. Copy dump.rdb to a separate backup machine; otherwise the file is lost if the host itself fails. Note also that each new snapshot overwrites the previous dump.rdb.
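The way the three save rules combine can be modelled in a few lines. This is a simplified sketch, not the server's code: should_snapshot and its parameter names are assumptions, and the exact boundary handling inside Redis may differ.

```python
# (seconds, changes) pairs mirroring "save 3600 1", "save 300 100", "save 60 10000"
SAVE_RULES = [(3600, 1), (300, 100), (60, 10000)]


def should_snapshot(seconds_since_last_save, changed_keys, rules=SAVE_RULES):
    """Return True if any save rule is satisfied.

    A rule is satisfied when enough time has passed AND enough keys changed.
    """
    return any(
        seconds_since_last_save > seconds and changed_keys >= changes
        for seconds, changes in rules
    )
```

For example, 50 changed keys after 61 seconds match no rule, while a single changed key does trigger a snapshot once more than an hour has passed.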

(2) How to recover?

Move dump.rdb from the backup machine to the Redis working directory (the directory set by dir in the configuration, often the installation directory) and start the server. Redis loads the file back into memory on startup.
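The copy step can be scripted. The sketch below assumes the server is stopped while the file is put in place; restore_rdb and both paths are hypothetical and should be replaced with the real backup location and the directory your configuration uses.

```python
import shutil
from pathlib import Path


def restore_rdb(backup_file, redis_dir, filename="dump.rdb"):
    """Copy a backed-up snapshot into the directory Redis loads from.

    `backup_file` and `redis_dir` are placeholders for your own paths.
    Returns the path of the restored file.
    """
    target = Path(redis_dir) / filename
    shutil.copy2(backup_file, target)  # copy2 also preserves timestamps
    return target
```

After the copy, starting the server is what actually loads the snapshot.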

(3) The save and bgsave commands generate dump.rdb immediately

  • save only saves: it blocks the server until the snapshot is finished
  • bgsave takes the snapshot asynchronously in the background, so Redis can still respond to client requests while it runs
  • In short, the difference is that save blocks the server and bgsave does not

Advantages and disadvantages of RDB?

Advantages: suitable for large-scale data recovery; a good fit when the requirements for data integrity and consistency are low

Disadvantages: all modifications made after the last snapshot are lost; because of fork, in the worst case (every page modified while the snapshot runs) memory use can approach twice the dataset size, which has to be planned for.

AOF(Append Only File)

AOF persistence records the database state by logging every write command the Redis server executes. Commands are only ever appended to the file; existing content is not modified in place. After a restart, Redis replays the logged commands to recover the data.
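The append-and-replay idea can be illustrated with a toy store. This is a sketch only: TinyAofStore is a made-up class, it uses a plain-text log rather than the real AOF format, and it assumes keys and values contain no whitespace.

```python
import os


class TinyAofStore:
    """Toy key-value store that appends every write command to a log file."""

    def __init__(self, aof_path):
        self.aof_path = aof_path
        self.data = {}
        if os.path.exists(aof_path):
            self._replay()  # recover state by re-executing the log

    def _apply(self, command, key, value=None):
        if command == "SET":
            self.data[key] = value
        elif command == "DEL":
            self.data.pop(key, None)

    def _append(self, *parts):
        # Append only: existing lines are never rewritten.
        with open(self.aof_path, "a", encoding="utf-8") as log:
            log.write(" ".join(parts) + "\n")

    def _replay(self):
        with open(self.aof_path, encoding="utf-8") as log:
            for line in log:
                parts = line.split()
                if parts:
                    self._apply(*parts)

    def set(self, key, value):
        self._apply("SET", key, value)
        self._append("SET", key, value)

    def delete(self, key):
        self._apply("DEL", key)
        self._append("DEL", key)
```

Creating a second TinyAofStore on the same file plays the role of a restart: the new instance ends up with the same data purely from the log.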

Configuration file

############################## APPEND ONLY MODE ###############################
# ......

# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check https://redis.io/topics/persistence for more information.

# 1. By default, AOF persistence is off
# appendonly no
# Enable AOF:
appendonly yes

# 2. The default AOF name is appendonly.aof
# The name of the append only file (default: "appendonly.aof")
appendfilename "appendonly.aof"

# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# Default per second
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".

# 3. appendfsync
# 3.1 always: synchronous persistence; every change is flushed to disk immediately. Slowest, but best data integrity
# 3.2 everysec: the default and recommended setting; fsync runs asynchronously once per second, so up to about one second of writes can be lost in a crash
# 3.3 no: never call fsync; the operating system decides when to flush
# appendfsync always
appendfsync everysec
# appendfsync no

# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving, the durability of Redis is
# the same as "appendfsync none". In practical terms, this means that it is
# possible to lose up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
#
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.

no-appendfsync-on-rewrite no

# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
#
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (if no rewrite has happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.

auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

# An AOF file may be found to be truncated at the end during the Redis
...

Once appendonly is enabled in the configuration file, Redis generates appendonly.aof. After a crash, restart the Redis server; it re-executes all the commands recorded in appendonly.aof to recover the data.

How to load AOF and RDB when they coexist?

When AOF is enabled, Redis loads appendonly.aof first. If appendonly.aof is corrupted and loading fails, repair it with: redis-check-aof --fix appendonly.aof

What is the configuration strategy of AOF?

# Default per second
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".

# 3. appendfsync
# 3.1 always: synchronous persistence; every change is flushed to disk immediately. Slowest, but best data integrity
# 3.2 everysec: the default and recommended setting; fsync runs asynchronously once per second, so up to about one second of writes can be lost in a crash
# 3.3 no: never call fsync; the operating system decides when to flush
# appendfsync always
appendfsync everysec
# appendfsync no
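The three appendfsync policies differ only in when fsync is called after a write. The sketch below models that difference; AofWriter and its fsync and clock parameters are assumptions made so the behaviour can be observed, and unlike this sketch Redis performs the everysec fsync in a background thread.

```python
import os
import time


class AofWriter:
    """Append-only writer that models the three appendfsync policies."""

    def __init__(self, path, policy="everysec", fsync=os.fsync, clock=time.monotonic):
        if policy not in ("always", "everysec", "no"):
            raise ValueError("policy must be 'always', 'everysec' or 'no'")
        self.policy = policy
        self._file = open(path, "ab")
        self._fsync = fsync
        self._clock = clock
        self._last_sync = clock()

    def append(self, command):
        self._file.write(command)
        self._file.flush()  # hand the bytes to the operating system
        if self.policy == "always":
            self._sync()  # durable after every single write
        elif self.policy == "everysec" and self._clock() - self._last_sync >= 1.0:
            self._sync()  # at most about one second of writes at risk
        # policy "no": never fsync here; the OS flushes when it wants

    def _sync(self):
        self._fsync(self._file.fileno())
        self._last_sync = self._clock()

    def close(self):
        self._file.close()
```

With always, every append pays for an fsync; with everysec, most appends return without one, which is why it is the usual compromise between speed and safety.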

ReWrite

(1) What is ReWrite?

AOF works by appending to a file, so the file keeps growing. To keep this in check, Redis provides a rewrite mechanism: when the AOF file exceeds a configured threshold, Redis compacts it, keeping only the minimal set of commands needed to rebuild the current data. A rewrite can also be triggered manually with the bgrewriteaof command.

(2) Rewriting principle

When the AOF file grows too large, Redis forks a child process to rewrite it. The child traverses its copy of the in-memory data and writes each record as a single write command (for example, one SET per string key) to a new file, which then replaces the old one.
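A small model shows why the rewritten file is shorter. This is a sketch under simplifying assumptions: replay and rewrite_from_memory are hypothetical helpers, only SET, INCR and DEL on string keys are handled, and commands are plain tuples rather than the real AOF format.

```python
def replay(commands):
    """Rebuild the in-memory state by executing a command log in order."""
    state = {}
    for name, key, *args in commands:
        if name == "SET":
            state[key] = args[0]
        elif name == "INCR":
            state[key] = str(int(state.get(key, "0")) + 1)
        elif name == "DEL":
            state.pop(key, None)
    return state


def rewrite_from_memory(state):
    """Emit one SET per surviving key, ignoring the command history."""
    return [("SET", key, value) for key, value in state.items()]
```

A log of five commands such as SET n 1, INCR n, INCR n, SET tmp x, DEL tmp collapses to the single command SET n 3, and replaying the rewritten log yields the same state as replaying the original.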

(3) Trigger mechanism

Redis records the AOF size after the last rewrite. With the default configuration, a rewrite is triggered when the AOF file has grown to twice that size (auto-aof-rewrite-percentage 100) and is larger than 64 MB (auto-aof-rewrite-min-size 64mb).

############################## APPEND ONLY MODE ###############################
...
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.

# The default is 64mb; busy production systems usually need a much larger value
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
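The two settings combine into a single check. The sketch below is a simplified model of that check, not the server's code: should_rewrite and its parameter names are assumptions, with defaults mirroring the values shown above.

```python
MB = 1024 * 1024


def should_rewrite(current_size, base_size, percentage=100, min_size=64 * MB):
    """Decide whether an automatic AOF rewrite would be triggered.

    current_size: AOF size now, in bytes
    base_size:    AOF size after the last rewrite (or at startup), in bytes
    """
    if percentage == 0:
        return False  # a percentage of zero disables automatic rewrites
    if current_size <= min_size:
        return False  # still too small to be worth rewriting
    base = base_size or 1  # avoid dividing by zero for an empty base
    growth = current_size * 100 // base - 100
    return growth >= percentage
```

With the defaults, a file that was 64 MB after the last rewrite triggers a new rewrite once it reaches 128 MB, while a 60 MB file never does, however much it grew.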

AOF advantages and disadvantages

Advantages:

  • Synchronization per modification: appendfsync always persists synchronously. Every data change is written to disk immediately. Performance is poor, but data integrity is good
  • Synchronization per second: appendfsync everysec syncs asynchronously once per second. If the machine goes down, up to about one second of data can be lost
  • No synchronization: appendfsync no never calls fsync; the operating system decides when to flush

Disadvantages:

  • For the same dataset, AOF files are much larger than RDB files, and recovery is slower than from RDB
  • AOF runs slower than RDB. The everysec policy performs well, and with appendfsync no performance is about the same as RDB

RDB and AOF summary

(1) RDB persistence snapshots the data at specified time intervals.

(2) AOF persistence records every write operation the Redis server receives. When the server restarts, these commands are re-executed to recover the original data.

(3) Both can be enabled at the same time. In that case Redis loads the AOF file first on restart, because the data in the AOF file is generally more complete than in the RDB file.

(4) Can AOF be used alone? This is not recommended: RDB is better suited to backing up the database and to fast restarts, and it acts as a safeguard against potential bugs in AOF.


Posted by sdizier on Sun, 19 Sep 2021 04:13:00 -0700