In Redis, after the SLAVEOF (or REPLICAOF) command is executed, the slave node needs to clear its existing data and resynchronize. The main reasons are as follows:
1. Ensure data consistency
Core objective: ensure that the slave node's data is completely consistent with the master node's.
Problem scenario:
- If the slave node already holds other data (for example, it was previously a replica of another master node, or was itself an independent master node), keeping that data would mix old and new data.
- The master node's data may conflict with the slave node's (for example, the same key with different values), resulting in logical data errors.
2. Trigger conditions for full synchronization
When a slave node executes SLAVEOF to connect to a master node, Redis uses one of the following two synchronization mechanisms:
(1) Full Sync
Trigger condition:
- The slave node is connecting to the master node for the first time.
- The master and slave replication IDs do not match (for example, the master node has failed over).
- The slave node's replication offset (repl_offset) is not within the range of the master node's replication backlog buffer (repl_backlog).
Operation process:
- The master node generates an RDB snapshot of its current data and sends it to the slave node.
- The slave node clears its own data and loads the RDB file.
- The master node buffers the write commands it receives while the RDB is being generated, and sends them to the slave node after the RDB transfer completes (incremental synchronization).
(2) Partial Sync
Trigger condition:
- The master and slave replication IDs match.
- The slave node's replication offset is still within the range of the master node's repl_backlog.
Operation process:
- The master node directly sends the incremental commands that the slave node is missing (no data needs to be cleared).
- The slave node applies these commands to catch up with the master node state.
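The trigger conditions for the two mechanisms can be sketched as a small decision function. This is only a simplified model, not Redis's actual implementation: the `MasterState` class and `choose_sync` function are hypothetical names, and the sketch ignores details such as the secondary replication ID that Redis 4.0 and later keep after a failover.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MasterState:
    replid: str                # current replication ID of the master
    backlog_first_offset: int  # offset of the oldest byte still in the backlog
    backlog_len: int           # number of bytes currently held in the backlog


def choose_sync(master: MasterState,
                slave_replid: Optional[str],
                slave_offset: int) -> str:
    """Return "full" or "partial" following the trigger conditions above."""
    # First connection: the slave has no replication ID to offer.
    if slave_replid is None:
        return "full"
    # Replication ID mismatch, for example after a failover.
    if slave_replid != master.replid:
        return "full"
    # The next byte the slave needs must still be inside the backlog window.
    next_needed = slave_offset + 1
    window_end = master.backlog_first_offset + master.backlog_len
    if not (master.backlog_first_offset <= next_needed <= window_end):
        return "full"
    return "partial"


master = MasterState(replid="abc", backlog_first_offset=1000, backlog_len=500)
print(choose_sync(master, None, 0))      # full: first connection
print(choose_sync(master, "abc", 1200))  # partial: ID matches, offset in range
print(choose_sync(master, "abc", 100))   # full: offset fell out of the backlog
```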
3. Necessity to clear data
- Data must be cleared for full synchronization:
The slave node needs to reconstruct the data set based on the RDB snapshot of the master node. If the original data is retained, the data will be inconsistent.
# Example: the slave node automatically performs a FLUSHALL-style flush before loading the RDB
[Slave node log] MASTER <-> REPLICA sync: Flushing old data
- No need to clear data for partial synchronization:
The incremental commands are applied on top of the data the slave node already holds, so keeping that data is safe.
4. Risk of data consistency
Scenario | Risk |
---|---|
Data not cleared + full synchronization | The master node's RDB data is mixed with the slave node's old data, causing problems such as overwritten keys and lost expiration times. |
Data not cleared + partial synchronization | Safe only if the replication ID and offset match; otherwise the data may be incomplete or logically inconsistent. |
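The first row of the table can be illustrated with a toy model that uses plain dictionaries in place of real Redis datasets. The `load_rdb` function and the sample keys are hypothetical; the point is only that keys absent from the master's snapshot survive if the slave does not clear its data first.

```python
def load_rdb(slave_data: dict, rdb_snapshot: dict, clear_first: bool) -> dict:
    """Toy model of a slave loading the master's snapshot."""
    # Either start from an empty dataset or keep whatever the slave had.
    result = {} if clear_first else dict(slave_data)
    # Keys present in the snapshot overwrite the local values.
    result.update(rdb_snapshot)
    return result


master_snapshot = {"user:1": "alice", "counter": "10"}
old_slave_data = {"user:1": "bob", "session:99": "stale"}

mixed = load_rdb(old_slave_data, master_snapshot, clear_first=False)
clean = load_rdb(old_slave_data, master_snapshot, clear_first=True)

print(mixed == master_snapshot)  # False: "session:99" survives from the old data
print(clean == master_snapshot)  # True
```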
How to avoid full synchronization (reduce the overhead of clearing the database)
(1) Configure repl-backlog-size appropriately
- Increase the master node's replication backlog buffer (default 1MB) so that a longer disconnection can still be recovered with partial synchronization:
# Master node configuration
# Adjust according to the business write volume
repl-backlog-size 64mb
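One common rule of thumb for choosing the value is to multiply the write throughput by the longest disconnection you want to tolerate, then add headroom. The sketch below assumes that rule; the function name `estimate_backlog_bytes`, the safety factor of 2 and the sample workload are all hypothetical and should be replaced with figures measured on your own master node.

```python
def estimate_backlog_bytes(write_bytes_per_sec: float,
                           max_disconnect_sec: float,
                           safety_factor: float = 2.0) -> int:
    """Rough lower bound for repl-backlog-size, in bytes."""
    if write_bytes_per_sec < 0 or max_disconnect_sec < 0 or safety_factor < 1:
        raise ValueError("rate and duration must be >= 0, safety_factor >= 1")
    return int(write_bytes_per_sec * max_disconnect_sec * safety_factor)


# Hypothetical workload: 500 KB/s of replicated writes,
# 60 seconds of tolerated disconnection.
size = estimate_backlog_bytes(500 * 1024, 60)
print(f"repl-backlog-size {size // (1024 * 1024)}mb")
```

Rounding the result up to a convenient value (for example 64mb) leaves some extra margin for write bursts.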
(2) Avoid frequent master-slave switching
- Reduce the number of master node failovers (for example, by tuning the Sentinel parameter down-after-milliseconds) to avoid replication ID changes.
(3) Persist the replication ID and offset
- When the slave node restarts, partial synchronization can still be triggered if its replication ID and offset are still valid:
# Slave node configuration
# Use disk-backed synchronization (the default before Redis 7.0)
repl-diskless-sync no
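To check which replication ID and offset a node currently holds, you can look at the output of `INFO replication`. The sketch below parses that text format into a dictionary. The `parse_info_replication` helper is a hypothetical name, and the sample output is made up for illustration; real output contains more fields and different values.

```python
def parse_info_replication(text: str) -> dict:
    """Parse the key:value lines of INFO replication output into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Replication".
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


# Illustrative sample only; the values are made up.
sample = """# Replication
role:slave
master_host:127.0.0.1
master_port:6379
master_link_status:up
slave_repl_offset:1200
master_replid:8371b4fb1155b71f4a04d3e1bc3e18c4a990aeeb
master_repl_offset:1200
"""

info = parse_info_replication(sample)
print(info["master_replid"], info["slave_repl_offset"])
```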
Example: Log analysis of synchronization process
(1) Full synchronization log
# Master node log
[19042] 01 Jan 12:00:00.123 * Replica 127.0.0.1:6380 asks for synchronization
[19042] 01 Jan 12:00:00.123 * Full resync requested by replica 127.0.0.1:6380
[19042] 01 Jan 12:00:00.123 * Starting BGSAVE for SYNC with target: disk
# Slave node log
[19043] 01 Jan 12:00:00.125 * MASTER <-> REPLICA sync started
[19043] 01 Jan 12:00:00.125 * MASTER <-> REPLICA sync: Flushing old data
[19043] 01 Jan 12:00:00.125 * MASTER <-> REPLICA sync: Loading DB in memory
(2) Partial synchronization log
# Master node log
[19042] 01 Jan 12:00:00.123 * Replica 127.0.0.1:6380 requests partial resynchronization
[19042] 01 Jan 12:00:00.123 * Partial resynchronization request accepted
# Slave node log
[19043] 01 Jan 12:00:00.125 * MASTER <-> REPLICA sync: Master accepted a Partial Resynchronization
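When scanning longer log files, the two cases can be told apart by the marker phrases that appear in the excerpts above. The following is a minimal sketch: `classify_sync` is a hypothetical helper, and the exact log wording may differ between Redis versions, so treat the marker strings as assumptions to verify against your own logs.

```python
def classify_sync(log_lines):
    """Return "full", "partial", or "unknown" based on marker phrases."""
    full_markers = ("Full resync requested", "Flushing old data")
    partial_markers = ("Partial resynchronization request accepted",
                       "Master accepted a Partial Resynchronization")
    for line in log_lines:
        if any(marker in line for marker in full_markers):
            return "full"
        if any(marker in line for marker in partial_markers):
            return "partial"
    return "unknown"


slave_log = [
    "[19043] 01 Jan 12:00:00.125 * MASTER <-> REPLICA sync started",
    "[19043] 01 Jan 12:00:00.125 * MASTER <-> REPLICA sync: Flushing old data",
]
print(classify_sync(slave_log))  # full
```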
Summary
- Data must be cleared for full synchronization: this ensures that the slave node uses the master's RDB snapshot as its baseline and avoids data inconsistency.
- Data does not need to be cleared for partial synchronization: incremental commands from the replication backlog buffer are applied on top of the existing data, so keeping it is safe.
- Optimization suggestions: adjust repl-backlog-size and reduce the frequency of master-slave switching to avoid full synchronization where possible and to reduce the impact of clearing the database on the service.