From ae1497593c2769c0c076c2f9adbe3b812232a432 Mon Sep 17 00:00:00 2001
From: asahi <mikiyashiki@outlook.com>
Date: Tue, 14 Oct 2025 10:39:38 +0800
Subject: [PATCH] =?UTF-8?q?doc:=20=E9=98=85=E8=AF=BBredis=E6=96=87?=
 =?UTF-8?q?=E6=A1=A3?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 中间件/redis/redis.md | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/中间件/redis/redis.md b/中间件/redis/redis.md
index 0e4a5c1..b51586a 100644
--- a/中间件/redis/redis.md
+++ b/中间件/redis/redis.md
@@ -234,6 +234,9 @@
       - [Failure Dection](#failure-dection)
         - [`PFAIL` flag](#pfail-flag)
         - [`FAIL` flag](#fail-flag)
+    - [Configuration handling, propagation, and failovers](#configuration-handling-propagation-and-failovers)
+      - [Cluster current epoch](#cluster-current-epoch)
+      - [Configuration epoch](#configuration-epoch)
 
 
 # redis
@@ -3695,6 +3698,8 @@ redis cluster failure detection用于识别`a master or a replica is no longer r
   - 由于master node并不负责任何slot，故而且没有实际加入集群，故而其flag可以被正常清除，并在清除后等待后续对该节点的配置并加入集群
 - 该node为master，并且已经处于reachable状态，但是很长时间内(`N * NODE_TIMEOUT`)都没有探知到replica promotion。那么，在这种场景下，最好可以重新加入到该集群
 
+> If a node flagged as `FAIL` meets any of the conditions above, the other nodes may clear its `FAIL` flag.
+
 在`PFAIL`到`FAIL`的状态转换中，使用了agreement的形式，但是该一致性较弱:
 - node在固定的时间窗口内接收来自其他节点的视角信息，其在不同时间从不同的节点接收信息。
 - 当节点探知到`FAIL`场景时，其并不保证所有node都能接收到`FAIL message`，可能有些节点因为网络问题处于`network partition`的状态。
@@ -3705,3 +3710,21 @@ redis failure detection拥有如下要求：
 在如下两种场景中：
 - 如果`majority of masters`将该node标记为`FAIL`，由于failure detection和其产生的链式效应，最终所有其他节点都会将该节点标记为`FAIL`
 - 当仅有`minority of masters`将node标记为`FAIL`时，replica promotion并不会发生，并且每个节点都会清除FAIL状态
+
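+The `FAIL` flag is only used as a trigger for replica promotion. In theory, a replica could act independently and start a promotion as soon as its master becomes unreachable; if the master were in fact still reachable by the `majority of masters`, those masters would refuse to acknowledge the promotion.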
+
+### Configuration handling, propagation, and failovers
+#### Cluster current epoch
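+Redis Cluster uses a concept called `epoch` to give events an incrementing version number. When several nodes provide conflicting information, the epoch lets a node decide which information is the most up to date.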
+
+`currentEpoch` is a 64-bit unsigned number.
+
+When a Redis Cluster node is created, whether it is a master or a replica, its `currentEpoch` is 0.
+
+Every time a packet is received from another node, if the sender's epoch is greater than the local `currentEpoch`, the local `currentEpoch` is updated to the sender's epoch.
+
+The epoch acts as the cluster's logical clock: information carrying a greater epoch takes precedence because it is more up to date.
+
+#### Configuration epoch
+Every master advertises its `configEpoch` in its ping/pong packets, together with the `set of slots it serves`.
+