Analysis of the Raft Algorithm

2017/01/06 19:45

Preface
In the previous article, "ZAB protocol and Paxos algorithm", I mentioned that ZAB, the consistency protocol used in ZooKeeper, is essentially a simplification and optimization of Paxos. Paxos is complex (mainly because there is no leader/follower relationship among its concurrent processes) and can even run into livelock, which makes concrete implementations harder still. The Raft consensus algorithm introduced below emerged against this background.
Raft is a consensus algorithm designed by Diego Ongaro and John Ousterhout of Stanford University, with understandability as a primary goal. In 2013 they released the paper "In Search of an Understandable Consensus Algorithm". Since then Raft has been implemented in more than ten programming languages; one of the best-known implementations is the one in etcd, which Google's Kubernetes uses as its backing store for cluster state and service discovery.

About Raft
Raft's design is driven by two goals. The first is understandability: among designs that provide the same functionality, the more understandable one wins. The second is definiteness for real systems: Raft specifies each technical detail precisely, so that implementers know exactly what to build.
To achieve these two goals, Raft decomposes the consensus problem into three subproblems:
1. Leader election: elect a Leader, which is responsible for handling client requests
2. Log replication: the Leader replicates log entries to the other servers and keeps them in sync
3. Safety: guarantee that all servers execute the same sequence of commands

Basic concepts
1. Roles
Each server is in one of three states: Leader, Follower, or Candidate
Leader: at most one server in the cluster is in the Leader state at a time; it is responsible for handling all client requests
Follower: every node starts in the Follower state; a Follower responds to the Leader's log synchronization requests and to Candidates' vote requests
Candidate: the state a Follower moves into when it starts a new Leader election; it is the intermediate state between Follower and Leader
The transitions among the three states are shown in the following figure (source: Internet):
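As a textual complement to the figure, the transitions can also be written down as a small table. This is only a minimal sketch: the event names are hypothetical labels chosen for illustration, and the Leader-to-Follower transition on seeing a higher term comes from the Raft paper rather than from the description above.

```python
from enum import Enum


class Role(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


# Allowed transitions, keyed by (current role, event).
# The event names are hypothetical labels used only in this sketch.
TRANSITIONS = {
    (Role.FOLLOWER, "election_timeout"): Role.CANDIDATE,
    (Role.CANDIDATE, "election_timeout"): Role.CANDIDATE,  # retry in a new term
    (Role.CANDIDATE, "won_majority"): Role.LEADER,
    (Role.CANDIDATE, "discovered_leader"): Role.FOLLOWER,
    (Role.LEADER, "discovered_higher_term"): Role.FOLLOWER,
}


def next_role(role: Role, event: str) -> Role:
    """Return the role after `event`; events that do not apply leave the role unchanged."""
    return TRANSITIONS.get((role, event), role)
```

Note that there is no direct Follower-to-Leader entry in the table: a server always has to pass through the Candidate state first.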

2. Term
Raft uses the concept of a Term, which can be thought of as a period. Raft divides the running time of the whole system into a sequence of Terms of varying length and numbers them with increasing integers. Each Term begins with an election, during which one or more servers in the Candidate state compete to become the new Leader. There are two possible outcomes:
1. If a server wins the election, it serves as the Leader for the rest of the Term
2. If no Leader is elected, the Term number is incremented and a new election starts in the new Term
For a more intuitive view, see the figure below (source: Internet):

In other words, every increase of the Term number corresponds to a new election, and Raft guarantees that there is at most one Leader in any given Term. Next, let's look at the three independent subproblems.
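The relationship between elections and term numbers can be illustrated with a toy timeline. This is a rough sketch only: the function name `run_terms` and the server ids are hypothetical, and a real system would of course not know the election results in advance.

```python
from typing import List, Optional, Tuple


def run_terms(election_results: List[Optional[str]],
              first_term: int = 1) -> List[Tuple[int, Optional[str]]]:
    """Assign one term number per election.

    Each element of `election_results` is the winner's id, or None for an
    election in which nobody obtained a majority. Either way, the next
    election takes place in a new, larger term.
    """
    timeline = []
    term = first_term
    for winner in election_results:
        timeline.append((term, winner))  # at most one leader per term
        term += 1
    return timeline
```

For example, `run_terms(["s1", None, "s3"])` returns `[(1, 's1'), (2, None), (3, 's3')]`: term 2 ends without a leader, so the cluster moves straight on to term 3.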

Raft protocol steps
1. Leader election
When the system starts, all servers are in the Follower state. If there is a leader, it periodically sends heartbeats to the other servers to tell them that it is the leader. If a follower receives no heartbeat for a certain period (the election timeout), it assumes that there is no leader and that a leader election is required.
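The heartbeat check can be sketched as a simple timer. The class name `FollowerTimer` is hypothetical, and the randomized 150 to 300 ms timeout is an assumption taken from the Raft paper's suggestion, not something stated above; time is passed in explicitly so that the sketch stays deterministic.

```python
import random


class FollowerTimer:
    """Tracks when a follower last heard from the leader (times in seconds)."""

    def __init__(self, now: float, timeout_range=(0.150, 0.300), rng=None):
        self._rng = rng if rng is not None else random.Random()
        self._timeout_range = timeout_range
        self.reset(now)

    def reset(self, now: float) -> None:
        """Call on every heartbeat: restart the countdown with a fresh random timeout."""
        self.last_heartbeat = now
        self.timeout = self._rng.uniform(*self._timeout_range)

    def leader_presumed_dead(self, now: float) -> bool:
        """True once no heartbeat has arrived for at least the current timeout."""
        return now - self.last_heartbeat >= self.timeout
```

Picking a different random timeout on each server makes it less likely that several followers time out at exactly the same moment.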
To start an election, the Follower increments its Term number, switches to the Candidate state, and sends RequestVote RPCs to the other servers in the cluster. It stays in this state until one of the following three events occurs:
1. It wins the election: the Candidate receives votes from a majority of the servers, becomes the Leader, and then sends heartbeats to the other servers to announce this.
2. Another server wins the election: while waiting, the Candidate receives an RPC from a server claiming to be the Leader. If the Term number in that RPC is greater than or equal to the Candidate's own Term number, the Candidate recognizes the Leader and returns to the Follower state; otherwise it rejects the RPC and remains a Candidate.
3. A period of time passes without a new Leader being elected: in this case the Candidate increments its Term and starts a new election. This happens when several Followers become Candidates at the same time and split the vote, so that none of them obtains a majority.
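The three outcomes above can be sketched as three handlers on a candidate object. This is an illustrative sketch, not a full implementation: the class and method names are hypothetical, networking is left out, and the rule that a candidate votes for itself is taken from the Raft paper.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Candidate:
    node_id: str
    term: int
    cluster_size: int
    role: str = "candidate"
    votes: Set[str] = field(default_factory=set)

    def __post_init__(self):
        self.votes.add(self.node_id)  # assumption from the paper: vote for itself

    def on_vote_granted(self, voter_id: str) -> None:
        """Outcome 1: become Leader once a strict majority has voted for us."""
        if self.role != "candidate":
            return
        self.votes.add(voter_id)
        if len(self.votes) > self.cluster_size // 2:
            self.role = "leader"

    def on_leader_message(self, leader_term: int) -> bool:
        """Outcome 2: accept a leader whose term is >= ours, otherwise reject it."""
        if self.role == "candidate" and leader_term >= self.term:
            self.term = leader_term
            self.role = "follower"
            return True
        return False

    def on_election_timeout(self) -> None:
        """Outcome 3: nobody won, so start a new election in the next term."""
        if self.role == "candidate":
            self.term += 1
            self.votes = {self.node_id}
```

In a five-server cluster, a candidate needs three votes including its own, so it becomes Leader after two other servers grant their votes.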

2. Log replication
Log replication keeps the nodes consistent with one another; the operations in this phase are what provide both consistency and high availability. Once elected, the leader takes charge of client requests: every request must be handled by the leader first. The commands carried by these requests are stored as log entries. After receiving a client command, the leader appends it to the end of its log and then sends AppendEntries RPCs to the other servers in the cluster so that they replicate the new entry. Once a majority of servers have replicated it, the leader applies the command to its state machine and returns the result to the client.
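The flow just described (append, replicate, apply after a majority) can be sketched in a few lines. Everything here is a simplification under stated assumptions: the `Leader` and `Follower` classes are hypothetical, the RPC is a plain method call, commands use a made-up `key=value` format, and retries and consistency checks are omitted.

```python
from typing import Dict, List, Optional


class Follower:
    def __init__(self, reachable: bool = True):
        self.log: List[str] = []
        self.reachable = reachable

    def append_entries(self, entries: List[str]) -> bool:
        """Stand-in for the AppendEntries RPC; True means the entries were stored."""
        if not self.reachable:
            return False
        self.log.extend(entries)
        return True


class Leader:
    def __init__(self, followers: List[Follower]):
        self.log: List[str] = []
        self.followers = followers
        self.state: Dict[str, str] = {}  # toy key-value state machine

    def handle_client_command(self, command: str) -> Optional[str]:
        """Append, replicate, and apply only if a majority stored the command."""
        self.log.append(command)
        acks = 1  # the leader's own copy counts
        for follower in self.followers:
            if follower.append_entries([command]):
                acks += 1
        cluster_size = len(self.followers) + 1
        if acks > cluster_size // 2:
            key, value = command.split("=", 1)
            self.state[key] = value
            return "ok"
        return None  # not committed yet; a real leader would keep retrying
```

With one of two followers unreachable, the command is still stored on two of three servers and is applied; with both unreachable, the leader keeps the entry in its log but does not apply it.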
The log structure is shown in the following figure (source: Internet):

Each log entry contains two things: the command itself and the Term number. In addition, a Log Index identifies the position of each entry in the log. Once a majority of servers have stored an entry in their logs, that entry can be considered committed. For example, in the figure above, the entries up to Log Index 7 can be committed.
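The entry layout and the majority rule can be sketched as follows. This is a minimal sketch with hypothetical names, and it assumes for simplicity that every server's log agrees with the leader's up to that server's last index; the numbers in the usage note are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LogEntry:
    term: int      # term in which the leader received the command
    command: str   # the operation to apply to the state machine


def highest_majority_index(last_indexes: List[int]) -> int:
    """Highest log index that is stored on a majority of servers.

    `last_indexes` holds, for every server including the leader, the index
    of the last entry it stores (1-based; 0 means an empty log).
    """
    ordered = sorted(last_indexes, reverse=True)
    majority = len(ordered) // 2 + 1
    # The majority-th largest value is present on at least `majority` servers.
    return ordered[majority - 1]
```

For five servers whose logs end at indexes 8, 5, 8, 2 and 7, `highest_majority_index([8, 5, 8, 2, 7])` returns 7: three servers hold entry 7, but only two hold entry 8.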

3. Safety
Safety is the mechanism that guarantees every node executes the same sequence of commands. For example, a follower may be unavailable while the current leader commits some commands, and that follower may later be elected leader. The new leader could then overwrite previously committed log entries with new ones, causing nodes to execute different sequences. Safety ensures that any elected leader's log contains all previously committed entries.
To achieve safety, Raft adds two constraints:
1. Only a server whose log contains all committed commands can be elected leader.
2. A new leader can consider earlier entries truly committed only once it has committed a command from its own current Term.
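The two constraints can be sketched as two small predicates. The function names are hypothetical. The comparison used for the first constraint (last term first, then log length) is how the Raft paper enforces it during voting; the description above does not spell out that mechanism, so treat it as an assumption here.

```python
def candidate_log_is_up_to_date(cand_last_term: int, cand_last_index: int,
                                voter_last_term: int, voter_last_index: int) -> bool:
    """Constraint 1: a voter grants its vote only if the candidate's log is
    at least as up to date as its own (a later last term wins; on equal
    terms the longer log wins)."""
    if cand_last_term != voter_last_term:
        return cand_last_term > voter_last_term
    return cand_last_index >= voter_last_index


def leader_may_commit(entry_term: int, current_term: int,
                      replica_count: int, cluster_size: int) -> bool:
    """Constraint 2: counting replicas commits an entry only if the entry
    belongs to the leader's current term; entries from older terms become
    committed indirectly, once a current-term entry after them commits."""
    return entry_term == current_term and replica_count > cluster_size // 2
```

Because a candidate needs votes from a majority, and every committed entry is stored on a majority, at least one voter holds each committed entry and will refuse a candidate whose log lacks it.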

Summary
Compared with Paxos, Raft is easier to understand and clearer to implement, which is why it has been adopted so widely in just a few years. ZAB is essentially a simplification and optimization of Paxos, so Raft and ZAB still have a lot in common. I plan to compare the two in a future article.
