Deniz Kartal

Distributed Consensus: Raft vs Paxos

2025-01-10 · 2 min read
computer-science, engineering

Distributed consensus is one of the fundamental problems in computer science. Two prominent solutions, Paxos and Raft, take different approaches to achieving the same goal: getting multiple nodes to agree on a value.

The Consensus Problem

A consensus protocol must guarantee three properties:

  • Agreement: All nodes decide on the same value
  • Validity: The decided value was proposed by some node
  • Termination: All non-faulty nodes eventually decide
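To make the three properties concrete, here is a minimal sketch that checks the outcome of one finished run against them. The function name `check_consensus` and its inputs are hypothetical, made up for illustration. Note that termination is really a statement about "eventually", so a check on a finite run can only approximate it.

```python
from typing import Dict, Hashable, Optional, Set


def check_consensus(
    proposed: Set[Hashable],
    decided: Dict[str, Optional[Hashable]],
    faulty: Set[str],
) -> Dict[str, bool]:
    """Check one finished run against the three properties.

    proposed: values that some node proposed
    decided:  node id -> decided value, or None if the node never decided
    faulty:   ids of nodes that crashed during the run
    """
    values = {v for v in decided.values() if v is not None}
    return {
        # Agreement: at most one distinct value was decided.
        "agreement": len(values) <= 1,
        # Validity: every decided value was actually proposed.
        "validity": values <= proposed,
        # Termination: every non-faulty node reached a decision.
        "termination": all(
            v is not None for node, v in decided.items() if node not in faulty
        ),
    }


result = check_consensus(
    proposed={"x", "y"},
    decided={"n1": "x", "n2": "x", "n3": None},
    faulty={"n3"},
)
print(result)
```

Here `n3` crashed without deciding, which is allowed: termination only constrains the non-faulty nodes.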

Paxos: The Classic Approach

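Paxos, introduced by Leslie Lamport, is notoriously difficult to understand. A single round of basic Paxos runs in two phases, each made up of a request and a response:

  1. Prepare (phase 1a): A proposer picks a proposal number n and sends a prepare request to the acceptors
  2. Promise (phase 1b): Each acceptor promises to ignore proposals numbered below n and reports any value it has already accepted
  3. Accept (phase 2a): With promises from a majority, the proposer sends an accept request carrying the highest-numbered value reported, or its own value if none was reported
  4. Accepted (phase 2b): Each acceptor accepts the request unless it has since promised a higher number; a value is chosen once a majority has accepted it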

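Most of the complexity comes from handling failures and concurrent proposals: competing proposers can keep preempting each other with higher proposal numbers, and a proposer that learns of a previously accepted value must adopt it instead of its own.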
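The message flow of a single Paxos round can be sketched in memory as follows. This is a simplified illustration rather than a production implementation: the `Acceptor` class and `propose` function are hypothetical names, calls are synchronous, and there is no networking, persistence, or retry logic.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Acceptor:
    promised: int = -1                  # highest proposal number promised
    accepted_n: int = -1                # number of the accepted proposal, if any
    accepted_value: Optional[str] = None

    def on_prepare(self, n: int) -> Optional[Tuple[int, Optional[str]]]:
        """Promise to ignore proposals numbered below n, or reject."""
        if n <= self.promised:
            return None                 # already promised n or higher
        self.promised = n
        return (self.accepted_n, self.accepted_value)

    def on_accept(self, n: int, value: str) -> bool:
        """Accept unless a higher-numbered promise was made."""
        if n < self.promised:
            return False
        self.promised = n
        self.accepted_n = n
        self.accepted_value = value
        return True


def propose(acceptors: List[Acceptor], n: int, value: str) -> Optional[str]:
    """Run one proposal round; return the chosen value, or None on failure."""
    majority = len(acceptors) // 2 + 1

    # Phase 1: prepare / promise.
    promises = [p for p in (a.on_prepare(n) for a in acceptors) if p is not None]
    if len(promises) < majority:
        return None

    # If any acceptor already accepted a value, adopt the one with the
    # highest proposal number instead of our own.
    _, prev_value = max(promises, key=lambda p: p[0])
    if prev_value is not None:
        value = prev_value

    # Phase 2: accept / accepted.
    acks = sum(a.on_accept(n, value) for a in acceptors)
    return value if acks >= majority else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, n=1, value="A")
second = propose(acceptors, n=2, value="B")
print(first, second)  # prints "A A"
```

The second proposer wanted "B" but ends up re-proposing "A", because every promise it received reported "A" as already accepted. That adoption rule is what keeps a chosen value from being overwritten.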

Raft: Understandability First

Raft was designed with understandability as a primary goal. It decomposes consensus into:

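  1. Leader election: Nodes vote in numbered terms, and a candidate that wins a majority becomes the leader for that term
  2. Log replication: The leader appends client commands to its log and replicates the entries to followers
  3. Safety: Restrictions on who can win an election and how logs are matched ensure that committed entries are never lost, even when nodes fail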

The key insight is that having a strong leader simplifies the protocol significantly: log entries flow in one direction only, from the leader to the followers.
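As a rough sketch of how election and replication look from a follower's side, the snippet below models the two checks a Raft node makes: whether to grant a vote, and whether to accept entries from a leader. The `RaftNode` class and its method names are hypothetical, and timers, networking, persistence, and commit tracking are all left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Entry = Tuple[int, str]  # (term, command)


@dataclass
class RaftNode:
    current_term: int = 0
    voted_for: Optional[str] = None
    log: List[Entry] = field(default_factory=list)

    def last_log(self) -> Tuple[int, int]:
        """(term, index) of the last entry; index is 1-based, 0 if empty."""
        return (self.log[-1][0], len(self.log)) if self.log else (0, 0)

    def _observe_term(self, term: int) -> None:
        # A newer term resets the vote cast in the older one.
        if term > self.current_term:
            self.current_term = term
            self.voted_for = None

    def on_request_vote(self, term: int, candidate_id: str,
                        last_log_term: int, last_log_index: int) -> bool:
        if term < self.current_term:
            return False
        self._observe_term(term)
        # Election restriction: the candidate's log must be at least as
        # up to date as ours (compare last term first, then length).
        up_to_date = (last_log_term, last_log_index) >= self.last_log()
        if up_to_date and self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False

    def on_append_entries(self, term: int, prev_index: int,
                          prev_term: int, entries: List[Entry]) -> bool:
        if term < self.current_term:
            return False
        self._observe_term(term)
        # Consistency check: we must hold the entry just before the new ones.
        if prev_index > 0 and (
            len(self.log) < prev_index or self.log[prev_index - 1][0] != prev_term
        ):
            return False
        for offset, entry in enumerate(entries):
            pos = prev_index + offset  # 0-based position in self.log
            if pos < len(self.log) and self.log[pos][0] != entry[0]:
                del self.log[pos:]     # conflict: drop it and what follows
            if pos >= len(self.log):
                self.log.append(entry)
        return True


follower = RaftNode(current_term=2, log=[(1, "set x=1"), (2, "set x=2")])
stale_vote = follower.on_request_vote(3, "c1", last_log_term=1, last_log_index=5)
fresh_vote = follower.on_request_vote(3, "c2", last_log_term=2, last_log_index=2)
appended = follower.on_append_entries(3, prev_index=2, prev_term=2,
                                      entries=[(3, "set x=3")])
print(stale_vote, fresh_vote, appended)  # prints "False True True"
```

Candidate `c1` has a longer log but an older last term, so it is refused. This is the election restriction that keeps a node with missing committed entries from becoming leader.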

Implementation Considerations

When implementing consensus:

  • Raft is generally easier to implement correctly
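  • Paxos is a family of protocols with many variants, which can make it more flexible in certain scenarios
  • Both need a majority of nodes to make progress, so a minority partition must stall rather than decide on its own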
  • Performance characteristics vary by workload
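The partition behaviour comes down to simple majority arithmetic, which applies to both algorithms. The helper names below are hypothetical and the sketch assumes a fixed cluster size with no membership changes.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a quorum."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many nodes can fail while a quorum still exists."""
    return (cluster_size - 1) // 2


def can_make_progress(cluster_size: int, reachable: int) -> bool:
    """True if this side of a partition can still reach a quorum."""
    return reachable >= majority(cluster_size)


for n in (3, 4, 5):
    print(n, majority(n), tolerated_failures(n))
# prints:
# 3 2 1
# 4 3 1
# 5 3 2

# A 5-node cluster split 3/2: only the larger side keeps deciding.
print(can_make_progress(5, 3), can_make_progress(5, 2))  # prints "True False"
```

Going from 3 to 4 nodes raises the quorum size without tolerating any extra failure, which is why odd cluster sizes are the usual choice.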

Real-World Usage

  • etcd and Consul use Raft
  • Google Chubby uses Paxos
  • ZooKeeper uses Zab, a leader-based atomic broadcast protocol (similar to Paxos)

The choice often depends on the specific requirements and the team's familiarity with the algorithm.