SESSION ON-DEMAND

All Things P99

The event for developers who care about P99 percentiles and high-performance, low-latency applications

Square’s Lessons Learned from Implementing a Key-Value Store with Raft

Put simply, Raft makes a system (e.g., a key-value store or an indexing system) more fault tolerant: it replicates state across servers so the system stays available despite server and network failures. Raft has been gaining ground because it is simple without sacrificing consistency or performance.

Although we’ll cover Raft’s building blocks, this talk is not about the Raft algorithm itself; it is about the micro-lessons one can learn from building fault-tolerant, strongly consistent distributed systems using Raft. These include the majority agreement rule (quorum), the write-ahead log, split votes and randomness to reduce contention, heartbeats, split-brain syndrome, snapshots and log replay, client request dedupe and idempotency, consistency guarantees (linearizability), leases and stale reads, batching and streaming, parallelizing persisting and broadcasting, version control, and more!
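As a rough illustration of two of these building blocks, the sketch below shows the majority (quorum) rule and a randomized election timeout. It is a minimal sketch and not the speaker's implementation: the function names are hypothetical, and the 150 to 300 ms range is the example range suggested in the Raft paper.

```python
import random


def quorum_size(cluster_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many servers can fail while a majority can still be reached."""
    return cluster_size - quorum_size(cluster_size)


def has_quorum(acks: int, cluster_size: int) -> bool:
    """True once enough servers (counting the leader itself) have acknowledged."""
    return acks >= quorum_size(cluster_size)


def election_timeout_ms(rng: random.Random, low: int = 150, high: int = 300) -> int:
    """Pick a randomized timeout (assumed range, in milliseconds).

    Randomizing the timeout makes it unlikely that several followers
    become candidates at the same moment, which reduces split votes.
    """
    return rng.randint(low, high)


if __name__ == "__main__":
    rng = random.Random()
    for n in (3, 4, 5):
        print(n, "servers: quorum", quorum_size(n), "tolerates", tolerated_failures(n))
    print("election timeout (ms):", election_timeout_ms(rng))
```

Note that a 4-server cluster needs the same quorum of 3 as a 5-server cluster but tolerates only one failure, which is one reason odd cluster sizes are common.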

And believe it or not, you might be using some of these techniques without even realizing it! This talk is inspired by the Raft paper (raft.github.io), publications and courses on Raft, and an attempt to implement a key-value store using Raft as a side project.

18 minutes
This session is from the P99 CONF 2024 livestream.

Omar Elgabry, Software Engineer at Square

A software engineer (B.S. in CS & SWE, Jul '15), writer, teacher, and hackathon winner with a polymorphic personality. Born in Egypt; has lived and worked in India and Turkey, and is currently based in Canada.