
SESSION ON-DEMAND

All Things P99

The event for developers who care about P99 percentiles and high-performance, low-latency applications

Square Engineering’s “Fail Fast, Retry Soon” Performance Optimization Technique

Learn how to build resilient systems, reduce failure rates, and improve application latency by applying a core distributed-systems technique: “fail fast, retry soon”.
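As a rough illustration of the idea, here is a minimal Python sketch, not the implementation presented in the session. It assumes the wrapped operation accepts a timeout in seconds and raises `TimeoutError` when it cannot finish in time. The names `fail_fast_retry_soon` and `make_flaky`, and all default values, are hypothetical and chosen only for this example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def fail_fast_retry_soon(
    operation: Callable[[float], T],
    attempt_timeout: float = 0.1,   # assumed value: keep each attempt short
    max_attempts: int = 3,          # assumed value: small retry budget
    base_delay: float = 0.02,       # assumed value: first backoff ceiling
    max_delay: float = 0.2,         # assumed value: cap on any backoff
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(timeout) with a short timeout, retrying on TimeoutError."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            # Fail fast: the operation is expected to give up after
            # attempt_timeout seconds instead of waiting on a slow reply.
            return operation(attempt_timeout)
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise  # Retry budget exhausted: surface the failure.
            # Retry soon: capped exponential backoff with full jitter, so
            # many clients retrying at once do not move in lockstep.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0.0, ceiling))
    raise AssertionError("unreachable")


def make_flaky(failures: int) -> Callable[[float], str]:
    """Hypothetical stand-in for a remote call: times out `failures` times."""
    calls = {"n": 0}

    def op(timeout: float) -> str:
        calls["n"] += 1
        if calls["n"] <= failures:
            raise TimeoutError(f"no reply within {timeout}s")
        return "ok"

    return op
```

Calling `fail_fast_retry_soon(make_flaky(2))` lets two attempts time out and returns the result of the third. Retrying like this is only safe when the operation is idempotent, and the timeout and backoff values should be tuned from measured latency rather than copied from this sketch.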

19 minutes
Fill out the form to watch this session from the P99 CONF 2024 livestream. You’ll also get access to all available recordings.

Resources

This talk is inspired by a real production use case in which DynamoDB p99 and max latency dropped from more than 10 s to about 500 ms. AWS articles (in particular M. Brooker’s writings) and the AWS SDK source code were great resources for digging into these techniques.

Omar Elgabry, Software Engineer at Square

A software engineer (B.S. CS & SWE, Jul '15), writer, teacher, and hackathon winner with a polymorphic personality. Born in Egypt; has lived and worked in India and Turkey, and is currently based in Canada.