Low-Latency Engineering Tech Talks

Browse the full library of P99 CONF tech talks and decks. Discover how experts tackle low-latency, high-performance distributed computing challenges from a wide range of perspectives.

Cache Me If You Can: How Grafana Labs Scaled Up Their Memcached 42x & Cut Costs Too

Danny Kopping

Senior Software Engineer at Grafana Labs

Our cloud database stores billions of files in object storage. With petabytes of data being queried every day, we started…

High Performance on a Low Budget

Gwen Shapira

Co-founder & CPO of Nile

It is one thing to solve performance challenges when you have plenty of time, money, and expertise available. Many performance…

From 1M to 1B Features Per Second: Scaling ShareChat’s ML Feature Store

Andrei Manakov

Staff Software Engineer at ShareChat

Ivan Burmistrov

Senior Staff Software Engineer at ShareChat

ShareChat’s Ivan Burmistrov and Andrei Manakov walk through how they built a low latency ML Feature Store based on ScyllaDB which…

Corporate Open Source Anti-Patterns: A Decade Later

Bryan Cantrill

CTO of Oxide Computer Company

A little over a decade ago, I gave a talk on corporate open source anti-patterns, vowing that I would return…

Quantifying the Performance Impact of Shard-per-core Architecture

Dor Laor

CEO of ScyllaDB

Most software isn’t architected to take advantage of modern hardware. How does a shard-per-core and shared-nothing architecture help – and…

How Netflix Builds High Performance Applications at Global Scale

Prasanna Vijayanathan

Senior Software Engineer at Netflix

We all want to build applications that are blazingly fast. We also want to scale them to users all over…

eBPF vs Sidecars

Liz Rice

Chief Open Source Officer at Isovalent

From its vantage point in the kernel, eBPF provides a platform for building a new generation of infrastructure tools for…

Taming P99 Latencies at Lyft: Tuning Low-Latency Online Feature Stores

Bhanu Renukuntla

Senior Software Engineer at Lyft

In this talk, we will explore the challenges and strategies of tuning low latency online feature stores to tame the…

Running a Go App in Kubernetes: CPU Impacts

Teiva Harsanyi

Senior Software Engineer at Google

Understanding the impacts of running a containerized Go application inside Kubernetes with a focus on the CPU.

Expanding Horizons: A Case for Rust Higher Up the Stack

Carl Lerche

Principal Engineer at AWS

Historically associated with systems programming due to its roots in Mozilla, Rust’s promise of safety, speed, and concurrency has led…

How to Improve Your Ability to Solve Complex Performance Problems

Kerry Osborne

Google Database Black Belt Team Lead at Google

This talk is really about problem solving. It’s about how we think about problems and how we resolve those problems…

Square’s Lessons Learned from Implementing a Key-Value Store with Raft

Omar Elgabry

Software Engineer at Square

To put it simply, Raft is used to make a use case (e.g., key-value store, indexing system) more fault tolerant…

Performance Budgets for the Real World

Tammy Everts

Chief Experience Officer at SpeedCurve

Performance budgets have been around for more than ten years. Over those years, we’ve learned a lot about what works,…

A Deterministic Walk Down TigerBeetle’s main() Street

Aleksei Kladov

Staff Software Engineer at TigerBeetle

Learn how to use Zig to implement a fully deterministic distributed system which will never fail with an out of…

VM Performance: The Differences Between Static Partitioning or Automatic Tuning

Dario Faggioli

Virtualization Software Engineer at SUSE

Virtualized workloads are known to require carefully crafted configuration and tuning, both at the host and at the guest level,…

Measuring the Impact of Network Latency at Twitter

Widya Salim

Data Scientist at SEEK

Victor Ma

Senior Data Scientist at Airwallex

Zhen Li

Data Scientist at TikTok

Widya Salim, Victor Ma, and Zhen Li will outline the causal impact analysis, framework, and key learnings used to quantify…

Conquering Load Balancing: Experiences from ScyllaDB Drivers

Piotr Grabowski

Software Team Leader at ScyllaDB

Load balancing seems simple on the surface, with algorithms like round-robin, but the real world loves throwing curveballs. Join me…

Low-Latency Data Access: The Required Synergy Between Memory & Disk

Kriti Kathuria

Graduate Researcher at the University of Waterloo

Analytics has moved from internal dashboards to a dashboard inside the product, providing a personalized experience for each user, be…

Distributed System Performance Troubleshooting Like You’ve Been Doing it for Twenty Years

Jon Haddad

Founder at Rustyrazorblade Consulting

Troubleshooting performance issues across distributed systems can be intimidating if you don’t know where to start, and it’s even harder…

Writing Low Latency Database Applications Even If Your Code Sucks

Glauber Costa

Founder & CEO of Turso

All latency lovers are used to the mystical experience of joy coming from aligning data to a cache line size…

Using Libtracecmd to Analyze Your Latency and Performance Troubles

Steven Rostedt

Software Engineer at Google

Trying to figure out why your application is responding late can be difficult, especially if it is because of interference…

Building Low Latency ML Systems for Real-Time Model Predictions at Xandr

Chinmay Abhay Nerurkar

Principal Engineer at Microsoft

Xandr’s Ad-server handles over 400 billion daily ad requests from across the world wide web. Operating under a stringent Service…

ORM is Bad, But is There an Alternative?

Henrietta Dombrovskaya

Database Architect at DRW

It’s a well-known fact that although the database performance is great, and each query is executed in milliseconds, the overall…

P99 Publish Performance in a Multi-Cloud NATS.io System

Derek Collison

Founder & CEO of Synadia

This talk will walk through the strategies and improvements made to the NATS server to accomplish P99 goals for persistent…

Making Python 100x Faster with Less Than 100 Lines of Rust

Ohad Ravid

Team Lead at Trigo

Python isn’t known as a low-latency language. Can we bridge the performance gap using a bit of Rust and some…

Zero Downtime Critical Traffic Migration @Netflix Scale

Abhishek Pandey

Senior Software Engineer at Meta

Picture yourself enthralled by the latest episode of your beloved Netflix series, delighting in an uninterrupted, high-definition streaming experience. Behind…

The History of Tracing Oracle

Cary Millsap

Owner and President of Method R Corporation

In this presentation, I will explore the history of tracing Oracle and why it has been overlooked despite its usefulness.…

Always-on Profiling of All Linux Threads, On-CPU and Off-CPU, with eBPF & Context Enrichment

Tanel Poder

Owner at Poder Consulting

In this session, Tanel introduces a new open source eBPF tool for efficiently sampling both on-CPU events and off-CPU events…

Cost-Effective Burst Scaling For Distributed Query Execution

Dan Harris

Principal Software Engineer at Coralogix

Building a query engine that scales efficiently is a difficult task. Queries over big datasets stored in Object Storage require…

Unconventional Methods to Identify Bottlenecks in Low-Latency and High-Throughput Data Pipelines

Zamir Paltiel

Head of Engineering at Hyperspace

In this presentation, we explore how standard profiling and monitoring methods may fall short in identifying bottlenecks in low-latency data…

Mitigating the Impact of State Management in Cloud Stream Processing Systems

Yingjun Wu

CEO of RisingWave Labs

Stream processing is a crucial component of modern data infrastructure, but constructing an efficient and scalable stream processing system can…

Practical Go Memory Profiling

William Kennedy

Managing Partner at Ardan Labs

In this talk, Bill will show you how to use benchmark profiling and compiler directives in Go to find…

Adventures in Thread-per-Core Async with Redpanda and Seastar

Travis Downs

Software Engineer at Redpanda

Thread-per-core programming models are well known in software domains where latency is important. Pinning application threads to physical cores and…

Architecting a High-Performance (Open Source) Distributed Message Queuing System in C++

Vitaly Dzhitenov

Senior Software Engineer at Bloomberg

BlazingMQ is a new open source distributed message queuing system developed at and published by Bloomberg. It provides highly-performant queues…

Noise Canceling RUM

Tim Vereecke

Web Performance Architect at Akamai

Noisy Real User Monitoring (RUM) data can ruin your P99! We introduce a fresh concept called “Human Visible Navigations” (HVN)…

Less Wasm

Piotr Sarna

Staff Software Engineer at Turso

The presentation explains why getting rid of WebAssembly is good for your latency. More specifically, it’s a short case study…

Reducing P99 Latencies with Generational ZGC

Stefan Johansson

Principal Member of Technical Staff at Oracle

With the low-latency garbage collector ZGC, GC pause times are no longer a big problem in Java. With sub-millisecond pause…

5 Hours to 7.7 Seconds: How Database Tricks Sped up Rust Linting Over 2000X

Predrag Gruevski

Independent Software Researcher at Trustfall

Linters are a type of database! They are a collection of lint rules — queries that look for rule violations…

Interaction Latency: Square’s User-Centric Mobile Performance Metric

Pierre-Yves Ricau

Android Distinguished Engineer at Block

Mobile performance metrics often take inspiration from the backend world and measure resource usage (CPU usage, memory usage, etc) and…

Chihuahua-Sized Load Tests!

Leandro Melendez

DevRel Performance Advocate at Grafana k6

Because bigger isn’t always better. Especially nowadays. Do your teams need help accommodating those humongous load tests in your agile &…

How to Avoid Learning the Linux-Kernel Memory Model

Paul McKenney

Software Engineer at Meta

The Linux-kernel memory model (LKMM) is a powerful tool for developing highly concurrent Linux-kernel code, but it also has a…

MySQL Performance on Modern CPUs: Intel vs AMD vs ARM

Peter Zaitsev

Founder of Percona

For years, CPU choice for MySQL was pretty boring – just choose which Intel-made CPU you want. In recent…

How We Reduced the Startup Time for Turo’s Android App by 77%

Pavlo Stavytskyi

Sr. Staff Software Engineer at Turo

The startup time of a mobile app is one of the most important indicators of its performance and has a…

99.99% of Your Traces are Trash

Paige Cruz

Senior Developer Advocate at Chronosphere

Distributed tracing is still finding its footing in many organizations today; one challenge to overcome is the data volume –…

High-Level Rust for Backend Programming

Adam Chalmers

Systems Engineer at KittyCAD, Inc.

Some people say you should only use Rust where you can’t afford to use garbage collection. I disagree — Rust…

A Deep Dive Into Concurrent React

Matheus Albuquerque

Senior Software Engineer, Front-End at Medallia

Writing fluid user interfaces becomes more and more challenging as the application complexity increases. In this talk, we’ll explore how…

Ingesting in Rust

Armin Ronacher

Creator of Flask and Principal Architect at Sentry

At Sentry we handle hundreds of thousands of events a second — from tiny metrics to huge memory dumps. What…

The Latency Stack: Discovering Surprising Sources of Latency

Mark Gritter

Principal Engineer at Postman

Usually, when an API call is slow, developers blame ourselves and our code. We held a lock too long, or…

Building a 10x More Efficient Edge Platform

Felipe Huici

CEO and Co-Founder of Unikraft UG

Painful cold boots, terrible auto-scale times, minutes-long waits for compute nodes to be up: these are standard headaches that cloud…

Beyond Availability: The Seven Dimensions for Data Product SLOs

Emily Gorcenski

Principal Data Scientist at Thoughtworks

In the software world, we’re used to SLOs built around latency and availability. But in the data engineering universe, there…

Peak Performance at the Edge: Running Razorpay’s High-Scale API Gateway

Jay Pathak

Software Development Engineer at Razorpay

Razorpay caters to millions of API requests every day that are non-uniform in nature. As a key provider of financial…

Segment-Based Storage vs. Partition-Based Storage: Which is Better for Real-Time Data Streaming?

David Kjerrumgaard

Developer Advocate at StreamNative

Storage is a critical component of any real-time data streaming system, and the choice of storage model can significantly affect…

HTTP 3: Moving on From TCP

Brian Sletten

President at Bosatsu Consulting, Inc.

Any network class you have taken in the last thirty years will have highlighted that the application layer depends on…

Demanding the Impossible: Rigorous Database Benchmarking

Dmitrii Dolgov

Senior Software Engineer at Red Hat

It’s easy to conduct a misleading benchmark, and notoriously hard to design a correct and rigorous enough one. Have you…

The Art of Macro Benchmarking: Evaluating Cloud Native Services Efficiency

Bartłomiej Płotka

Senior Software Engineer at Google

Benchmarking is hard, especially on a macro level that integrates multiple code components into one or multiple microservices. It’s challenging…

The Art of Event Driven Observability with OpenTelemetry

Henrik Rexed

Cloud Native Advocate at Dynatrace

Explore the various components of OpenTelemetry, examples of unuseful traces from event driven architecture, and the purpose/usage of span links…