Low-Latency Engineering Tech Talks

Browse the full library of P99 CONF tech talks and decks. Discover how experts tackle low-latency, high-performance distributed computing challenges from a wide range of perspectives.

Patterns of Low Latency

Pekka Enberg

Founder & CTO at Turso

Building for low latency is important, but the tips and tricks are often part of developer folklore and hard to…

DTrace at 21: Reflections on Fully-grown Software

Bryan Cantrill

CTO of Oxide Computer Company

Twenty-one years ago, DTrace was integrated into the operating system. By any measure, the software is now fully-grown: it…

Rust + io_uring + ktls: How Fast Can We Make HTTP?

Amos Wenger

Creator of Faster Than Lime

Working on Fluke: async Rust HTTP1+2 with io_uring & kTLS, sponsored by fly.io & Shopify. Unlike others, Fluke is built…

The Next Chapter in the Sordid Love/Hate Relationship Between DBs and OSes

Andy Pavlo

Associate Professor at Carnegie Mellon University

DBMSs struggle with OS constraints, but new tech like eBPF can change the game. Join us to explore “user-bypass” designs…

Zero-overhead Container Networking with eBPF and Netkit

Liz Rice

Chief Open Source Officer, Isovalent at Cisco

Introducing Netkit: a new eBPF enhancement replacing veth connections in container networking. Say goodbye to the overhead slowing down container…

Noisy Neighbor Detection with eBPF

Jose Fernandez

Senior Software Engineer at Netflix

Tackling “noisy neighbor” issues in multi-tenant setups! At Netflix, we use eBPF to monitor and mitigate excessive CPU usage in…

Rust: A Productive Language for Writing Database Applications

Carl Lerche

Principal Engineer at AWS

Think Rust is just about performance and safety? Let’s talk productivity. Last year, Rust’s library ecosystem needed work. What’s changed?…

Designing a Query Queue for ScyllaDB

Avi Kivity

CTO and Co-Founder of ScyllaDB

Database queries vary widely—from milliseconds to hours. Optimizing concurrency is a delicate balance of CPU, memory, and stability. Bad design…

You’re Doing It All Wrong

Michael Stonebraker

CTO & Co-founder of DBOS

Historically, business apps use a three-tier architecture. Now, cloud-native architectures and DBMS can be combined, allowing for resilient, cost-effective, and…

1BRC – Nerd Sniping the Java Community

Gunnar Morling

Principal Software Engineer at Decodable

Gunnar Morling dives into the tricks that the fastest 1BRC solutions used to process the challenge’s 13 GB input file…

Overcoming Distributed Databases Scaling Challenges with Tablets

Dor Laor

CEO of ScyllaDB

Maximizing performance goes beyond server-level tweaks. Even with low-level code, scaling requires more. In this session, learn about “tablets”—a dynamic…

The Performance Engineer’s Toolkit: A Case Study on Data Analytics with Rust

Will Crichton

Assistant Professor at Brown University

I optimized a Python data analytics pipeline, making it 180,000x faster with Rust! Using compiler optimizations, data structures, vectorization, parallelization,…

Using Sketching Technology to Optimize Services with Fewer Resources

Yichen Wei

Engineering Manager at Disney+/Hulu

Optimize your services with cost-efficient observability using high-performance sketching tools. Dive into creating sketching tech for various scenarios, making the…

Using eBPF Off-CPU Sampling to See What Your DBs are Really Waiting For

Tanel Poder

Performance Nerd at PoderC LLC

At last year’s P99 CONF, Tanel introduced using eBPF Task State Arrays to track Linux apps’ thread states/activity without built-in…

Java Heap Memory Optimization to Improve P99 Query Latency at LinkedIn Scale

Vivek Iyer Vaidyanathan

Staff Software Engineer at LinkedIn

Discover how LinkedIn optimized Apache Pinot’s performance! By using FALF Interning, a home-grown, lock-free method, they cut JVM heap usage…

Just In Time LSM Compaction

Aleksei Kladov

Staff Software Engineer at TigerBeetle

Matklad dives into the implementation of TigerBeetle’s JIT compaction algorithm for LSM, which is highly concurrent and uses all available…

Redis Alternatives Compared

Peter Zaitsev

Founder of Percona, Coroot, FerretDB

Join Peter as he dives into Redis alternatives like Valkey, DragonflyDB, and Microsoft Garnet. He’ll cover licensing, features, community support,…

Detecting Memory Leaks in Android A/B Tests: A Production-Focused Approach

Pavlo Stavytskyi

Google Developer Expert

Discover how to detect subtle memory leaks and regressions in Android apps with a production-focused approach. Learn the key metrics…

One Billion Row Challenge in Golang

Shraddha Agrawal

Senior Software Engineer, Ceph, IBM

Join us as we tackle Gunnar Morling’s One Billion Rows Challenge in Golang! We’ll walk through optimizing a 16GB file…

Taming Discard Latency Spikes

Patryk Wróbel

Software Engineer at ScyllaDB

Learned a crucial lesson on read/write latency when fixing a real ScyllaDB issue! Discover how TRIM requests impact NVMe SSDs…

Why Databases Cache, but Caches Go to Disk

Felipe Cardeneti Mendes

Technical Director at ScyllaDB

Alan Kasindorf

Founder of Cache Forge

ScyllaDB teamed up with Memcached to compare how caches and databases handle storage and memory across different scenarios. We’ll dive…

Primitive Pursuits: Slaying Latency with Low-Level Primitives and Instructions

Ravi A Giri

Senior Principal Engineer at Intel

Harshad S Sane

Principal Software Engineer at Intel

This talk showcases a methodology with examples to break down applications to low-level primitives and identify optimizations on existing compute…

How to Improve Your Ability to Solve Complex Performance Problems: Part 2

Kerry Osborne

Google Database Black Belt Team Lead at Google

In Part 2 of my P99 2023 talk, I’ll dive into practical strategies to enhance our problem-solving skills in the…

Database Drivers: Performance Perspectives

Piotr Sarna

Founding Engineer at poolside

Unlock the full potential of database drivers! Dive deep into their design, uncover how they work under the hood, and…

Low-Latency Mesh Services Using Actors

Nikita Lapkov

Senior Software Engineer

We’re transforming elfo, our Rust actor system, into a distributed mesh of services. Learn how we tackled message serialization, compression,…

Minimizing Request Latency of Self-Hosted ML Models

Julia Kroll

Applied Engineer at Deepgram

Join our session on minimizing latency in self-hosted ML models in cloud environments. Learn strategies for deploying Deepgram’s speech-to-text models…

Using Change Point Detection to Fight Noisy Benchmark Results

Matt Fleming

Co-Founder & CTO at Nyrkiö Oy

Discovering performance regressions in modern systems is tough due to inevitable noise. Change Point Detection (CPD) algorithms are gaining traction…

Enhancing P99 Latency: Strategies for Doubling/Tripling Performance in Third-Party APIs

Cristian Velazquez

Staff Site Reliability Engineer at Uber

Sharing our journey to improve P99 latency in third-party APIs. From optimizing network configs to fine-tuning connection management, we aimed…

Understanding Request Latency with Wallclock Profiling

Richard Startin

Senior Software Engineer at Datadog

Analyzing request latency is tough since it’s not always CPU-bound. Many devs give up on CPU profiling, but sampling profilers…

Fast, Secure and Dense: Finally Serverless with WebAssembly

Thorsten Hans

Sr. Cloud Advocate at Fermyon Technologies

Discover how WebAssembly is revolutionizing cloud computing. Join Thorsten Hans to learn about building serverless apps with Spin, achieving true…

Latency, Throughput & Fault Tolerance: Designing the Arroyo Streaming Engine

Micah Wylde

Co-Founder at Arroyo

Arroyo is a Rust-based, distributed stream processing engine offering millisecond-latency and high-throughput. It achieves fault tolerance and exactly-once processing via…

Get Low (Latency)

Benjamin Cane

Distinguished Engineer at American Express

Tyler Wedin

Vice President, Global Payments Network SRE at American Express

Building a real-time, low-latency card payments system is a challenge. Join the Amex Payments Network team to learn about their…

Reliable Data Replication

Cameron Morgan

Staff Infrastructure Engineer at Shopify

Data replication ensures high availability—reliable, consistent, and timely access. Dive into the tough problems often skipped: reliable backfills, schema changes,…

Scheduler Tracing With ftrace + eBPF

Jason Rahman

Principal Software Engineer at Microsoft

Dive into understanding app latency by exploring the Linux scheduler with ftrace, eBPF, and Perfetto for visualization. Uncover quirks in…

Aiding the CUDA Compiler for Fun and Profit

Joe Rowell

Founding Engineer at poolside

Get the most out of your CUDA code by understanding how the compiler works.

Building a Cloud Native LSM on Object Storage

Chris Riccomini

Creator of Materialized View

Rohan Desai

Co-Founder of Responsive

Excited to introduce SlateDB, an open-source, cloud-native storage engine. Built as an LSM on object stores like S3/GCS/ABS, it leverages…

Cheating the Cloud: 50% Savings with Compression Dictionaries

Łukasz Paszkowski

Software Engineer Team Lead at ScyllaDB

Faced with high networking costs, we tackled insufficient compression with a custom RPC compressor using ZSTD and external dictionary support.…

Internet-Scale Semantic, Structural, and Text Search in Real Time

Ash Vardanian

Founder of Unum Cloud

Discover powerful search algorithms and their SIMD- and GPU-accelerated implementations for AI-powered semantic search, structure search, or exact & fuzzy…

Writing a Kernel in Rust: Code Quality and Performance

Luc Lenôtre

Site Reliability Engineer at Clever Cloud

Maestro kernel began as a C-based school project and transitioned to Rust for better code quality. Now, it’s in a…

Running Low-Latency Workloads on Kubernetes

Jimmy Zelinskie

Co-Founder of AuthZed

Configuring Kubernetes for optimal workload performance is a continuous journey. Best practices can sometimes harm performance. Join us as we…

Distributed Async Await: A New Programming Model for the Cloud

Dominik Tornow

CEO at Resonate HQ

Dive into the future of cloud dev with Distributed Async Await. Simplify your code and conquer the chaos of distributed…

Feature Store Evolution Under Cost Constraints: When Cost is Part of the Architecture

David Malinge

Senior Staff Software Engineer at ShareChat

Ivan Burmistrov

Principal Software Engineer at ShareChat

ShareChat’s scaling ML Feature Store to handle 1B features/sec was just the start. Next challenge: cutting costs while keeping quality.…

WebAssembly on the Edge: Sandboxing AND Performance

Brian Sletten

President at Bosatsu Consulting, Inc.

Ramnivas Laddad

Co-Founder of Exograph, Inc

Moving apps to the Edge can complicate performance due to security constraints. Learn how WebAssembly bridges the gap, enabling both…

Queues, Hockey Sticks and Performance

David Collier-Brown

Staff Engineer

Queues: both a blessing and a curse in computer science. They help predict performance but also signal overload. This talk…

Taming Tail Latencies in Apache Pinot with Generational ZGC

Christopher Peck

Senior Software Engineer at Uber

Discover how Generational ZGC slashed Java app pause times in real-world use! Learn how Apache Pinot tackled scatter-gather tail latencies…

Measuring and Diagnosing Performance Shouldn’t Require Magic

Cary Millsap

Distinguished Product Manager at Oracle

Struggling with performance issues despite all green dashboards? Experts say you need special skills, but we’ll show you how to…

Remote CAD that Feels Local

Adam Chalmers

Systems Engineer at Zoo

Adam Sunderland

Lead Cloud Infrastructure Engineer at Zoo

Zoo is creating a CAD suite that runs in the cloud but feels like it’s local. How? Regional deployment, WebRTC…

Profiling your Go Service with pprof

Miriah Peterson

Lead Engineer at Soypete Tech

Optimize your Go code with the powerful pprof tool. Learn how to integrate, access, and interpret pprof metrics, plus best…

Performance Pitfalls of Rust Async Function Pointers (And Why It Might Not Matter)

Byron Wasti

Founder & CEO

An in-depth analysis of asynchronous function pointers in Rust, why they aren’t a real thing (compared to normal function pointers)…

Elevating PostgreSQL: Benchmarking Vector Search Performance

Daniel Seybold

Co-Founder at benchANT

PostgreSQL continues to evolve with vector search extensions like pgvector and pgvecto.rs. We’ll explore recent benchmarks comparing vector search performance…

Sight Beyond Sight: See it All Through Observability

Leandro Melendez

Developer Advocate at Grafana Labs

Observability is more than metrics and logs—it’s knowing your system’s status without checking under the hood. From QA processes to…

Time-Series and Analytical Databases Walk Into a Bar…

Andrei Pechkurov

Core Engineer at QuestDB

In this talk, we share our journey in making QuestDB, an open-source time-series database, a much faster analytical database, featuring…

Profile-Guided Optimization (PGO): (Ab)using it for Fun and Profit

Aliaksandr Zaitsau

Solution Architect

Discover how to boost your software with lesser-known compiler flags and Profile-Guided Optimization (PGO). Learn what PGO is, how it…

How a Failed Experiment Helped Me Understand the Go Runtime in More Depth

Aadhav Vignesh

Software Engineer

In 2022, I began crafting a tool to visualize Go’s GC in real-time. I’ll dive into the hurdles of extracting…

What C and C++ Can Do and When Do You Need Assembly?

Alexander Krizhanovsky

CEO at Tempesta Technologies

Join us to dive into GCC and Clang optimizations for C/C++! We’ll explore how x86-64 executes code, use assembly for…

Low Latency Gal Presents: Low Latency Stuff

Sonia Kolasinska

Low Latency Gal

Lock-free programming and precise ultra low latency pipelining between CPU cores.

Cache Me If You Can: How Grafana Labs Scaled Up Their Memcached 42x & Cut Costs Too

Danny Kopping

Senior Software Engineer at Grafana Labs

Our cloud database stores billions of files in object storage. With petabytes of data being queried every day, we started…

High Performance on a Low Budget

Gwen Shapira

Co-founder & CPO of Nile

It is one thing to solve performance challenges when you have plenty of time, money, and expertise available. Many performance…

From 1M to 1B Features Per Second: Scaling ShareChat’s ML Feature Store

Andrei Manakov

Senior Staff Software Engineer at ShareChat

Ivan Burmistrov

Principal Software Engineer at ShareChat

ShareChat’s Ivan Burmistrov and Andrei Manakov walk through how they built a low latency ML Feature Store based on ScyllaDB which…

Corporate Open Source Anti-Patterns: A Decade Later

Bryan Cantrill

CTO of Oxide Computer Company

A little over a decade ago, I gave a talk on corporate open source anti-patterns, vowing that I would return…

Quantifying the Performance Impact of Shard-per-core Architecture

Dor Laor

CEO of ScyllaDB

Most software isn’t architected to take advantage of modern hardware. How does a shard-per-core and shared-nothing architecture help – and…

How Netflix Builds High Performance Applications at Global Scale

Prasanna Vijayanathan

Senior Software Engineer at Netflix

We all want to build applications that are blazingly fast. We also want to scale them to users all over…

eBPF vs Sidecars

Liz Rice

Chief Open Source Officer, Isovalent at Cisco

From its vantage point in the kernel, eBPF provides a platform for building a new generation of infrastructure tools for…

Taming P99 Latencies at Lyft: Tuning Low-Latency Online Feature Stores

Bhanu Renukuntla

Senior Software Engineer at Lyft

In this talk, we will explore the challenges and strategies of tuning low latency online feature stores to tame the…

Running a Go App in Kubernetes: CPU Impacts

Teiva Harsanyi

Senior Software Engineer at Google

Understanding the impacts of running a containerized Go application inside Kubernetes with a focus on the CPU.

Expanding Horizons: A Case for Rust Higher Up the Stack

Carl Lerche

Principal Engineer at AWS

Historically associated with systems programming due to its roots in Mozilla, Rust’s promise of safety, speed, and concurrency has led…

How to Improve Your Ability to Solve Complex Performance Problems

Kerry Osborne

Google Database Black Belt Team Lead at Google

This talk is really about problem solving. It’s about how we think about problems and how we resolve those problems…

Square’s Lessons Learned from Implementing a Key-Value Store with Raft

Omar Elgabry

Software Engineer at Square

To put it simply, Raft is used to make a use case (e.g., key-value store, indexing system) more fault tolerant…

Performance Budgets for the Real World

Tammy Everts

Chief Experience Officer at SpeedCurve

Performance budgets have been around for more than ten years. Over those years, we’ve learned a lot about what works,…

A Deterministic Walk Down TigerBeetle’s main() Street

Aleksei Kladov

Staff Software Engineer at TigerBeetle

Learn how to use Zig to implement a fully deterministic distributed system which will never fail with an out of…

VM Performance: The Differences Between Static Partitioning or Automatic Tuning

Dario Faggioli

Virtualization Software Engineer at SUSE

Virtualized workloads are known to require carefully crafted configuration and tuning, both at the host and at the guest level,…

Measuring the Impact of Network Latency at Twitter

Widya Salim

Data Scientist at SEEK

Victor Ma

Senior Data Scientist at Airwallex

Zhen Li

Data Scientist at TikTok

Widya Salim, Victor Ma, and Zhen Li will outline the causal impact analysis, framework, and key learnings used to quantify…

Conquering Load Balancing: Experiences from ScyllaDB Drivers

Piotr Grabowski

Software Team Leader at ScyllaDB

Load balancing seems simple on the surface, with algorithms like round-robin, but the real world loves throwing curveballs. Join me…

Low-Latency Data Access: The Required Synergy Between Memory & Disk

Kriti Kathuria

Graduate Researcher at the University of Waterloo

Analytics has moved from internal dashboards to a dashboard inside the product, providing a personalized experience for each user, be…

Distributed System Performance Troubleshooting Like You’ve Been Doing it for Twenty Years

Jon Haddad

Founder at Rustyrazorblade Consulting

Troubleshooting performance issues across distributed systems can be intimidating if you don’t know where to start, and it’s even harder…

Writing Low Latency Database Applications Even If Your Code Sucks

Glauber Costa

Founder & CEO of Turso

All latency lovers are used to the mystical experience of joy coming from aligning data to a cache line size…

Using Libtracecmd to Analyze Your Latency and Performance Troubles

Steven Rostedt

Software Engineer at Google

Trying to figure out why your application is responding late can be difficult, especially if it is because of interference…

Building Low Latency ML Systems for Real-Time Model Predictions at Xandr

Chinmay Abhay Nerurkar

Principal Engineer at Microsoft

Xandr’s Ad-server handles over 400 billion daily ad requests from across the world wide web. Operating under a stringent Service…

ORM is Bad, But is There an Alternative?

Henrietta Dombrovskaya

Database Architect at DRW

It’s a well-known fact that although the database performance is great, and each query is executed in milliseconds, the overall…

P99 Publish Performance in a Multi-Cloud NATS.io System

Derek Collison

Founder & CEO of Synadia

This talk will walk through the strategies and improvements made to the NATS server to accomplish P99 goals for persistent…

Making Python 100x Faster with Less Than 100 Lines of Rust

Ohad Ravid

Team Lead at Trigo

Python isn’t known as a low-latency language. Can we bridge the performance gap using a bit of Rust and some…

Zero Downtime Critical Traffic Migration @Netflix Scale

Abhishek Pandey

Senior Software Engineer at Meta

Picture yourself enthralled by the latest episode of your beloved Netflix series, delighting in an uninterrupted, high-definition streaming experience. Behind…

The History of Tracing Oracle

Cary Millsap

Distinguished Product Manager at Oracle

In this presentation, I will explore the history of tracing Oracle and why it has been overlooked despite its usefulness.…

Always-on Profiling of All Linux Threads, On-CPU and Off-CPU, with eBPF & Context Enrichment

Tanel Poder

Performance Nerd at PoderC LLC

In this session, Tanel introduces a new open source eBPF tool for efficiently sampling both on-CPU events and off-CPU events…

Cost-Effective Burst Scaling For Distributed Query Execution

Dan Harris

Principal Software Engineer at Coralogix

Building a query engine that scales efficiently is a difficult task. Queries over big datasets stored in Object Storage require…

Unconventional Methods to Identify Bottlenecks in Low-Latency and High-Throughput Data Pipelines

Zamir Paltiel

Head of Engineering at Hyperspace

In this presentation, we explore how standard profiling and monitoring methods may fall short in identifying bottlenecks in low-latency data…

Mitigating the Impact of State Management in Cloud Stream Processing Systems

Yingjun Wu

CEO of RisingWave Labs

Stream processing is a crucial component of modern data infrastructure, but constructing an efficient and scalable stream processing system can…

Practical Go Memory Profiling

William Kennedy

Managing Partner at Ardan Labs

In this talk, Bill will show you how to use benchmark profiling and compiler directives in Go to find…

Adventures in Thread-per-Core Async with Redpanda and Seastar

Travis Downs

Software Engineer at Redpanda

Thread-per-core programming models are well known in software domains where latency is important. Pinning application threads to physical cores and…

Architecting a High-Performance (Open Source) Distributed Message Queuing System in C++

Vitaly Dzhitenov

Senior Software Engineer at Bloomberg

BlazingMQ is a new open source distributed message queuing system developed at and published by Bloomberg. It provides highly-performant queues…

Noise Canceling RUM

Tim Vereecke

Web Performance Architect at Akamai

Noisy Real User Monitoring (RUM) data can ruin your P99! We introduce a fresh concept called “Human Visible Navigations” (HVN)…

Less Wasm

Piotr Sarna

Founding Engineer at poolside

The presentation explains why getting rid of WebAssembly is good for your latency. More specifically, it’s a short case study…

Reducing P99 Latencies with Generational ZGC

Stefan Johansson

Principal Member of Technical Staff at Oracle

With the low-latency garbage collector ZGC, GC pause times are no longer a big problem in Java. With sub-millisecond pause…

5 Hours to 7.7 Seconds: How Database Tricks Sped up Rust Linting Over 2000X

Predrag Gruevski

Independent Software Researcher at Trustfall

Linters are a type of database! They are a collection of lint rules — queries that look for rule violations…

Interaction Latency: Square’s User-Centric Mobile Performance Metric

Pierre-Yves Ricau

Android Distinguished Engineer at Block

Mobile performance metrics often take inspiration from the backend world and measure resource usage (CPU usage, memory usage, etc) and…

Chihuahua-Sized Load Tests!

Leandro Melendez

Developer Advocate at Grafana Labs

Because bigger isn’t always better. Especially nowadays. Do your teams need help accommodating those humongous load tests in your agile &…

How to Avoid Learning the Linux-Kernel Memory Model

Paul McKenney

Software Engineer at Meta

The Linux-kernel memory model (LKMM) is a powerful tool for developing highly concurrent Linux-kernel code, but it also has a…

MySQL Performance on Modern CPUs: Intel vs AMD vs ARM

Peter Zaitsev

Founder of Percona, Coroot, FerretDB

For years CPU choice for MySQL was pretty boring – just choose which Intel-made CPU you want. In recent…

How We Reduced the Startup Time for Turo’s Android App by 77%

Pavlo Stavytskyi

Google Developer Expert

The startup time of a mobile app is one of the most important indicators of its performance and has a…

99.99% of Your Traces are Trash

Paige Cruz

Senior Developer Advocate at Chronosphere

Distributed tracing is still finding its footing in many organizations today. One challenge to overcome is the data volume –…

High-Level Rust for Backend Programming

Adam Chalmers

Systems Engineer at Zoo

Some people say you should only use Rust where you can’t afford to use garbage collection. I disagree — Rust…

A Deep Dive Into Concurrent React

Matheus Albuquerque

Senior Software Engineer, Front-End at Medallia

Writing fluid user interfaces becomes more and more challenging as the application complexity increases. In this talk, we’ll explore how…

Ingesting in Rust

Armin Ronacher

Creator of Flask and Principal Architect at Sentry

At Sentry we handle hundreds of thousands of events a second — from tiny metrics to huge memory dumps. What…

The Latency Stack: Discovering Surprising Sources of Latency

Mark Gritter

Principal Engineer at Postman

Usually, when an API call is slow, developers blame ourselves and our code. We held a lock too long, or…

Building a 10x More Efficient Edge Platform

Felipe Huici

CEO and Co-Founder of Unikraft UG

Painful cold boots, terrible auto-scale times, minutes-long waits for compute nodes to be up: these are standard headaches that cloud…

Beyond Availability: The Seven Dimensions for Data Product SLOs

Emily Gorcenski

Principal Data Scientist at Thoughtworks

In the software world, we’re used to SLOs built around latency and availability. But in the data engineering universe, there…

Peak Performance at the Edge: Running Razorpay’s High-Scale API Gateway

Jay Pathak

Software Development Engineer at Razorpay

Razorpay caters to millions of API requests every day that are non-uniform in nature. As a key provider of financial…

Segment-Based Storage vs. Partition-Based Storage: Which is Better for Real-Time Data Streaming?

David Kjerrumgaard

Developer Advocate at StreamNative

Storage is a critical component of any real-time data streaming system, and the choice of storage model can significantly affect…

HTTP 3: Moving on From TCP

Brian Sletten

President at Bosatsu Consulting, Inc.

Any network class you have taken in the last thirty years will have highlighted that the application layer depends on…

Demanding the Impossible: Rigorous Database Benchmarking

Dmitrii Dolgov

Senior Software Engineer at Red Hat

It’s easy to conduct a misleading benchmark, and notoriously hard to design a correct and rigorous enough one. Have you…

Misery Metrics & Consequences

Gil Tene

CTO and Co-Founder of Azul Systems

Join Azul Systems’ Gil Tene as he defines “misery metrics,” which describe what happens when our production systems are operating…

Sharpening the Axe: The Primacy of Toolmaking

Bryan Cantrill

CTO of Oxide Computer Company

Oxide’s Bryan Cantrill weighs in on allowing engineers to make their own tools, resulting in better systems delivered faster and…

The Art of Macro Benchmarking: Evaluating Cloud Native Services Efficiency

Bartłomiej Płotka

Senior Software Engineer at Google

Benchmarking is hard, especially on a macro level that integrates multiple code components into one or multiple microservices. It’s challenging…

The Art of Event Driven Observability with OpenTelemetry

Henrik Rexed

Cloud Native Advocate at Dynatrace

Explore the various components of OpenTelemetry, examples of unuseful traces from event driven architecture, and the purpose/usage of span links…

P99 Pursuit: 8 Years of Battling P99 Latency

Dor Laor

CEO of ScyllaDB

ScyllaDB CEO Dor Laor covers principles for successful OSS projects like ScyllaDB, KVM, the Linux kernel and why they spurred…

From SLO to GOTY

Charity Majors

CTO of Honeycomb

Charity Majors shares the performance lessons we can all learn from game developers, who were among the first to run…

Linux Kernel vs DPDK: HTTP Performance Showdown

Marc Richards

Performance Engineer at Amazon Web Services

AWS’ Marc Richards uses an HTTP benchmark to compare performance of the Linux kernel networking stack with userspace networking doing…

Overcoming Variable Payloads to Optimize for Performance

Armin Ronacher

Creator of Flask and Principal Architect at Sentry

Hear from Sentry’s Armin Ronacher, creator of the Flask framework for Python, on how to optimize for performance when you…

Using eBPF for High-Performance Networking in Cilium

Liz Rice

Chief Open Source Officer, Isovalent at Cisco

Isovalent’s Liz Rice shows how and why Cilium bypasses the kernel using eBPF for Kubernetes and container orchestration networking, observability…

High-speed Database Throughput Using Apache Arrow Flight SQL

Kyle Porter

Architect at Dremio

James Duong

Architect at Dremio

Kyle Porter and James Duong of Bit Quill Technologies share how Flight SQL can push SQL query throughput beyond existing…

Square Engineering’s “Fail Fast, Retry Soon” Performance Optimization Technique

Omar Elgabry

Software Engineer at Square

Learn how to build resilient systems, reduce failure rates, and improve application latency by employing one of the techniques in…

Clouds are Not Free: Guide to Observability-Driven Efficiency Optimizations

Bartłomiej Płotka

Senior Software Engineer at Google

Red Hat’s Bartłomiej Płotka explains how to find and uncover efficiency problems effectively using the power of modern cloud-native observability…

How a Database Looks from a Disk’s Perspective

Avi Kivity

CTO and Co-Founder of ScyllaDB

ScyllaDB’s CTO Avi Kivity dives into how high performance distributed systems such as modern databases can make best, most efficient…

Measuring the CPU Performance of Android Apps at Lyft

Pavlo Stavytskyi

Google Developer Expert

Hear from Pavlo Stavytskyi on how Lyft measures CPU load to improve app performance. What metrics they collect, plus how…

Speed Up Your Code Through Asynchronous Programming

Sabina Smajlaj

Operations Developer at Hudson River Trading

Hudson River Trading’s Sabina Smajlaj demonstrates how to take advantage of programming languages’ asynchronous libraries with a few minor tweaks…

Analyze Virtual Machine Overhead Compared to Bare Metal with Tracing

Steven Rostedt

Software Engineer at Google

Google’s Steve Rostedt discusses using tracing to analyze when the overhead from a Linux host running KVM is higher than…

A New IO Scheduler Algorithm for Mixed Workloads

Pavel Emelyanov

Principal Software Engineer, ScyllaDB

Discover how ScyllaDB, built on the highly asynchronous Seastar library, implemented an IO scheduler optimized for peak performance on modern…

Large-Scale, Semi-Automated Go Garbage Collection Tuning at Uber

Cristian Velazquez

Staff Site Reliability Engineer at Uber

Uber’s Cristian Velazquez talks about tuning garbage collection for Go to scale applications across 70,000 cores to maintain 30 mission-critical…

Why User-Mode Threads Are Good for Performance

Ron Pressler

Project Loom Technical Lead, Java Platform Group at Oracle

Hear from Oracle’s Ron Pressler how Java added virtual threads, an implementation of user-mode threads, to help write high-throughput servers.

Hardware Assisted Latency Investigations

Kshitij Doshi

Senior Principal Engineer, Intel Corporation

Harshad S Sane

Principal Software Engineer at Intel

Intel’s Harshad S Sane & Kshitij Doshi share new ways to use eBPF to better examine latency excursions.

Continuous Performance from Load Testing to SRE and Beyond

Leandro Melendez

Developer Advocate at Grafana Labs

Grafana k6’s Leandro Melendez explores how to use continuous methodologies, service structures, microservice tiers, cloud, and elasticity.

The Observant Developer — Continuous Feedback with OpenTelemetry

Roni Dover

CTO of Digma

Roni Dover shares practical ways that OpenTelemetry combined with open-source tools can be integrated into the modern development stack.

End-To-End Performance Testing, Profiling, and Analysis at Redis

Filipe Oliveira

Principal Performance Engineer at Redis

Learn how Redis developed an automated framework for performance regression testing, telemetry gathering, profiling, and data visualization upon code commit.

Keeping Latency Low for User-Defined Functions with WebAssembly

Piotr Sarna

Founding Engineer at poolside

Piotr Sarna describes how to integrate WebAssembly and Wasmtime into a C++ project in a latency-friendly manner by implementing UDFs…

Evaluating Performance In Go

William Kennedy

Managing Partner at Ardan Labs

William Kennedy provides a deep dive training on how to optimize Go’s concurrency and garbage collection.

How We Reduced Performance Tuning Time by Orders of Magnitude with Database Observability

Yuying Song

Database Performance Engineer at PingCAP

PingCAP’s Database Performance Engineer Yuying Song will share how to measure latency in a distributed system using a top-down (holistic) approach,…

Implementing Highly Performant Distributed Aggregates

Michal Jadwiszczak

Software Engineer at ScyllaDB

ScyllaDB’s Michał Jadwiszczak explains how you can implement aggregate functions without hammering real-time availability and performance for other read/write operations.

Ultra-Low-Latency Web Rendering on the Edge

Malte Ubl

Chief Architect at Vercel

Vercel’s Malte Ubl will discuss the trade-offs of the new paradigm of rendering web pages on the edge, and look…

A Deep Dive into Query Performance

Peter Zaitsev

Founder of Percona, Coroot, FerretDB

Percona’s Peter Zaitsev explores overlooked and underappreciated ways to successfully establish a connection and get results to the queries promptly…

How Dashtable Helps Dragonfly Maintain Low Latency

Roman Gershman

Co-Founder of DragonflyDB

Roman Gershman explains how Dragonfly’s hashtable implementation helps to keep its tail latency in check, including a look at…

Fast and Fault Tolerant

Michael Barker

Independent Consultant at Ephemeris Consulting

Michael Barker draws on knowledge from working on financial exchanges, messaging and clustering systems to describe a model that can…

Taming Go’s Memory Usage — and Avoiding a Rust Rewrite

Mark Gritter

Principal Engineer at Postman

Akita’s Mark Gritter goes against the current trends and describes why he and his team stuck with Golang and chose…

Tracking Syscall and Function Latency in Your k8s Cluster with eBPF

Matthew Lenhard

CTO of ContainIQ

ContainIQ’s Matthew Lenhard walks the audience through a real-life performance tuning exercise, where we hunt down slow system calls…

Outrageous Performance: RageDB’s Experience with the Seastar Framework

Max De Marzi Jr.

Developer at RageDB

Learn how RageDB leveraged the Seastar framework to build an outrageously fast graph database in this talk by Max De…

Pitfalls in Writing High-Performance Systems in Rust

Marek Galovic

Staff Software Engineer at Pinecone

Pinecone’s Marek Galovic looks at common and maybe not so common pitfalls in writing high-performance distributed systems in Rust.

Why Kubernetes Freedom Requires Chaos Engineering to Shine in Production

Henrik Rexed

Cloud Native Advocate at Dynatrace

Dynatrace’s Henrik Rexed uses production methods and Kubernetes settings useful to avoid outages, from chaos engineering, to observability and load…

Testing Persistent Storage Performance in Kubernetes with Sherlock

Sagy Volkov

Distinguished Performance Architect at Lightbits Labs

Lightbits Labs’ Sagy Volkov demonstrates how to use Sherlock, an open source platform written to test persistent NVMe/TCP storage in…

Aggregator Leaf Tailer: Bringing Data to Your Users with Ultra Low Latency

Jeffery Utter

Staff Software Developer at theScore

Discover how and why theScore built Datadex, an aggregator leaf tailer system built for geographically distributed, low-latency queries and real-time…

Properly Understanding Latency is Hard — What We Learned When We Did it Correctly

Brian Taylor

Principal Software Engineer at Optimizely

Optimizely’s Brian Taylor applies lessons of Gil Tene’s coordinated omission talk to understand the surprising sources of latency found in…

Measuring P99 Latency in Event-Driven Architectures with OpenTelemetry

Antón Rodríguez

Principal Software Engineer at New Relic

New Relic’s Antón Rodríguez shows how Event-Driven Architectures can instrument apps using vendor-neutral APIs, libraries, and tools via OpenTelemetry.

C# as a System Language

Oren Eini

Founder & CEO of RavenDB

RavenDB’s Oren Eini discusses the features that make C# a viable system language for building high-end systems.

Retaining Goodput with Query Rate Limiting

Piotr Dulikowski

Senior Software Engineer, ScyllaDB

ScyllaDB’s Piotr Dulikowski walks through how they tackled a “hot partition” problem: a single partition accessed with disproportionate frequency that…

Improving Performance of Micro-Frontend Applications through Error Monitoring

Garrett Hamelin

Developer Advocate at Airbrake, a LogicMonitor Company

Airbrake’s Garrett Hamelin walks you through some of the dos and don’ts for trying to reduce errors and improve performance…

It’s Time to Debloat the Cloud with Unikraft

Felipe Huici

CEO and Co-Founder of Unikraft UG

Felipe Huici introduces Unikraft, a cloud operating system that allows for easily building fully-tailored cloud-ready images that boot in a…

Building Efficient Multi-Threaded Filters for Faster SQL Queries

Vlad Ilyushchenko

Co-Founder and CTO at QuestDB

QuestDB’s Vlad Ilyushchenko will describe how they optimized their database performance using efficient zero garbage collection multithreaded query processing.

Performance Insights Into eBPF, Step by Step

Dmitrii Dolgov

Senior Software Engineer at Red Hat

Red Hat’s Dmitrii Dolgov sheds light on using eBPF: how to collect execution metrics, profile programs, and common pitfalls to…

cachegrand: A Take on High Performance Caching

Daniele Salvatore Albano

Senior Software Engineer II at Microsoft

Microsoft’s Daniele Salvatore Albano presents cachegrand, a SIMD-accelerated hashtable without locks or busy-wait loops using fibers, io_uring, and much more.

Throw Away Your Nines

Alex Hidalgo

Principal Reliability Advocate at Nobl9

You may encounter problems if you only think about “nines” when setting service reliability targets. Throw away your nines. Let’s find…

The Role of Machine Learning In Cloud Native Performance Optimization

Brian Likosar

Global Director of Solutions Architecture at StormForge

StormForge’s Brian Likosar shows how machine learning can be used to optimally configure apps deployed in Kubernetes to ensure performance…

Capturing NIC and Kernel TX and RX Timestamps for Packets in Go

Blain Smith

Staff Software Engineer at Rocket Science

Rocket Science’s Blain Smith shows how to get better timestamp granularity from the NIC by directly sending and capturing data…

Cutting Through the Fog of Virtualization

Bernd Bandemer

Head of Data Science at Clockwork Systems Inc.

Clockwork Systems’ Bernd Bandemer details causes of cloud network latency, from its underlying infrastructure, to its physical topology and network…

Optimizing Servers for High-Throughput and Low-Latency at Dropbox

Alexey Ivanov

Software Engineer at Dapper Labs

Dapper Labs’ Alexey Ivanov explores layers of efficiency/performance optimizations from hardware, drivers, Linux kernel, library and application-level tunings.

Removing Implicit Deadlocks on a Thread-per-core Architecture with 2-phase Processing

Alex Gallego

CEO and Founder of Redpanda

Redpanda’s Alex Gallego will show how implicit limitations in asynchronous programming can be addressed by a 2-phase technique for resolving…

Apache Iceberg: An Architectural Look Under the Covers

Alex Merced

Developer Advocate at Dremio

Alex Merced, Developer Advocate at Dremio, describes the open data lakehouse architecture and performance-oriented capabilities of Apache Iceberg.

Three Perspectives on Measuring Latency

Geoffrey Beausire

Senior Site Reliability Engineer at Criteo

Discover from Criteo’s Geoffrey Beausire how to measure latency in key-value infrastructure from both server and client sides, as well…

Continuous Performance Regression Testing with JfrUnit

Gunnar Morling

Principal Software Engineer at Decodable

Gunnar Morling (Red Hat) explains how to use JfrUnit to track metrics that could impact application performance.

Realtime Indexing for Fast Queries on Massive Semi-Structured Data

Dhruba Borthakur

CTO of Rockset

Dhruba Borthakur (Rockset) explains how to combine lightweight transactions with real-time analytics to power a user-facing application.

OSNoise Tracer: Who Is Stealing My CPU Time?

Daniel Bristot de Oliveira

Principal Software Engineer at Red Hat

Daniel Bristot de Oliveira (Red Hat) explores operating system noise (the interference experienced by an application due to activities inside…

OSv Unikernel — Optimizing Guest OS to Run Stateless and Serverless Apps in the Cloud

Waldek Kozaczuk

OSv Committer

Waldek Kozaczuk talks about optimizing a guest OS to run stateless and serverless apps in the cloud for CNN’s video…

New Ways to Find Latency in Linux Using Tracing

Steven Rostedt

Software Engineer at Google

Steven Rostedt dives into new flexible and dynamic aspects of ftrace that can help expose latency issues.

How to Measure Latency

Heinrich Hartmann

Principal Engineer at Zalando

Heinrich Hartmann (Zalando) shares strategies for avoiding pitfalls with collecting, aggregating and analyzing latency data for monitoring and benchmarking.

Rust Is Safe. But Is It Fast?

Glauber Costa

Founder & CEO of Turso

Glauber Costa outlines pitfalls and best practices for developing Rust applications with low P99.

G1: To Infinity and Beyond

Stefan Johansson

Principal Member of Technical Staff at Oracle

Stefan Johansson (Oracle) provides insights on the G1 JVM garbage collector — what’s new, how it impacts performance, and what’s…

I/O Rings and You — Optimizing I/O on Windows

Yarden Shafir

Software Engineer at Crowdstrike

Yarden Shafir (Crowdstrike) introduces Windows’ implementation of I/O rings, demonstrating how it’s used, and discusses potential future additions.

Data Structures for High Resolution, Real-time Telemetry at Scale

Filipe Oliveira

Performance Engineer at Redis

Filipe Oliveira (Redis) explains how to use several OSS data structures to incorporate telemetry features at scale… and why they…

Scaling Apache Pulsar to 10 Petabytes/Day

Karthik Ramasamy

Senior Director of Engineering at Splunk

Karthik Ramasamy (Splunk) demonstrates how data, including logs and metrics, can be processed at scale and speed with…

RISC-V on Edge: Porting EVE and Alpine Linux to RISC-V

Kathy Giori

Ecosystem Engagement Lead at ZEDEDA

Roman Shaposhnik

Co-Founder of ZEDEDA Inc.

Roman and Kathy share their experience porting Alpine Linux and LF Edge EVE-OS to the new RISC-V architecture.

Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance

Marc Richards

Performance Engineer at Talawah Solutions

Marc Richards shares the performance tuning steps that he took to serve 1.2M JSON requests per second from a 4…

Is It Faster to Go with Redpanda Transactions than Without Them?!

Denis Rystsov

Staff Engineer at Vectorized

Denis Rystsov shares how Redpanda optimized the Kafka API and pushed throughput of distributed transactions up to 8X beyond an…

Crimson: Ceph for the Age of NVMe and Persistent Memory

Orit Wasserman

Architect at Red Hat

Orit Wasserman (Red Hat) talks about implementing Seastar, a highly asynchronous engine as a new foundation for the Ceph distributed…

Performance Analysis and Troubleshooting Methodologies for Databases

Peter Zaitsev

CEO and Co-Founder of Percona

Peter Zaitsev (Percona) presents three performance analysis approaches and explains the best use cases for each.

Seastore: Next Generation Backing Store for Ceph

Sam Just

Senior Principal Software Engineer at Red Hat

Sam Just (Red Hat) shares how they architected their next-generation distributed file system to take advantage of emerging storage technologies…

Object Compaction in Cloud for High Yield

Tejas Chopra

Senior Software Engineer at Netflix

Tejas Chopra shares how Netflix gets massive volumes of media assets and metadata to the cloud fast and cost-efficiently.

Where Did All These Cycles Go?

Thomas Dullien

CEO of optimyze.cloud Inc.

Thomas Dullien (Optimyze.cloud) exposes all the hidden places where you can recover your wasted CPU resources.

Get Lower Latency and Higher Throughput for Java Applications

Simon Ritter

Deputy CTO at Azul Systems

Simon Ritter (Azul Systems) offers strategies for hitting p99 SLAs in Java — despite the various challenges presented by the…

What We Need to Unlearn about Persistent Storage

Pavel Emelyanov

Principal Software Engineer, ScyllaDB

Pavel Emelyanov (ScyllaDB) talks about ways to measure the performance of modern hardware and what it all means for database…

Avoiding Data Hotspots at Scale

Konstantin Osipov

Director of Software Engineering at ScyllaDB

Konstantin Osipov (ScyllaDB) addresses the tradeoffs between hash and range-based sharding.

Keeping Latency Low and Throughput High with Application-level Priority Management

Avi Kivity

CTO and Co-Founder of ScyllaDB

ScyllaDB CTO and co-founder Avi Kivity shows how high throughput and low latency can both be achieved in a single…

Using eBPF to Measure the k8s Cluster Health

Henrik Rexed

Cloud Native Advocate at Dynatrace

Henrik Rexed (Dynatrace) explains how to use Prometheus + eBPF to understand the inner behavior of Kubernetes clusters and workloads…

Continuous Go Profiling & Observability

Felix Geisendörfer

Staff Engineer at Datadog

Felix Geisendörfer (Datadog) digs into the unique aspects of the Go runtime and interoperability with tools like Linux perf and…

Unikraft: Fast, Specialized Unikernels the Easy Way

Felipe Huici

Chief Researcher at NEC Europe Laboratories GmbH

Felipe Huici (NEC Laboratories Europe) showcases the utility and design of UnikraftSDK.

Understanding Apache Kafka P99 Latency at Scale

Pere Urbón-Bayes

Senior Solutions Architect at Confluent

Pere Urbón-Bayes (Confluent) presents strategies for measuring, evaluating, and optimizing the performance of an Apache Kafka-based infrastructure.

High-Performance Networking Using eBPF, XDP, and io_uring

Bryan McCoid

Sr. Distributed Systems Engineer, Couchbase Inc.

Bryan McCoid outlines the ins and outs of Linux kernel tools such as io_uring, eBPF, and AF_XDP and how to…

DB Latency Using DRAM + PMem in App Direct & Memory Modes

Doug Hood

Consulting Member of Technical Staff at Oracle

Doug Hood (Oracle) compares the latency of DDR4 DRAM to that of Intel Optane Persistent Memory for in-memory database access.

Rust, Wright’s Law, and the Future of Low-Latency Systems

Bryan Cantrill

CTO of Oxide Computer Company

Bryan Cantrill discusses the rise of Rust-based systems and the ceding of Moore’s Law to Wright’s Law, and explains why…

Whoops! I Rewrote It in Rust

Brian Martin

Software Engineer at Twitter

Why and how Brian Martin rewrote Pelikan, Twitter’s open source and modular framework for in-memory caching, in Rust.

Let’s Fix Logging Once and for All

Peter Portante

Senior Principal Software Engineer at Red Hat

Peter Portante (Red Hat) presents a Linux kernel modification that gives the SRE and logging source owner greater control over…

Using SLOs for Continuous Performance Optimizations of Your k8s Workloads

Andreas Grabner

DevOps Activist at Dynatrace

Andreas Grabner (Dynatrace) shares how to use the CNCF Keptn project to automate SLO-based Performance Analysis as part of your…

Vanquishing Latency Outliers in the Lightbits LightOS Software Defined Storage System

Abel Gordon

Chief Systems Architect at Lightbits Labs

Abel Gordon gives an overview of how Lightbits LightOS improves the latency of high-performance, low-latency NVMe-based storage accessed over standard…