Virtual Event | OCTOBER 22 + 23, 2025

All Things Performance

The event for developers who care about high-performance, low-latency applications

Browse our library of talks on low-latency engineering strategies.

ChatGPT Ain’t Got $%@& On Me!

Andy Pavlo

Associate Professor at Carnegie Mellon University
This talk presents research on a new generation of autonomous tuning agents that optimize more parts of a database in…

ClickHouse’s C++ & Rust Journey

Alexey Milovidov

CTO at ClickHouse, Inc.
A full rewrite from C++ to Rust or gradual integration with Rust libraries? For a large C++ codebase, only the…

LLM Inference Optimization

Chip Huyen

LLM Inference Optimization
This talk will discuss why LLM inference is slow and key latency metrics. It also covers techniques that make LLM…

The Gory Details of a Full-Featured Userspace CPU Scheduler

Avi Kivity

CTO and Co-Founder of ScyllaDB
Userspace CPU schedulers, which often accompany asynchronous I/O engines like io_uring and Linux AIO, are usually simplistic run-to-completion FIFO loops.…

Building Planet-Scale Streaming Apps: Proven Strategies with Apache Flink

Sanchay Javeria

Senior Software Engineer at Pinterest
At Pinterest, we use Apache Flink for streaming applications, powering various use cases like real-time metrics reporting for ads from…

P99 & Me

Dor Laor

CEO of ScyllaDB
At ScyllaDB, P99 is part of our DNA. Predictable low latency has been a hard requirement ever since we launched…

Migrating from Postgres to ScyllaDB for Handling a Billion Rows

Abdurohman Abdurohman

Engineering Manager at tiket.com
At tiket.com, managing billions of hotel inventory rows pushed PostgreSQL to its limits, with write delays of up to four…

Performance Insights Beyond P99: Tales from the Long Tail

Rachel Stephens

Research Director at RedMonk

Adrian Cockcroft

Tech Advisor at Nubank
“Beyond the P99” moments are the rare, unpredictable outliers that disproportionately affect performance, reliability, and user experience. In this session,…

Let’s Do a Lot of Fuzzing in the Cloud

Alex Pshenichkin

Principal Engineer, Tech Lead at Antithesis
How do you simulate fast enough to do DST? Performance is a real hurdle to successfully implementing deterministic simulation, because…

As Fast as Possible, But Not Faster: ScyllaDB Flow Control

Nadav Har’El

Distinguished Engineer at ScyllaDB
Pushing requests faster than a system can handle results in rapidly growing queues. If unchecked, it risks depleting memory and…

Reworking the Neon IO stack: Rust+tokio+io_uring+O_DIRECT

Christian Schwarz

Member of Technical Staff at Databricks
Neon is a serverless Postgres platform. Recently acquired by Databricks, the same technology now also powers Databricks Lakebase. In this…

Cost Effective, Low Latency Vector Search In Databases: A Case Study with Azure Cosmos DB

Magdalen Manohar

Senior Researcher at Microsoft
We’ve integrated DiskANN, a state-of-the-art vector indexing algorithm, into Azure Cosmos DB NoSQL, a state-of-the-art cloud-native operational database. Learn how…

Squashing the Heisenbug with Deterministic Simulation Testing

Dominik Tornow

CEO at Resonate HQ
Distributed systems are hard to test, with bugs that are difficult to reproduce and diagnose. This talk introduces Deterministic Simulation…

A Visual Journey Through Async Rust

Alex Puschinsky

Tech Lead Software Engineer at Trigo
Async programming is tricky, but tinkering and visualization make it click. In this talk, we’ll build an async Rust visualization…

8x Better Than Protobuf: Rethinking Serialization for Data Pipelines

Almog Gavra

Co-Founder at Responsive
This session introduces Imprint, a serialization format built from first principles for data pipelines. Learn the motivations behind introducing “yet…

Reliability vs Memory Efficiency of eBPF Instrumentation: Why Not Both?

Dom Delnano

CEO of Cosmic
Zero-instrumentation eBPF tooling promises instant insights, but making it reliable and lightweight is a serious challenge. This talk shows how…

Translations at Scale: Memory Optimization Techniques That Kept Uber’s P99 Under 1ms

Cristian Velazquez

Staff Site Reliability Engineer at Uber
Discover how Uber reimagined its translation service by shifting from a purely in-memory approach to a hybrid memory-disk architecture. Learn…

Rivian’s Push Notification Sub Stream with Mega Filter

Marcus Kim

Software Engineer II at Rivian and VW Group Technology, LLC

Saahil Khurana

Staff Software Engineer at Rivian and VW Group Technology, LLC
Rivian vehicles stream over 5500 signals every 5 seconds, but only about 80 are relevant for push notifications. Without filtering,…

Uplevel pgBadger: The Potential of Postgres for Log Analysis

Henrietta Dombrovskaya

Database Architect at DRW
pgBadger is a well-known and widely used tool for identifying performance bottlenecks when using PostgreSQL. But there is even more…

Timeseries Storage at Ludicrous Speed

Duarte Nunes

Staff Engineer at Datadog
Datadog’s real-time storage system for timeseries data ingests billions of points per second and serves thousands of queries per second…

Squeezing Every Millisecond: How We Rebuilt the Datadog Lambda Extension in Rust

AJ Stuyvenberg

Staff Engineer at Datadog
Datadog had a cold start problem. Hear how we crushed the Lambda cold start latency, and how you can as…

LLM KV Cache Offloading: Analysis and Practical Considerations

Eshcar Hillel

Principal Research Scientist at Pliops
LLM deployments are driving massive GPU demand and cost. This talk presents a generic architecture for offloading KV-cache tensors to…

Fast and Deterministic Full Table Scans at Scale

Felipe Cardeneti Mendes

Technical Director at ScyllaDB
ScyllaDB’s new tablet replication algorithm replaces static vNodes with dynamic, elastic data distribution that adapts to shifting workloads. This talk…

Unlocking Code-level Performance Analysis with PMUv3 Plugin

Gayathri Narayana Yegna Narayanan

Senior Solutions Engineer at Arm, Inc.
Profiling code sections on ARM platforms is hard with whole-application tools. This talk introduces the PMUv3 plugin, which automates metric…

Finding Performance Needles in Haystacks with APerf

Geoffrey Blake

Principal Engineer at AWS
Finding performance issues in modern software is like finding a needle in a haystack and intuition on where to look…

xCapture v3: Efficient, Always-On Thread Level Observability with eBPF

Tanel Poder

Performance Nerd at PoderC LLC
xCapture is an eBPF-based Linux thread activity measurement tool for systematic performance troubleshooting and optimization that can go very deep.…

Why We’re Rewriting SQLite in Rust

Glauber Costa

Founder & CEO of Turso
As we were adding Vector Search to SQLite, we had a crazy idea. What could we achieve if we were…

Designing an Energy-efficient Architecture for Geo Databases

Yichen Wei

Engineering Manager at Disney+/Hulu
The cost and efficiency of IP geolocation data services depend on careful system design. This talk covers how storage architecture,…

Achieving Sub-10 Millisecond Latencies at Climatiq

Gustav Wengel

Software Developer at Climatiq
This is a case study of how Climatiq achieves sub-10ms latencies, by shifting work to build-time, using zero-copy deserialization to…

P99 Latency at 32 Million Concurrent Streams

Ashutosh Agrawal

Staff Software Engineer at Gemini

Tim Koopmans

Senior Director, Product Experience at ScyllaDB
Ashutosh and Tim talk about all things related to glass-to-glass latency: live streaming of sports to 32 million concurrent devices…

Designing Low-Latency Systems with TLA+

Hillel Wayne

Consultant at Windy Coast Consulting
Many costly bugs come not from code, but from flawed designs: a common challenge in complex high-performance systems. TLA+ lets…

We Told B+ Trees to Do Sorted Sets—They Nailed It

Joe Zhou

Developer Advocate at DragonflyDB
Sorted sets power critical use cases like leaderboards and priority queues, but Redis’s skiplist implementation pays a steep tax—37 bytes…

KV Caching Strategies for Latency-Critical LLM Applications

John Thomson

Deep Learning Algorithms Engineer at University of Waterloo
LLM inference is split into two phases: Prefill and Decode. The Prefill phase fills the KV Cache with the context,…

Faster Than a Blink: Globally Distributed Wasm Functions on Akamai

Kate Goldenring

Senior Software Engineer at Fermyon Technologies, Inc.
Users start noticing lag after just 100 milliseconds (a blink of an eye), making latency a critical challenge for modern,…

Mechanical Sympathy in Cooperative Multitasking

Kenny Chamberlin

Lead Engineer at Momento
This talk applies mechanical sympathy to server workloads that use cooperative multitasking and async/await. We’ll cover three techniques: reducing thread…

Shared Nothing Databases at Scale

Nick Van Wiggeren

CTO at PlanetScale
This talk will discuss how PlanetScale scaled databases in the cloud, focusing on a shared-nothing architecture that is built around…

Building Scalable End-to-End Latency Metrics from Distributed Trace

Kusha Maharshi

Senior Software Engineer at Bloomberg
In our sprawling microservices architecture at Bloomberg, timing requests from point A to point Z means navigating an alphabet’s worth…

The Power of Small Optimizations

Maksim Kita

Principal Software Engineer at Tinybird
This session will cover some important small optimizations that I contributed to ClickHouse over the last years — optimizations that…

GPUs and How to Program Them

Manya Bansal

PhD Student at Massachusetts Institute of Technology
CUDA, designed as an extension to C++, preserves its familiar abstractions. However, unlike CPU programming — where compilers and runtime…

Parsing Protobuf as Fast as Possible

Miguel Young de la Sota

Engineer at Buf Technologies
Protobuf is an extremely popular binary data interchange format. This session dives into hyperpb, a Protobuf parser for Go that…

Optimizing Tiered Storage for Low-Latency Real-Time Analytics

Neha Pawar

Founding Engineer and Head of Data at StarTree
Real-time OLAP databases usually trade performance for cost when moving from local storage to cloud object storage. This talk shows…

Push the Database Beyond the Edge

Nikita Sivukhin

Software Engineer at Turso
Almost any application can benefit from having data available locally – enabling blazing-fast access and optimized write patterns. This talk…

ZGC: A Decade of Innovation

Stefan Johansson

Principal Member of Technical Staff at Oracle
ZGC has been in development for more or less a decade now. This talk will explore the current state of…

A Deep Dive into the Seastar Event Loop

Pavel Emelyanov

Principal Software Engineer at ScyllaDB
The core and the basis of ScyllaDB’s outstanding performance is the Seastar framework, and the core and the basis of…

A Java Developer’s Quest for I/O Performance

David Vlijmincx

Java Developer at JPoint
My journey optimizing Java’s io_uring bindings taught me what performance truly means. Through misleading benchmarks, midnight debugging sessions, and countless…

Rethinking Durable Workflows and Queues: A Library-based Approach

Qian Li

Co-Founder at DBOS, Inc
Durable workflow engines typically depend on external orchestration, adding overhead, write amplification, and complexity. This talk presents an alternative: a…

Go Faster: Tuning the Go Runtime for Latency and Throughput

Paweł Obrępalski

Staff Engineer at ShareChat
Most Go services don’t need runtime tuning…until they do. At ShareChat, running hundreds of Go services across thousands of cores,…

40x Faster Binary Search

Ragnar Groot Koerkamp

PhD at ETH Zurich
This talk will first expose the lie that binary search takes O(lg n) time — it very much does not!…

Apache Flink at Scale: 7x Cost Reduction in Real-Time Deduplication

Andrei Manakov

Senior Staff Software Engineer at ShareChat

Sanyam Gupta

Software Engineer at ShareChat
ShareChat processes over 200K requests per second, and our Node.js + Redis deduplication system could not scale cost-effectively. We migrated…

Deprecating Distributed Locks for Low Latency Event Consumption

Danish Rehman

Staff Software Engineer at Attentive
This session covers how Attentive scaled mutual exclusivity in its event-driven architecture by evolving from Redis-based locks to Apache Pulsar’s…

Hunting a Kernel Allocation Bug Triggered by io_uring

Raphael Carvalho

Senior Software Engineer at ScyllaDB
The hunting saga started with a system failure in a ScyllaDB test suite, triggered by its usage of io_uring. We…

Building a Fast Lock-free Queue for Trading Systems

Sarthak Sehgal

Market Making Tech Lead at Maven Securities
When every microsecond counts, inter-thread communication must be lean and predictable. This talk dives into the design of a high-performance…

Scaling to 6.6M Read OPS with ScyllaDB on Kubernetes: Achieving Sub-2ms Latency and Robust Recovery

Shubham Sharma

Senior Systems Engineer at Verve Group
Learn how we achieved 6.6M read OPS with sub-2ms latency on a single ScyllaDB cluster in Kubernetes, optimizing machine types,…

Engineering a Low-Latency Vector Search Engine for ScyllaDB

Pawel Pery

Senior Software Engineer at ScyllaDB
Implementing Vector Search in ScyllaDB brings challenges from low-latency to predictable performance at scale. Rather than embedding HNSW indexing directly…

Measuring Query Latency the Hard Way: An Adventure in Impractical Postgres Monitoring

Simon Notley

Observability and Optimization at EnterpriseDB
Sampling the session state (as exposed by pg_stat_activity) is a surprisingly powerful way to understand how your Postgres instance spends…

What Would You Give for Speed: Trade-offs in Eventually Consistent Systems at Fly.io

Somtochi Onyekwere

Software Engineer at Fly.io
Drawing on examples from engineering Corrosion (an open source distributed system designed to handle fast eventual consistency across Fly.io’s cloud platform),…

Design Considerations for P99-optimized Hash Tables

Steve Heller

President at Chrysalis Software Corp.
Hash tables are a classic data structure but struggle in P99-optimized applications, especially with variable-length records. Open addressing works well…

From Gatekeeper to Kyverno: Kubernetes Policy Management with Performance

Tanat Lokejaroenlarb

Staff Site Reliability Engineer at Adevinta
This talk shares our journey migrating from Gatekeeper to Kyverno for Kubernetes policy management at Adevinta. Faced with the need…

The Tale of Taming TigerBeetle’s Tail Latency

Tobias Ziegler

Software Engineer at TigerBeetle
Learn how we reduced TigerBeetle’s tail latency through algorithm engineering. ‘Algorithm engineering goes beyond studying theoretical complexity and considers how…

Netflix’s Scalable Page Construction with Real-Time Impression History

Tulika Bhatt

Senior Software Engineer at Netflix

Saurabh Jaluka

Senior Software Engineer at Netflix
Netflix built a Page Construction Architecture to deliver scalable, efficient, and personalized experiences across devices from phones to TVs. This…

Bridging epoll and io_uring in Async Rust

Tzu Gwo

Co-Founder & CEO at Tonbo IO Inc.
Tokio dominates async Rust, but its epoll-based model makes it hard to adopt io_uring. This talk explains why async Rust’s…

Turbocharging MCP: Speed, Smarts, and Scale

Viraj Sharma

Student at Presidium School Delhi India
Learn how to speed up Model Context Protocol (MCP) tools using async servers, caching, batching, and smart data handling—making your…

Concurrency Testing using Custom Linux Schedulers

Jake Hillion

Software Engineer at Meta

Johannes Bechberger

Software Engineer at SAP
New features of the Linux kernel allow us to develop our own Linux schedulers. This talk shows how anyone can…

Building a High-Performance CI Cloud from the Ground Up

Aditya Maru

Co-Founder of Blacksmith
Every growing engineering org faces the same challenges when self-hosting CI. This talk shares lessons from building a high-performance CI…

Taming the Tail: Engineering Predictable Latency in Serverless Event Processing

Amit Anand

Staff Engineer at PayPal Inc

Jubin Abhishek Soni

Senior Software Engineer at Yahoo Inc

Rajesh Kumar Pandey

Principal Engineer at Amazon
Serverless platforms scale beautifully until they don’t. In event-driven architectures, unpredictable P99 latencies often emerge from cold starts, retries, uneven…

Patterns of Low Latency

Pekka Enberg

Founder & CTO at Turso
Building for low latency is important, but the tips and tricks are often part of developer folklore and hard to…

DTrace at 21: Reflections on Fully-grown Software

Bryan Cantrill

CTO of Oxide Computer Company
Twenty-one years ago, DTrace was integrated into the operating system. By any measure, the software is now fully-grown: it…

Rust + io_uring + ktls: How Fast Can We Make HTTP?

Amos Wenger

Creator of Faster Than Lime
Working on Fluke: async Rust HTTP1+2 with io_uring & kTLS, sponsored by fly.io & Shopify. Unlike others, Fluke is built…

The Next Chapter in the Sordid Love/Hate Relationship Between DBs and OSes

Andy Pavlo

Associate Professor at Carnegie Mellon University
DBMSs struggle with OS constraints, but new tech like eBPF can change the game. Join us to explore “user-bypass” designs…

Zero-overhead Container Networking with eBPF and Netkit

Liz Rice

Chief Open Source Officer, Isovalent at Cisco
Introducing Netkit: a new eBPF enhancement replacing veth connections in container networking. Say goodbye to the overhead slowing down container…

Noisy Neighbor Detection with eBPF

Jose Fernandez

Senior Software Engineer at Netflix
Tackling “noisy neighbor” issues in multi-tenant setups! At Netflix, we use eBPF to monitor and mitigate excessive CPU usage in…

Rust: A Productive Language for Writing Database Applications

Carl Lerche

Principal Engineer at AWS
Think Rust is just about performance and safety? Let’s talk productivity. Last year, Rust’s library ecosystem needed work. What’s changed?…

Designing a Query Queue for ScyllaDB

Avi Kivity

CTO and Co-Founder of ScyllaDB
Database queries vary widely—from milliseconds to hours. Optimizing concurrency is a delicate balance of CPU, memory, and stability. Bad design…

You’re Doing It All Wrong

Michael Stonebraker

CTO & Co-founder of DBOS
Historically, business apps use a three-tier architecture. Now, cloud-native architectures and DBMS can be combined, allowing for resilient, cost-effective, and…

1BRC – Nerd Sniping the Java Community

Gunnar Morling

Principal Software Engineer at Decodable
Gunnar Morling dives into the tricks that the fastest 1BRC solutions used to process the challenge’s 13 GB input file…

Overcoming Distributed Databases Scaling Challenges with Tablets

Dor Laor

CEO of ScyllaDB
Maximizing performance goes beyond server-level tweaks. Even with optimized low-level code, scaling requires more. In this session, learn about “tablets”—a dynamic…

The Performance Engineer’s Toolkit: A Case Study on Data Analytics with Rust

Will Crichton

Assistant Professor at Brown University
I optimized a Python data analytics pipeline, making it 180,000x faster with Rust! Using compiler optimizations, data structures, vectorization, parallelization,…

Using Sketching Technology to Optimize Services with Fewer Resources

Yichen Wei

Engineering Manager at Disney+/Hulu
Optimize your services with cost-efficient observability using high-performance sketching tools. Dive into creating sketching tech for various scenarios, making the…

Using eBPF Off-CPU Sampling to See What Your DBs are Really Waiting For

Tanel Poder

Performance Nerd at PoderC LLC
At last year’s P99 CONF, Tanel introduced using eBPF Task State Arrays to track Linux apps’ thread states/activity without built-in…

Java Heap Memory Optimization to Improve P99 Query Latency at LinkedIn Scale

Vivek Iyer Vaidyanathan

Staff Software Engineer at LinkedIn
Discover how LinkedIn optimized Apache Pinot’s performance! By using FALF Interning, a home-grown, lock-free method, they cut JVM heap usage…

Just In Time LSM Compaction

Aleksei Kladov

Staff Software Engineer at TigerBeetle
Matklad dives into the implementation of TigerBeetle’s JIT compaction algorithm for LSM, which is highly concurrent and uses all available…

Redis Alternatives Compared

Peter Zaitsev

Founder of Percona, Coroot, FerretDB
Join Peter as he dives into Redis alternatives like Valkey, DragonflyDB, and Microsoft Garnet. He’ll cover licensing, features, community support,…

Detecting Memory Leaks in Android A/B Tests: A Production-Focused Approach

Pavlo Stavytskyi

Google Developer Expert
Discover how to detect subtle memory leaks and regressions in Android apps with a production-focused approach. Learn the key metrics…

One Billion Row Challenge in Golang

Shraddha Agrawal

Senior Software Engineer, Ceph, IBM
Join us as we tackle Gunnar Morling’s One Billion Rows Challenge in Golang! We’ll walk through optimizing a 16GB file…

Taming Discard Latency Spikes

Patryk Wróbel

Software Engineer at ScyllaDB
Learned a crucial lesson on read/write latency when fixing a real ScyllaDB issue! Discover how TRIM requests impact NVMe SSDs…

Why Databases Cache, but Caches Go to Disk

Felipe Cardeneti Mendes

Technical Director at ScyllaDB

Alan Kasindorf

Founder of Cache Forge
ScyllaDB teamed up with Memcached to compare how caches and databases handle storage and memory across different scenarios. We’ll dive…

Primitive Pursuits: Slaying Latency with Low-Level Primitives and Instructions

Ravi A Giri

Senior Principal Engineer at Intel

Harshad S Sane

Principal Software Engineer at Intel
This talk showcases a methodology with examples to break down applications to low-level primitives and identify optimizations on existing compute…

How to Improve Your Ability to Solve Complex Performance Problems: Part 2

Kerry Osborne

Google Database Black Belt Team Lead at Google
In Part 2 of my P99 2023 talk, I’ll dive into practical strategies to enhance our problem-solving skills in the…

Database Drivers: Performance Perspectives

Piotr Sarna

Founding Engineer at poolside
Unlock the full potential of database drivers! Dive deep into their design, uncover how they work under the hood, and…

Low-Latency Mesh Services Using Actors

Nikita Lapkov

Senior Software Engineer
We’re transforming elfo, our Rust actor system, into a distributed mesh of services. Learn how we tackled message serialization, compression,…

Minimizing Request Latency of Self-Hosted ML Models

Julia Kroll

Applied Engineer at Deepgram
Join our session on minimizing latency in self-hosted ML models in cloud environments. Learn strategies for deploying Deepgram’s speech-to-text models…

Using Change Point Detection to Fight Noisy Benchmark Results

Matt Fleming

Co-Founder & CTO at Nyrkiö Oy
Discovering performance regressions in modern systems is tough due to inevitable noise. Change Point Detection (CPD) algorithms are gaining traction…

Enhancing P99 Latency: Strategies for Doubling/Tripling Performance in Third-Party APIs

Cristian Velazquez

Staff Site Reliability Engineer at Uber
Sharing our journey to improve P99 latency in third-party APIs. From optimizing network configs to fine-tuning connection management, we aimed…

Understanding Request Latency with Wallclock Profiling

Richard Startin

Senior Software Engineer at Datadog
Analyzing request latency is tough since it’s not always CPU-bound. Many devs give up on CPU profiling, but sampling profilers…

Fast, Secure and Dense: Finally Serverless with WebAssembly

Thorsten Hans

Sr. Cloud Advocate at Fermyon Technologies
Discover how WebAssembly is revolutionizing cloud computing. Join Thorsten Hans to learn about building serverless apps with Spin, achieving true…

Latency, Throughput & Fault Tolerance: Designing the Arroyo Streaming Engine

Micah Wylde

Co-Founder at Arroyo
Arroyo is a Rust-based, distributed stream processing engine offering millisecond-latency and high-throughput. It achieves fault tolerance and exactly-once processing via…

Get Low (Latency)

Benjamin Cane

Distinguished Engineer at American Express

Tyler Wedin

Vice President, Global Payments Network SRE at American Express
Building a real-time, low-latency card payments system is a challenge. Join the Amex Payments Network team to learn about their…

Reliable Data Replication

Cameron Morgan

Staff Infrastructure Engineer at Shopify
Data replication ensures high availability—reliable, consistent, and timely access. Dive into the tough problems often skipped: reliable backfills, schema changes,…

Scheduler Tracing With ftrace + eBPF

Jason Rahman

Principal Software Engineer at Microsoft
Dive into understanding app latency by exploring the Linux scheduler with ftrace, eBPF, and Perfetto for visualization. Uncover quirks in…

Aiding the CUDA Compiler for Fun and Profit

Joe Rowell

Founding Engineer at poolside
Get the most out of your CUDA code by understanding how the compiler works.

Building a Cloud Native LSM on Object Storage

Chris Riccomini

Creator of Materialized View

Rohan Desai

Co-Founder of Responsive
Excited to introduce SlateDB, an open-source, cloud-native storage engine. Built as an LSM on object stores like S3/GCS/ABS, it leverages…

Cheating the Cloud: 50% Savings with Compression Dictionaries

Łukasz Paszkowski

Software Engineer Team Lead at ScyllaDB
Faced with high networking costs, we tackled insufficient compression with a custom RPC compressor using ZSTD and external dictionary support.…

Internet-Scale Semantic, Structural, and Text Search in Real Time

Ash Vardanian

Founder of Unum Cloud
Discover powerful search algorithms and their SIMD- and GPU-accelerated implementations for AI-powered semantic search, structure search, or exact & fuzzy…

Writing a Kernel in Rust: Code Quality and Performance

Luc Lenôtre

Site Reliability Engineer at Clever Cloud
Maestro kernel began as a C-based school project and transitioned to Rust for better code quality. Now, it’s in a…

Running Low-Latency Workloads on Kubernetes

Jimmy Zelinskie

Co-Founder of AuthZed
Configuring Kubernetes for optimal workload performance is a continuous journey. Best practices can sometimes harm performance. Join us as we…

Distributed Async Await: A New Programming Model for the Cloud

Dominik Tornow

CEO at Resonate HQ
Dive into the future of cloud dev with Distributed Async Await. Simplify your code and conquer the chaos of distributed…

Feature Store Evolution Under Cost Constraints: When Cost is Part of the Architecture

David Malinge

Senior Staff Software Engineer at ShareChat

Ivan Burmistrov

Principal Software Engineer at ShareChat
Scaling ShareChat’s ML Feature Store to handle 1B features/sec was just the start. Next challenge: cutting costs while keeping quality.…

WebAssembly on the Edge: Sandboxing AND Performance

Brian Sletten

President at Bosatsu Consulting, Inc.

Ramnivas Laddad

Co-Founder of Exograph, Inc
Moving apps to the Edge can complicate performance due to security constraints. Learn how WebAssembly bridges the gap, enabling both…

Queues, Hockey Sticks and Performance

David Collier-Brown

Staff Engineer
Queues: both a blessing and a curse in computer science. They help predict performance but also signal overload. This talk…

Taming Tail Latencies in Apache Pinot with Generational ZGC

Christopher Peck

Senior Software Engineer at Uber
Discover how Generational ZGC slashed Java app pause times in real-world use! Learn how Apache Pinot tackled scatter-gather tail latencies…

Measuring and Diagnosing Performance Shouldn’t Require Magic

Cary Millsap

Distinguished Product Manager at Oracle
Struggling with performance issues despite all green dashboards? Experts say you need special skills, but we’ll show you how to…

Remote CAD that Feels Local

Adam Chalmers

Systems Engineer at Zoo

Adam Sunderland

Lead Cloud Infrastructure Engineer at Zoo
Zoo is creating a CAD suite that runs in the cloud but feels like it’s local. How? Regional deployment, WebRTC…

Profiling your Go Service with pprof

Miriah Peterson

Lead Engineer at Soypete Tech
Optimize your Go code with the powerful pprof tool. Learn how to integrate, access, and interpret pprof metrics, plus best…

Performance Pitfalls of Rust Async Function Pointers (And Why It Might Not Matter)

Byron Wasti

Founder & CEO
An in-depth analysis of asynchronous function pointers in Rust, why they aren’t a real thing (compared to normal function pointers)…

Elevating PostgreSQL: Benchmarking Vector Search Performance

Daniel Seybold

Co-Founder at benchANT
PostgreSQL continues to evolve with vector search extensions like pgvector and pgvecto.rs. We’ll explore recent benchmarks comparing vector search performance…

Sight Beyond Sight: See it All Through Observability

Leandro Melendez

Developer Advocate at Grafana Labs
Observability is more than metrics and logs—it’s knowing your system’s status without checking under the hood. From QA processes to…

Time-Series and Analytical Databases Walk Into a Bar…

Andrei Pechkurov

Core Engineer at QuestDB
In this talk, we share our journey in making QuestDB, an open-source time-series database, a much faster analytical database, featuring…

Profile-Guided Optimization (PGO): (Ab)using it for Fun and Profit

Aliaksandr Zaitsau

Solution Architect
Discover how to boost your software with lesser-known compiler flags and Profile-Guided Optimization (PGO). Learn what PGO is, how it…

How a Failed Experiment Helped Me Understand the Go Runtime in More Depth

Aadhav Vignesh

Software Engineer
In 2022, I began crafting a tool to visualize Go’s GC in real-time. I’ll dive into the hurdles of extracting…

What C and C++ Can Do and When Do You Need Assembly?

Alexander Krizhanovsky

CEO at Tempesta Technologies
Join us to dive into GCC and Clang optimizations for C/C++! We’ll explore how x86-64 executes code, use assembly for…

Low Latency Gal Presents: Low Latency Stuff

Sonia Kolasinska

Low Latency Gal
Lock-free programming and precise ultra low latency pipelining between CPU cores.

Cache Me If You Can: How Grafana Labs Scaled Up Their Memcached 42x & Cut Costs Too

Danny Kopping

Senior Software Engineer at Grafana Labs
Our cloud database stores billions of files in object storage. With petabytes of data being queried every day, we started…

High Performance on a Low Budget

Gwen Shapira

Co-founder & CPO of Nile
It is one thing to solve performance challenges when you have plenty of time, money, and expertise available. Many performance…

From 1M to 1B Features Per Second: Scaling ShareChat’s ML Feature Store

Andrei Manakov

Senior Staff Software Engineer at ShareChat

Ivan Burmistrov

Principal Software Engineer at ShareChat
ShareChat’s Ivan Burmistrov and Andrei Manakov walk through how they built a low latency ML Feature Store based on ScyllaDB which…

Corporate Open Source Anti-Patterns: A Decade Later

Bryan Cantrill

CTO of Oxide Computer Company
A little over a decade ago, I gave a talk on corporate open source anti-patterns, vowing that I would return…

Quantifying the Performance Impact of Shard-per-core Architecture

Dor Laor

CEO of ScyllaDB
Most software isn’t architected to take advantage of modern hardware. How does a shard-per-core and shared-nothing architecture help – and…

How Netflix Builds High Performance Applications at Global Scale

Prasanna Vijayanathan

Senior Software Engineer at Netflix
We all want to build applications that are blazingly fast. We also want to scale them to users all over…

eBPF vs Sidecars

Liz Rice

Chief Open Source Officer, Isovalent at Cisco
From its vantage point in the kernel, eBPF provides a platform for building a new generation of infrastructure tools for…

Taming P99 Latencies at Lyft: Tuning Low-Latency Online Feature Stores

Bhanu Renukuntla

Senior Software Engineer at Lyft
In this talk, we will explore the challenges and strategies of tuning low latency online feature stores to tame the…

Running a Go App in Kubernetes: CPU Impacts

Teiva Harsanyi

Senior Software Engineer at Google
Understanding the impacts of running a containerized Go application inside Kubernetes with a focus on the CPU.

Expanding Horizons: A Case for Rust Higher Up the Stack

Carl Lerche

Principal Engineer at AWS
Historically associated with systems programming due to its roots in Mozilla, Rust’s promise of safety, speed, and concurrency has led…

How to Improve Your Ability to Solve Complex Performance Problems

Kerry Osborne

Google Database Black Belt Team Lead at Google
This talk is really about problem solving. It’s about how we think about problems and how we resolve those problems…

Square’s Lessons Learned from Implementing a Key-Value Store with Raft

Omar Elgabry

Software Engineer at Square
To put it simply, Raft is used to make a use case (e.g., key-value store, indexing system) more fault tolerant…

Performance Budgets for the Real World

Tammy Everts

Chief Experience Officer at SpeedCurve
Performance budgets have been around for more than ten years. Over those years, we’ve learned a lot about what works,…

A Deterministic Walk Down TigerBeetle’s main() Street

Aleksei Kladov

Staff Software Engineer at TigerBeetle
Learn how to use Zig to implement a fully deterministic distributed system which will never fail with an out of…

VM Performance: The Differences Between Static Partitioning or Automatic Tuning

Dario Faggioli

Virtualization Software Engineer at SUSE
Virtualized workloads are known to require carefully crafted configuration and tuning, both at the host and at the guest level,…

Measuring the Impact of Network Latency at Twitter

Widya Salim

Data Scientist at SEEK

Victor Ma

Senior Data Scientist at Airwallex

Zhen Li

Data Scientist at TikTok
Widya Salim, Victor Ma, and Zhen Li will outline the causal impact analysis, framework, and key learnings used to quantify…

Conquering Load Balancing: Experiences from ScyllaDB Drivers

Piotr Grabowski

Software Team Leader at ScyllaDB
Load balancing seems simple on the surface, with algorithms like round-robin, but the real world loves throwing curveballs. Join me…

Low-Latency Data Access: The Required Synergy Between Memory & Disk

Kriti Kathuria

Graduate Researcher at the University of Waterloo
Analytics has moved from internal dashboards to a dashboard inside the product, providing a personalized experience for each user, be…

Distributed System Performance Troubleshooting Like You’ve Been Doing it for Twenty Years

Jon Haddad

Founder at Rustyrazorblade Consulting
Troubleshooting performance issues across distributed systems can be intimidating if you don’t know where to start, and it’s even harder…

Writing Low Latency Database Applications Even If Your Code Sucks

Glauber Costa

Founder & CEO of Turso
All latency lovers are used to the mystical experience of joy coming from aligning data to a cache line size…

Using Libtracecmd to Analyze Your Latency and Performance Troubles

Steven Rostedt

Software Engineer at Google
Trying to figure out why your application is responding late can be difficult, especially if it is because of interference…

Building Low Latency ML Systems for Real-Time Model Predictions at Xandr

Chinmay Abhay Nerurkar

Principal Engineer at Microsoft
Xandr’s Ad-server handles over 400 billion daily ad requests from across the world wide web. Operating under a stringent Service…

ORM is Bad, But is There an Alternative?

Henrietta Dombrovskaya

Database Architect at DRW
It’s a well-known fact that although database performance is great and each query executes in milliseconds, the overall…

P99 Publish Performance in a Multi-Cloud NATS.io System

Derek Collison

Founder & CEO of Synadia
This talk will walk through the strategies and improvements made to the NATS server to accomplish P99 goals for persistent…

Making Python 100x Faster with Less Than 100 Lines of Rust

Ohad Ravid

Team Lead at Trigo
Python isn’t known as a low-latency language. Can we bridge the performance gap using a bit of Rust and some…

Zero Downtime Critical Traffic Migration @Netflix Scale

Abhishek Pandey

Senior Software Engineer at Meta
Picture yourself enthralled by the latest episode of your beloved Netflix series, delighting in an uninterrupted, high-definition streaming experience. Behind…

The History of Tracing Oracle

Cary Millsap

Distinguished Product Manager at Oracle
In this presentation, I will explore the history of tracing Oracle and why it has been overlooked despite its usefulness.…

Always-on Profiling of All Linux Threads, On-CPU and Off-CPU, with eBPF & Context Enrichment

Tanel Poder

Performance Nerd at PoderC LLC
In this session, Tanel introduces a new open source eBPF tool for efficiently sampling both on-CPU events and off-CPU events…

Cost-Effective Burst Scaling For Distributed Query Execution

Dan Harris

Principal Software Engineer at Coralogix
Building a query engine that scales efficiently is a difficult task. Queries over big datasets stored in Object Storage require…

Unconventional Methods to Identify Bottlenecks in Low-Latency and High-Throughput Data Pipelines

Zamir Paltiel

Head of Engineering at Hyperspace
In this presentation, we explore how standard profiling and monitoring methods may fall short in identifying bottlenecks in low-latency data…

Mitigating the Impact of State Management in Cloud Stream Processing Systems

Yingjun Wu

CEO of RisingWave Labs
Stream processing is a crucial component of modern data infrastructure, but constructing an efficient and scalable stream processing system can…

Practical Go Memory Profiling

William Kennedy

Managing Partner at Ardan Labs
In this talk, Bill will show you how to use benchmark profiling and compiler directives in Go to find…

Adventures in Thread-per-Core Async with Redpanda and Seastar

Travis Downs

Software Engineer at Redpanda
Thread-per-core programming models are well known in software domains where latency is important. Pinning application threads to physical cores and…

Architecting a High-Performance (Open Source) Distributed Message Queuing System in C++

Vitaly Dzhitenov

Senior Software Engineer at Bloomberg
BlazingMQ is a new open source* distributed message queuing system developed at and published by Bloomberg. It provides highly-performant queues…

Noise Canceling RUM

Tim Vereecke

Web Performance Architect at Akamai
Noisy Real User Monitoring (RUM) data can ruin your P99! We introduce a fresh concept called “Human Visible Navigations” (HVN)…

Less Wasm

Piotr Sarna

Founding Engineer at poolside
The presentation explains why getting rid of WebAssembly is good for your latency. More specifically, it’s a short case study…

Reducing P99 Latencies with Generational ZGC

Stefan Johansson

Principal Member of Technical Staff at Oracle
With the low-latency garbage collector ZGC, GC pause times are no longer a big problem in Java. With sub-millisecond pause…

5 Hours to 7.7 Seconds: How Database Tricks Sped up Rust Linting Over 2000X

Predrag Gruevski

Independent Software Researcher at Trustfall
Linters are a type of database! They are a collection of lint rules — queries that look for rule violations…

Interaction Latency: Square’s User-Centric Mobile Performance Metric

Pierre-Yves Ricau

Android Distinguished Engineer at Block
Mobile performance metrics often take inspiration from the backend world and measure resource usage (CPU usage, memory usage, etc) and…

Chihuahua-Sized Load Tests!

Leandro Melendez

Developer Advocate at Grafana Labs
Because bigger isn’t always better. Especially nowadays. Do your teams need help accommodating those humongous load tests in your agile &…

How to Avoid Learning the Linux-Kernel Memory Model

Paul McKenney

Software Engineer at Meta
The Linux-kernel memory model (LKMM) is a powerful tool for developing highly concurrent Linux-kernel code, but it also has a…

MySQL Performance on Modern CPUs: Intel vs AMD vs ARM

Peter Zaitsev

Founder of Percona, Coroot, FerretDB
For years, CPU choice for MySQL was pretty boring – just choose which Intel-made CPU you want. In recent…

How We Reduced the Startup Time for Turo’s Android App by 77%

Pavlo Stavytskyi

Google Developer Expert
The startup time of a mobile app is one of the most important indicators of its performance and has a…

99.99% of Your Traces are Trash

Paige Cruz

Senior Developer Advocate at Chronosphere
Distributed tracing is still finding its footing in many organizations today. One challenge to overcome is the data volume –…

High-Level Rust for Backend Programming

Adam Chalmers

Systems Engineer at Zoo
Some people say you should only use Rust where you can’t afford to use garbage collection. I disagree — Rust…

A Deep Dive Into Concurrent React

Matheus Albuquerque

Senior Software Engineer, Front-End at Medallia
Writing fluid user interfaces becomes more and more challenging as the application complexity increases. In this talk, we’ll explore how…

Ingesting in Rust

Armin Ronacher

Creator of Flask and Principal Architect at Sentry
At Sentry we handle hundreds of thousands of events a second — from tiny metrics to huge memory dumps. What…

The Latency Stack: Discovering Surprising Sources of Latency

Mark Gritter

Principal Engineer at Postman
Usually, when an API call is slow, developers blame ourselves and our code. We held a lock too long, or…

Building a 10x More Efficient Edge Platform

Felipe Huici

CEO and Co-Founder of Unikraft UG
Painful cold boots, terrible auto-scale times, minutes-long waits for compute nodes to be up: these are standard headaches that cloud…

Beyond Availability: The Seven Dimensions for Data Product SLOs

Emily Gorcenski

Principal Data Scientist at Thoughtworks
In the software world, we’re used to SLOs built around latency and availability. But in the data engineering universe, there…

Peak Performance at the Edge: Running Razorpay’s High-Scale API Gateway

Jay Pathak

Software Development Engineer at Razorpay
Razorpay caters to millions of API requests every day that are non-uniform in nature. As a key provider of financial…

Segment-Based Storage vs. Partition-Based Storage: Which is Better for Real-Time Data Streaming?

David Kjerrumgaard

Developer Advocate at StreamNative
Storage is a critical component of any real-time data streaming system, and the choice of storage model can significantly affect…

HTTP 3: Moving on From TCP

Brian Sletten

President at Bosatsu Consulting, Inc.
Any network class you have taken in the last thirty years will have highlighted that the application layer depends on…

Demanding the Impossible: Rigorous Database Benchmarking

Dmitrii Dolgov

Senior Software Engineer at Red Hat
It’s easy to conduct a misleading benchmark, and notoriously hard to design a correct and rigorous enough one. Have you…

Misery Metrics & Consequences

Gil Tene

CTO and Co-Founder of Azul Systems
Join Azul Systems’ Gil Tene as he defines “misery metrics,” which describe what happens when our production systems are operating…

Sharpening the Axe: The Primacy of Toolmaking

Bryan Cantrill

CTO of Oxide Computer Company
Oxide’s Bryan Cantrill weighs in on allowing engineers to make their own tools, resulting in better systems delivered faster and…

The Art of Macro Benchmarking: Evaluating Cloud Native Services Efficiency

Bartłomiej Płotka

Senior Software Engineer at Google
Benchmarking is hard, especially on a macro level that integrates multiple code components into one or multiple microservices. It’s challenging…

The Art of Event Driven Observability with OpenTelemetry

Henrik Rexed

Cloud Native Advocate at Dynatrace
Explore the various components of OpenTelemetry, examples of unuseful traces from event driven architecture, and the purpose/usage of span links…

P99 Pursuit: 8 Years of Battling P99 Latency

Dor Laor

CEO of ScyllaDB
ScyllaDB CEO Dor Laor covers principles for successful OSS projects like ScyllaDB, KVM, the Linux kernel and why they spurred…

From SLO to GOTY

Charity Majors

CTO of Honeycomb
Charity Majors shares the performance lessons we can all learn from game developers, who were among the first to run…

Linux Kernel vs DPDK: HTTP Performance Showdown

Marc Richards

Performance Engineer at Amazon Web Services
AWS’ Marc Richards uses an HTTP benchmark to compare performance of the Linux kernel networking stack with userspace networking doing…

Overcoming Variable Payloads to Optimize for Performance

Armin Ronacher

Creator of Flask and Principal Architect at Sentry
Hear from Sentry’s Armin Ronacher, creator of the Flask framework for Python, on how to optimize for performance when you…

Using eBPF for High-Performance Networking in Cilium

Liz Rice

Chief Open Source Officer, Isovalent at Cisco
Isovalent’s Liz Rice shows how and why Cilium bypasses the kernel using eBPF for Kubernetes and container orchestration networking, observability…

High-speed Database Throughput Using Apache Arrow Flight SQL

Kyle Porter

Architect at Dremio

James Duong

Architect at Dremio
Kyle Porter and James Duong of Bit Quill Technologies share how Flight SQL can push SQL query throughput beyond existing…

Square Engineering’s “Fail Fast, Retry Soon” Performance Optimization Technique

Omar Elgabry

Software Engineer at Square
Learn how to build resilient systems, reduce failure rates, and improve application latency by employing one of the techniques in…

Clouds are Not Free: Guide to Observability-Driven Efficiency Optimizations

Bartłomiej Płotka

Senior Software Engineer at Google
Red Hat’s Bartłomiej Płotka explains how to find and uncover efficiency problems effectively using the power of modern cloud-native observability…

How a Database Looks from a Disk’s Perspective

Avi Kivity

CTO and Co-Founder of ScyllaDB
ScyllaDB’s CTO Avi Kivity dives into how high performance distributed systems such as modern databases can make best, most efficient…

Measuring the CPU Performance of Android Apps at Lyft

Pavlo Stavytskyi

Google Developer Expert
Hear from Pavlo Stavytskyi on how Lyft measures CPU load to improve app performance. What metrics they collect, plus how…

Speed Up Your Code Through Asynchronous Programming

Sabina Smajlaj

Operations Developer at Hudson River Trading
Hudson River Trading’s Sabina Smajlaj demonstrates how to take advantage of programming languages’ asynchronous libraries with a few minor tweaks…

Analyze Virtual Machine Overhead Compared to Bare Metal with Tracing

Steven Rostedt

Software Engineer at Google
Google’s Steve Rostedt discusses using tracing to analyze when the overhead from a Linux host running KVM is higher than…

A New IO Scheduler Algorithm for Mixed Workloads

Pavel Emelyanov

Principal Software Engineer at ScyllaDB
Discover how ScyllaDB, built on the highly asynchronous Seastar library, implemented an IO scheduler optimized for peak performance on modern…

Large-Scale, Semi-Automated Go Garbage Collection Tuning at Uber

Cristian Velazquez

Staff Site Reliability Engineer at Uber
Uber’s Cristian Velazquez talks about tuning garbage collection for Go to scale applications across 70,000 cores to maintain 30 mission-critical…

Why User-Mode Threads Are Good for Performance

Ron Pressler

Project Loom Technical Lead, Java Platform Group at Oracle
Hear from Oracle’s Ron Pressler how Java added virtual threads, an implementation of user-mode threads, to help write high-throughput servers.

Hardware Assisted Latency Investigations

Kshitij Doshi

Senior Principal Engineer, Intel Corporation

Harshad S Sane

Principal Software Engineer at Intel
Intel’s Harshad S Sane & Kshitij Doshi share new ways to use eBPF to better examine latency excursions.

Continuous Performance from Load Testing to SRE and Beyond

Leandro Melendez

Developer Advocate at Grafana Labs
Grafana k6’s Leandro Melendez explores how to use continuous methodologies, service structures, microservice tiers, cloud, and elasticity.

The Observant Developer — Continuous Feedback with OpenTelemetry

Roni Dover

CTO of Digma
Roni Dover shares practical ways that OpenTelemetry combined with open-source tools can be integrated into the modern development stack.

End-To-End Performance Testing, Profiling, and Analysis at Redis

Filipe Oliveira

Principal Performance Engineer at Redis
Learn how Redis developed an automated framework for performance regression testing, telemetry gathering, profiling, and data visualization upon code commit.

Keeping Latency Low for User-Defined Functions with WebAssembly

Piotr Sarna

Founding Engineer at poolside
Piotr Sarna describes how to integrate WebAssembly and Wasmtime into a C++ project in a latency-friendly manner by implementing UDFs…

Evaluating Performance In Go

William Kennedy

Managing Partner at Ardan Labs
William Kennedy provides a deep dive training on how to optimize Go’s concurrency and garbage collection.

How We Reduced Performance Tuning Time by Orders of Magnitude with Database Observability

Yuying Song

Database Performance Engineer at PingCAP
PingCAP’s Database Performance Engineer Yuying will share how to measure latency in a distributed system using a top-down (holistic) approach,…

Implementing Highly Performant Distributed Aggregates

Michal Jadwiszczak

Software Engineer at ScyllaDB
ScyllaDB’s Michał Jadwiszczak explains how you can implement aggregate functions without hammering real-time availability and performance for other read/write operations.

Ultra-Low-Latency Web Rendering on the Edge

Malte Ubl

Chief Architect at Vercel
Vercel’s Malte Ubl will discuss the trade-offs of the new paradigm of rendering web pages at the edge, and look…

A Deep Dive into Query Performance

Peter Zaitsev

Founder of Percona, Coroot, FerretDB
Percona’s Peter Zaitsev explores overlooked and underappreciated ways to successfully establish a connection and get results to the queries promptly…

How Dashtable Helps Dragonfly Maintain Low Latency

Roman Gershman

Co-Founder of DragonflyDB
Roman Gershman explains how Dragonfly’s hashtable implementation helps to keep its tail latency in check — including a look at…

Fast and Fault Tolerant

Michael Barker

Independent Consultant at Ephemeris Consulting
Michael Barker draws on knowledge from working on financial exchanges, messaging and clustering systems to describe a model that can…

Taming Go’s Memory Usage — and Avoiding a Rust Rewrite

Mark Gritter

Principal Engineer at Postman
Akita’s Mark Gritter goes against the current trends and describes why he and his team stuck with Golang and chose…

Tracking Syscall and Function Latency in Your k8s Cluster with eBPF

Matthew Lenhard

CTO of ContainIQ
ContainIQ’s Matthew Lenhard walks the audience through a real life performance tuning exercise, where we hunt down slow system calls…

Outrageous Performance: RageDB’s Experience with the Seastar Framework

Max De Marzi Jr.

Developer at RageDB
Learn how RageDB leveraged the Seastar framework to build an outrageously fast graph database in this talk by Max De…

Pitfalls in Writing High-Performance Systems in Rust

Marek Galovic

Staff Software Engineer at Pinecone
Pinecone’s Marek Galovic looks at common and maybe not so common pitfalls in writing high-performance distributed systems in Rust.

Why Kubernetes Freedom Requires Chaos Engineering to Shine in Production

Henrik Rexed

Cloud Native Advocate at Dynatrace
Dynatrace’s Henrik Rexed uses production methods and Kubernetes settings useful to avoid outages, from chaos engineering, to observability and load…

Testing Persistent Storage Performance in Kubernetes with Sherlock

Sagy Volkov

Distinguished Performance Architect at Lightbits Labs
Lightbits Labs’ Sagy Volkov demonstrates how to use Sherlock, an open source platform written to test persistent NVMe/TCP storage in…

Aggregator Leaf Tailer: Bringing Data to Your Users with Ultra Low Latency

Jeffery Utter

Staff Software Developer at theScore
Discover how and why theScore built Datadex, an aggregator leaf tailer system built for geographically distributed, low-latency queries and real-time…

Properly Understanding Latency is Hard — What We Learned When We Did it Correctly

Brian Taylor

Principal Software Engineer at Optimizely
Optimizely’s Brian Taylor applies lessons of Gil Tene’s coordinated omission talk to understand the surprising sources of latency found in…

Measuring P99 Latency in Event-Driven Architectures with OpenTelemetry

Antón Rodríguez

Principal Software Engineer at New Relic
New Relic’s Antón Rodríguez shows how Event-Driven Architectures can instrument apps using vendor-neutral APIs, libraries, and tools via OpenTelemetry.

C# as a System Language

Oren Eini

Founder & CEO of RavenDB
RavenDB’s Oren Eini discusses the features that make C# a viable system language for building high-end systems.

Retaining Goodput with Query Rate Limiting

Piotr Dulikowski

Senior Software Engineer, ScyllaDB
ScyllaDB’s Piotr Dulikowski walks through how they tackled a “hot partition” problem: a single partition accessed with disproportionate frequency that…

Improving Performance of Micro-Frontend Applications through Error Monitoring

Garrett Hamelin

Developer Advocate at Airbrake, a LogicMonitor Company
Airbrake’s Garrett Hamelin walks you through some of the dos and don’ts for trying to reduce errors and improve performance…

It’s Time to Debloat the Cloud with Unikraft

Felipe Huici

CEO and Co-Founder of Unikraft UG
Felipe Huici introduces Unikraft, a cloud operating system that allows for easily building fully-tailored cloud-ready images that boot in a…

Building Efficient Multi-Threaded Filters for Faster SQL Queries

Vlad Ilyushchenko

Co-Founder and CTO at QuestDB
QuestDB’s Vlad Ilyushchenko will describe how they optimized their database performance using efficient zero garbage collection multithreaded query processing.

Performance Insights Into eBPF, Step by Step

Dmitrii Dolgov

Senior Software Engineer at Red Hat
Red Hat’s Dmitrii Dolgov sheds light on using eBPF: how to collect execution metrics, profile programs, and common pitfalls to…

cachegrand: A Take on High Performance Caching

Daniele Salvatore Albano

Senior Software Engineer II at Microsoft
Microsoft’s Daniele Salvatore Albano presents cachegrand, a SIMD-accelerated hashtable without locks or busy-wait loops using fibers, io_uring, and much more.

Throw Away Your Nines

Alex Hidalgo

Principal Reliability Advocate at Nobl9
You may encounter problems if you only think about “nines” when setting service reliability targets. Throw away your nines. Let’s find…

The Role of Machine Learning In Cloud Native Performance Optimization

Brian Likosar

Global Director of Solutions Architecture at StormForge
StormForge’s Brian Likosar shows how machine learning can be used to optimally configure apps deployed in Kubernetes to ensure performance…

Capturing NIC and Kernel TX and RX Timestamps for Packets in Go

Blain Smith

Staff Software Engineer at Rocket Science
Rocket Science’s Blain Smith shows how to get better timestamp granularity from the NIC by directly sending and capturing data…

Cutting Through the Fog of Virtualization

Bernd Bandemer

Head of Data Science at Clockwork Systems Inc.
Clockwork Systems’ Bernd Bandemer details causes of cloud network latency, from its underlying infrastructure, to its physical topology and network…

Optimizing Servers for High-Throughput and Low-Latency at Dropbox

Alexey Ivanov

Software Engineer at Dapper Labs
Dapper Labs’ Alexey Ivanov explores layers of efficiency/performance optimizations from hardware, drivers, Linux kernel, library and application-level tunings.

Removing Implicit Deadlocks on a Thread-per-core Architecture with 2-phase Processing

Alex Gallego

CEO and Founder of Redpanda
Redpanda’s Alex Gallego will show how implicit limitations in asynchronous programming can be addressed by a 2-phase technique for resolving…

Apache Iceberg: An Architectural Look Under the Covers

Alex Merced

Developer Advocate at Dremio
Alex Merced, Developer Advocate at Dremio, describes the open data lakehouse architecture and performance-oriented capabilities of Apache Iceberg.

Three Perspectives on Measuring Latency

Geoffrey Beausire

Senior Site Reliability Engineer at Criteo
Discover from Criteo’s Geoffrey Beausire how to measure latency in key-value infrastructure from both server and client sides, as well…

Continuous Performance Regression Testing with JfrUnit

Gunnar Morling

Principal Software Engineer at Decodable
Gunnar Morling (Red Hat) explains how to use JfrUnit to track metrics that could impact application performance.

Realtime Indexing for Fast Queries on Massive Semi-Structured Data

Dhruba Borthakur

CTO of Rockset
Dhruba Borthakur (Rockset) explains how to combine lightweight transactions with real-time analytics to power a user-facing application.

OSNoise Tracer: Who Is Stealing My CPU Time?

Daniel Bristot de Oliveira

Principal Software Engineer at Red Hat
Daniel Bristot de Oliveira (Red Hat) explores operating system noise (the interference experienced by an application due to activities inside…

OSv Unikernel — Optimizing Guest OS to Run Stateless and Serverless Apps in the Cloud

Waldek Kozaczuk

OSv Committer
Waldek Kozaczuk talks about optimizing a guest OS to run stateless and serverless apps in the cloud for CNN’s video…

New Ways to Find Latency in Linux Using Tracing

Steven Rostedt

Software Engineer at Google
Steven Rostedt dives into new flexible and dynamic aspects of ftrace that can help expose latency issues.

How to Measure Latency

Heinrich Hartmann

Principal Engineer at Zalando
Heinrich Hartmann (Zalando) shares strategies for avoiding pitfalls with collecting, aggregating and analyzing latency data for monitoring and benchmarking.

Rust Is Safe. But Is It Fast?

Glauber Costa

Founder & CEO of Turso
Glauber Costa outlines pitfalls and best practices for developing Rust applications with low P99.

G1: To Infinity and Beyond

Stefan Johansson

Principal Member of Technical Staff at Oracle
Stefan Johansson (Oracle) provides insights on the G1 JVM garbage collector — what’s new, how it impacts performance, and what’s…

I/O Rings and You — Optimizing I/O on Windows

Yarden Shafir

Software Engineer at Crowdstrike
Yarden Shafir (Crowdstrike) introduces Windows’ implementation of I/O rings, demonstrating how it’s used, and discusses potential future additions.

Data Structures for High Resolution, Real-time Telemetry at Scale

Filipe Oliveira

Performance Engineer at Redis
Filipe Oliveira (Redis) explains how to use several OSS data structures to incorporate telemetry features at scale… and why they…

Scaling Apache Pulsar to 10 Petabytes/Day

Karthik Ramasamy

Senior Director of Engineering at Splunk
Karthik Ramasamy (Splunk) demonstrates how data — including logs and metrics — can be processed at scale and speed with…

RISC-V on Edge: Porting EVE and Alpine Linux to RISC-V

Kathy Giori

Ecosystem Engagement Lead at ZEDEDA

Roman Shaposhnik

Co-Founder of ZEDEDA Inc.
Roman and Kathy share their experience porting Alpine Linux and LF Edge EVE-OS to the new RISC-V architecture.

Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance

Marc Richards

Performance Engineer at Talawah Solutions
Marc Richards shares the performance tuning steps that he took to serve 1.2M JSON requests per second from a 4…

Is It Faster to Go with Redpanda Transactions than Without Them?!

Denis Rystsov

Staff Engineer at Vectorized
Denis Rystsov shares how Redpanda optimized the Kafka API and pushed throughput of distributed transactions up to 8X beyond an…

Crimson: Ceph for the Age of NVMe and Persistent Memory

Orit Wasserman

Architect at Red Hat
Orit Wasserman (Red Hat) talks about implementing Seastar, a highly asynchronous engine, as a new foundation for the Ceph distributed…

Performance Analysis and Troubleshooting Methodologies for Databases

Peter Zaitsev

CEO and Co-Founder of Percona
Peter Zaitsev (Percona) presents three performance analysis approaches and explains the best use cases for each.

Seastore: Next Generation Backing Store for Ceph

Sam Just

Senior Principal Software Engineer at Red Hat
Sam Just (Red Hat) shares how they architected their next-generation distributed file system to take advantage of emerging storage technologies…

Object Compaction in Cloud for High Yield

Tejas Chopra

Senior Software Engineer at Netflix
Tejas Chopra shares how Netflix gets massive volumes of media assets and metadata to the cloud fast and cost-efficiently.

Where Did All These Cycles Go?

Thomas Dullien

CEO of optimyze.cloud Inc.
Thomas Dullien (Optimyze.cloud) exposes all the hidden places where you can recover your wasted CPU resources.

Get Lower Latency and Higher Throughput for Java Applications

Simon Ritter

Deputy CTO at Azul Systems
Simon Ritter (Azul Systems) offers strategies for hitting p99 SLAs in Java — despite the various challenges presented by the…

What We Need to Unlearn about Persistent Storage

Pavel Emelyanov

Principal Software Engineer at ScyllaDB
Pavel Emelyanov (ScyllaDB) talks about ways to measure the performance of modern hardware and what it all means for database…

Avoiding Data Hotspots at Scale

Konstantin Osipov

Director of Software Engineering at ScyllaDB
Konstantin Osipov (ScyllaDB) addresses the tradeoffs between hash and range-based sharding.

Keeping Latency Low and Throughput High with Application-level Priority Management

Avi Kivity

CTO and Co-Founder of ScyllaDB
ScyllaDB CTO and co-founder Avi Kivity shows how high throughput and low latency can both be achieved in a single…

Using eBPF to Measure the k8s Cluster Health

Henrik Rexed

Cloud Native Advocate at Dynatrace
Henrik Rexed (Dynatrace) explains how to use Prometheus + eBPF to understand the inner behavior of Kubernetes clusters and workloads…

Continuous Go Profiling & Observability

Felix Geisendörfer

Staff Engineer at Datadog
Felix Geisendörfer (Datadog) digs into the unique aspects of the Go runtime and interoperability with tools like Linux perf and…

Unikraft: Fast, Specialized Unikernels the Easy Way

Felipe Huici

Chief Researcher at NEC Europe Laboratories GmbH
Felipe Huici (NEC Laboratories Europe) showcases the utility and design of UnikraftSDK.

Understanding Apache Kafka P99 Latency at Scale

Pere Urbón-Bayes

Senior Solutions Architect at Confluent
Pere Urbón-Bayes (Confluent) presents strategies for measuring, evaluating, and optimizing the performance of an Apache Kafka-based infrastructure.

High-Performance Networking Using eBPF, XDP, and io_uring

Bryan McCoid

Sr. Distributed Systems Engineer, Couchbase Inc.
Bryan McCoid outlines the ins and outs of Linux kernel tools such as io_uring, eBPF, and AF_XDP and how to…

DB Latency Using DRAM + PMem in App Direct & Memory Modes

Doug Hood

Consulting Member of Technical Staff at Oracle
Doug Hood (Oracle) compares the latency of DDR4 DRAM to that of Intel Optane Persistent Memory for in-memory database access.

Rust, Wright’s Law, and the Future of Low-Latency Systems

Bryan Cantrill

CTO of Oxide Computer Company
Bryan Cantrill discusses the rise of Rust-based systems and the ceding of Moore’s Law to Wright’s Law, and explains why…

Whoops! I Rewrote It in Rust

Brian Martin

Software Engineer at Twitter
Why and how Brian Martin rewrote Pelikan, Twitter’s open source and modular framework for in-memory caching, in Rust.

Let’s Fix Logging Once and for All

Peter Portante

Senior Principal Software Engineer at Red Hat
Peter Portante (Red Hat) presents a Linux kernel modification that gives the SRE and logging source owner greater control over…

Using SLOs for Continuous Performance Optimizations of Your k8s Workloads

Andreas Grabner

DevOps Activist at Dynatrace
Andreas Grabner (Dynatrace) shares how to use the CNCF Keptn project to automate SLO-based Performance Analysis as part of your…

Vanquishing Latency Outliers in the Lightbits LightOS Software Defined Storage System

Abel Gordon

Chief Systems Architect at Lightbits Labs
Abel Gordon gives an overview of how Lightbits LightOS improves the latency of high-performance, low-latency NVMe-based storage accessed over standard…