eBPF Tech Talks at P99 CONF 2025

See how engineers are pushing eBPF to its limits

eBPF talks have set the P99 CONF chat abuzz every year since the conference’s inception in 2021. Speakers have covered how to use eBPF for latency analysis, high-performance networking, noisy neighbor detection, and K8s observability, and even how to make databases faster by executing common database operations in the kernel.

Let’s take a look back at some popular eBPF talks from P99 CONF 2024, then forward to three eBPF talks we’ll be featuring at P99 CONF 2025.

Join P99 CONF (Free + Virtual)

(In case you’re new to P99 CONF, it’s a free 2-day community event for engineers obsessed with low-latency engineering strategies and performance optimization. It’s intentionally virtual, highly interactive, and purely technical.)

What You Missed at P99 CONF 2024

Last year, speakers highlighted eBPF’s growing role in tackling real production challenges, from container networking and noisy neighbor detection to latency tracing and database performance analysis.

Noisy Neighbor Detection with eBPF

Jose Fernandez (Netflix)

In multi-tenant environments, performance issues often arise from the “noisy neighbor” problem, where one container’s excessive CPU usage degrades the performance of adjacent containers. At Netflix, we’ve developed a low-overhead solution leveraging eBPF to continuously instrument the Linux scheduler and detect these issues in real time.

This talk explores how eBPF is used to monitor run queue latency, associate process IDs with cgroup IDs, and emit actionable metrics. Watch it for insights into the implementation details, optimization techniques for eBPF code, and how this approach ensures high performance and reliability in a shared compute environment.
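As a rough illustration of the run queue latency idea described above, here is a minimal Python sketch that replays a synthetic list of scheduler events in user space. It is not Netflix's implementation and contains no eBPF: the event tuples, the `runq_latency_by_cgroup` function, and the `pid_to_cgroup` mapping are all hypothetical names chosen for this example. It only shows the bookkeeping involved: remember when a task became runnable, subtract that timestamp when the task gets the CPU, and attribute the wait to the task's cgroup.

```python
from collections import defaultdict


def runq_latency_by_cgroup(events, pid_to_cgroup):
    """Sum run queue latency (ns) per cgroup.

    events: iterable of (timestamp_ns, kind, pid), where kind is
    "wakeup" (task became runnable) or "switch_in" (task got the CPU).
    Both the event shape and the cgroup mapping are assumptions made
    for this sketch, not a real tracing format.
    """
    runnable_since = {}        # pid -> timestamp when it became runnable
    totals = defaultdict(int)  # cgroup id -> accumulated wait in ns
    for ts, kind, pid in events:
        if kind == "wakeup":
            # Keep the earliest wakeup if several arrive before the task runs.
            runnable_since.setdefault(pid, ts)
        elif kind == "switch_in":
            start = runnable_since.pop(pid, None)
            if start is not None:  # ignore switches with no recorded wakeup
                totals[pid_to_cgroup.get(pid, "unknown")] += ts - start
    return dict(totals)


# Synthetic data: two tasks in two hypothetical containers.
events = [
    (1_000, "wakeup", 101),
    (1_500, "wakeup", 202),
    (4_000, "switch_in", 101),   # waited 3,000 ns
    (9_500, "switch_in", 202),   # waited 8,000 ns
    (10_000, "wakeup", 101),
    (10_200, "switch_in", 101),  # waited 200 ns
]
pid_to_cgroup = {101: "container-a", 202: "container-b"}

print(runq_latency_by_cgroup(events, pid_to_cgroup))
# {'container-a': 3200, 'container-b': 8000}
```

In a real deployment the events would come from scheduler instrumentation in the kernel rather than a Python list; the sketch only makes the per-cgroup accounting concrete.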

Scheduler Tracing With ftrace + eBPF

Jason Rahman (Microsoft)

Understanding application latency requires understanding the underlying layers of the system. The operating system scheduler is one of the layers that affect application latency and performance. In this talk, I share how to leverage both ftrace and eBPF (along with Perfetto for visualization) to capture the runtime behavior of the Linux scheduler. Along the way, you will explore some interesting quirks (arguably bugs) in the existing CFS scheduler, and also begin exploring the new EEVDF scheduler appearing in recent Linux kernels.

Zero-Overhead Container Networking with eBPF and Netkit

Liz Rice (Isovalent at Cisco)

Netkit is a new enhancement to eBPF. It replaces the virtual Ethernet (veth) links that previously connected containers to the network namespace of their host. Until now, the overhead of veth connections meant that containerized applications could not communicate as quickly as if they were running directly on the host. In this talk, I share how Netkit and other eBPF-enabled capabilities now allow container networking to run as fast as host networking.

Using eBPF Off-CPU Sampling to See What Your DBs are Really Waiting For

Tanel Poder

At P99 CONF 23, I introduced the general concept of using eBPF-populated Task State Arrays to keep track of the thread states and activity of all Linux applications (including database engines) without relying on the built-in instrumentation of the application. For example, the “wait events” built into database engines are not perfect; some voluntary waits (system calls) are not properly instrumented in all database engines. There are also involuntary waits caused by OS-level issues, like memory allocation stalls, CPU queuing, and task scheduler glitches. This year, I show the latest eBPF-based “xcapture” tool in practical use, measuring where MySQL, Postgres, and DuckDB really spend their time, both when on CPU and when sleeping. All this can be done without changing any source code of the database engine or the applications running on it.

The Next Chapter in the Sordid Love/Hate Relationship Between DBs and OSes

Andy Pavlo (Carnegie Mellon University)

Database management systems (DBMSs) are beautiful, free-spirited software that want nothing more than to help users store and access data as quickly as possible. To achieve this goal, DBMSs have spent decades trying to avoid operating systems (OSes) at all costs. Such avoidance is necessary because OSes always try to impose their will on DBMSs and stifle their ambitions through disingenuous syscall semantics, unscalable kernel-level data structures, and excessive data copying.

The many attempts to avoid the OS through kernel-bypass methods or custom hardware have such high engineering/R&D costs that few DBMSs support them. In the end, DBMSs are stuck in an abusive relationship: they need the OS to run their software and provide them with basic functionalities (e.g., memory allocation), but they do not like how the OS treats them. However, new technologies like eBPF, which allow DBMSs to run custom code safely inside the OS kernel to override its functionality, are poised to upend this power struggle.

In this talk, I present a new design approach called “user-bypass” for building high-performance database systems and services with eBPF. I discuss recent developments in eBPF relevant to the DBMS community and what parts of a DBMS are most amenable to using it. And I present the design of BPF-DB, an embedded DBMS written in eBPF that provides ACID transactions over multi-versioned data and runs entirely in the Linux kernel.

What’s in Store for P99 CONF 2025

This year, speakers will focus on using eBPF for thread-level observability, concurrency testing, and reliable, memory-efficient eBPF instrumentation.

xCapture v3: Efficient, Always-On Thread Level Observability with eBPF

Tanel Poder, Long-time computer performance geek

xCapture is an eBPF-based Linux thread activity measurement tool for systematic performance troubleshooting and optimization that can go very deep. It captures the thread activity of all apps in your system, including Linux kernel threads. It measures wall-clock and off-CPU time, not only CPU usage. Additionally, it performs event latency sampling for syscalls and IO for every thread. It is designed to run with low overhead for advanced troubleshooting scenarios and with no overhead during basic usage. Thanks to modern eBPF task iterators, xCapture’s passive sampling mode does not have to inject probes into the execution paths of other tasks. For itself, xCapture uses at least an order of magnitude less CPU time compared to running “top.”
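To give a feel for what passively sampling thread states means, here is a minimal Python sketch that takes one snapshot of every thread's scheduler state by reading /proc. This is an approximation only: it is not xCapture, it does not use eBPF task iterators, and reading /proc is far more expensive than the approach the abstract describes. The function names `parse_stat` and `sample_thread_states` are hypothetical. What the sketch does share with passive sampling is that the observer reads state from the outside and the observed threads run no extra code.

```python
import glob
from collections import Counter


def parse_stat(line):
    """Return (tid, comm, state) from one /proc/<pid>/task/<tid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = line.index("(")
    right = line.rindex(")")
    tid = int(line[:left])
    comm = line[left + 1:right]
    state = line[right + 1:].split()[0]  # e.g. R, S, D
    return tid, comm, state


def sample_thread_states():
    """Count threads per state letter in a single pass over /proc (Linux only)."""
    counts = Counter()
    for path in glob.glob("/proc/[0-9]*/task/[0-9]*/stat"):
        try:
            with open(path) as f:
                _, _, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # thread exited between listing and reading
        counts[state] += 1
    return counts


# Parsing a synthetic stat line with an awkward thread name:
print(parse_stat("4242 (my (db) worker) D 1 4242 4242 0 -1"))
# (4242, 'my (db) worker', 'D')

# One snapshot of the live system; empty on hosts without /proc.
print(sample_thread_states())
```

Calling `sample_thread_states()` in a loop and recording which threads sit in `R` or `D` at each tick gives a crude wall-clock activity profile, which is the kind of question this style of sampling is meant to answer.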

Note: Tanel just announced the beta of xCapture v3 + xtop. At P99 CONF, he’ll launch the production-ready GA version for everyone to use and demonstrate how it helps with real-life troubleshooting scenarios.

Concurrency Testing using Custom Linux Schedulers

Johannes Bechberger, OpenJDK developer at SAP SE, and Jake Hillion, Software Engineer at Meta

Consider a concurrency bug in your service that occurs once in a million machine hours of production runtime. How do you go about debugging it? And what if the interactions with the rest of the system—not just your application—are responsible?

With the new scheduler extensions in the Linux kernel, we can write custom schedulers in eBPF to help tackle this problem. We demonstrate a scheduler designed to make concurrency bugs more likely, increasing their frequency so you can focus profiling and debugging on fewer machines in a shorter timeframe.

In this talk, we show how to implement your own scheduler, how our approach enables whole-machine concurrency testing, and share some of the bugs we’ve found and fixed using this technique.

Note: For background on this project, read Johannes’ blog, Hello eBPF: Concurrency Testing using Custom Linux Schedulers.

Reliability vs Memory Efficiency of eBPF Instrumentation: Why Not Both?

Dom Delnano, Pixie Core Maintainer

Zero-instrumentation eBPF tooling promises engineers they can skip worrying about how to collect data and jump straight to insights. Delivering on that promise is a difficult engineering challenge: eBPF probes and user space processing must remain rock-solid across kernels, library versions and programming languages while adding minimal overhead. An incorrect implementation that serves bad data, or silently drops it, erodes trust; however, a correct but bloated implementation ends up too heavy to run in production – either failure is a deal breaker.

CNCF Pixie, an eBPF-based Kubernetes observability system, was engineered with reliability first, yet at times this reliability comes with memory costs that can interfere with memory availability on the host system.

This talk tells the story of how Pixie was used to profile its own memory use and surface optimization opportunities to reduce its footprint. Throughout this investigation, we identified Pixie’s DWARF-intensive Uprobes as an opportunity to break the reliability vs. efficiency trade-off. The redesign preserved probe stability, slashed memory usage, and opened the door for Pixie to bring its reliability-first mindset to the OpenTelemetry project. Attendees will learn how to leverage pprof directly in their application, design patterns for memory-lean and correct Uprobe instrumentation, and how to balance reliability and efficiency instead of choosing between them.

Extra credit: Read about How Pixie Uses eBPF.