P99 CONF community, we know you hate delays. Let’s start with the key points:
- P99 CONF will occur October 23 and 24 – free and virtual, as always
- You won’t believe the lineup
- You can register now
Now let’s back up and unpack some of the details.
What is P99 CONF all about?
In one word, performance. Latency-obsessed engineers from around the world come together for P99 CONF to share their latest experiments, optimizations, ideas, and lessons learned. It’s purely technical, intentionally virtual, and highly interactive. In the open source spirit, the event is collaborative and free. And registration just opened.
ScyllaDB created P99 CONF in 2021 to connect and foster the community of technologists who obsess over low-latency engineering. Each year, we’ve worked with that community to feature new perspectives, cover a broader spectrum of performance topics, and even try out some new session formats – like last year’s two wildly popular live panels:
- Bryan Cantrill & Friends on Corporate Open Source Antipatterns
- Bun, Tokio, Turso Creators on Rust vs Zig
But just like the P99 CONF community, we’re obsessed with continually optimizing. And that brings us to P99 CONF 2024.
What can you expect for P99 CONF 2024?
The agenda is still being finalized, but we hope you’ll share our excitement about the speakers we just announced:
- Michael Stonebraker, Postgres creator and MIT Professor
- Gunnar Morling, Decodable engineer and creator of the One Billion Row Challenge
- Andy Pavlo, CMU professor and co-founder of OtterTune
- Avi Kivity, KVM creator, ScyllaDB co-founder and CTO
- Amos Wenger, the human behind the fasterthanlime blog
- Ashley Williams, Axo founder/CEO, Rust core team, Rust Foundation founder
- Pekka Enberg, Turso CTO, ScyllaDB contributor, and “Latency” book author
- Carl Lerche, Tokio creator, Rust contributor, and engineer at AWS
- Liz Rice, Chief open source officer with eBPF specialists Isovalent
- Bryan Cantrill, Co-founder and CTO of Oxide Computer
All together – and free. Did you register yet?
You can also look forward to engineering talks on performance optimizations at Shopify, Lyft, Uber, Disney/Hulu, Netflix, Turo, ShareChat, Zoo.dev, Datadog, Grafana, TigerBeetle, ScyllaDB, and more.
There will be parallel tracks of sessions covering topics such as:
- Database optimizations
- Rust, Zig, Go, C, C++, Wasm
- eBPF, io_uring, kernel
- AI/ML and LLMs
- Kubernetes
- Observability
- Cloud infrastructure
3 Session Sneak Peeks: Amos Wenger, Andy Pavlo, Michael Stonebraker
We’ll be announcing more speakers and revealing talk topics soon. But here’s a sneak peek at 3 sessions that promise to set the chat on fire.
Amos Wenger: Rust + io_uring + ktls: how fast can we make HTTP?
I’ve been working on fluke, an open source HTTP/1 and HTTP/2 implementation in async Rust built on io_uring and kTLS, sponsored by companies like fly.io and Shopify.
All existing Rust HTTP implementations have a hard time adopting io_uring because their I/O types and buffer management strategies are fundamentally incompatible with it: their only path forward (besides a comprehensive rewrite) is a series of compromises that would negate io_uring’s performance improvements.
fluke is written from the ground up to take advantage of everything io_uring has to offer, and it can be run within a tokio current_thread executor. To further reduce system calls (which have gotten more expensive over time, with each new hardware-level security mitigation), fluke integrates with kTLS (TLS done in-kernel).
In the future, I expect fluke to be a great choice for proxy software (including Kubernetes ingress), and possibly even application software if a sufficiently friendly & stable API design emerges.
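A quick aside for readers who haven’t run tokio in single-threaded mode: below is a minimal sketch of the kind of current_thread runtime the abstract refers to, assuming the tokio crate with its full feature set. It uses tokio’s ordinary networking types and a placeholder echo loop rather than fluke’s io_uring/kTLS I/O path, which is precisely the part the talk covers.

```rust
use tokio::io::{AsyncReadExt, AsyncWriteExt};
use tokio::net::TcpListener;

// Accept connections and echo bytes back. A real server would hand each
// socket to an HTTP implementation; the echo loop is just a placeholder.
async fn serve() -> std::io::Result<()> {
    let listener = TcpListener::bind("127.0.0.1:8080").await?;
    loop {
        let (mut socket, _peer) = listener.accept().await?;
        tokio::spawn(async move {
            let mut buf = [0u8; 4096];
            loop {
                match socket.read(&mut buf).await {
                    Ok(0) | Err(_) => break, // connection closed or errored
                    Ok(n) => {
                        if socket.write_all(&buf[..n]).await.is_err() {
                            break;
                        }
                    }
                }
            }
        });
    }
}

fn main() -> std::io::Result<()> {
    // current_thread: every task runs cooperatively on this one thread,
    // with no work-stealing thread pool behind it.
    let rt = tokio::runtime::Builder::new_current_thread()
        .enable_all()
        .build()?;
    rt.block_on(serve())
}
```

Thread-per-core setups like this keep cross-thread synchronization off the hot path, which is one reason latency-focused servers favor them.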
Andy Pavlo: The Next Chapter in the Sordid Love/Hate Relationship Between Databases and Operating Systems
Database management systems (DBMSs) are beautiful, free-spirited software that want nothing more than to help users store and access data as quickly as possible. To achieve this goal, DBMSs have spent decades trying to avoid operating systems (OSs) at all costs. Such avoidance is necessary because OSs always try to impose their will on DBMSs and stifle their ambitions through disingenuous syscall semantics, unscalable kernel-level data structures, and excessive data copying.
The many attempts to avoid the OS through kernel-bypass methods or custom hardware have such high engineering/R&D costs that few DBMSs support them. In the end, DBMSs are stuck in an abusive relationship: they need the OS to run their software and provide them with basic functionalities (e.g., memory allocation), but they do not like how the OS treats them. However, new technologies like eBPF, which allow DBMSs to run custom code safely inside the OS kernel to override its functionality, are poised to upend this power struggle.
In this talk, I will present a new design approach called “user-bypass” for building high-performance database systems and services with eBPF. I will discuss recent developments in eBPF relevant to the DBMS community and what parts of a DBMS are most amenable to using it. We will also present the design of BPF-DB, an embedded DBMS written in eBPF that provides ACID transactions over multi-versioned data and runs entirely in the Linux kernel.
Michael Stonebraker: You’re Doing it All Wrong
In this talk, we consider business data processing applications, which have historically been written for a three-tier architecture. Two ideas totally upset this applecart.
Idea #1: The Cloud
All enterprises are moving everything possible to the cloud as quickly as possible. In this new environment, you are highly encouraged to use a cloud-native architecture, whereby your system is composed of distributed functions, working in parallel and running on a serverless (and stateless) platform like AWS Lambda or Azure Functions. You program your application as a workflow of “steps.” To make systems resilient to failures, you need a separate state machine and workflow manager (e.g., AWS Step Functions or Airflow). If you use this architecture, you don’t pay for resources when your application is idle, which is often a major benefit. Depending on the platform, you may also get automatic resource elasticity and load balancing, which are additional major benefits.
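To make the “workflow of steps” model concrete, here is a minimal sketch of what a single serverless step can look like in Rust, using the lambda_runtime crate. The input/output types and the charging logic are hypothetical placeholders; in this architecture, the separate workflow manager (e.g., AWS Step Functions) is what sequences and retries steps like this one.

```rust
use lambda_runtime::{service_fn, Error, LambdaEvent};
use serde::{Deserialize, Serialize};

// Hypothetical payload types for one step of an order-processing workflow.
#[derive(Deserialize)]
struct StepInput {
    order_id: String,
}

#[derive(Serialize)]
struct StepOutput {
    order_id: String,
    charged: bool,
}

// One stateless unit of work. The workflow manager decides what runs next
// and retries this function if it fails.
async fn charge_card(event: LambdaEvent<StepInput>) -> Result<StepOutput, Error> {
    let input = event.payload;
    // ... call the payment provider here ...
    Ok(StepOutput {
        order_id: input.order_id,
        charged: true,
    })
}

#[tokio::main]
async fn main() -> Result<(), Error> {
    lambda_runtime::run(service_fn(charge_card)).await
}
```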
Idea #2: Leverage the DBMS
Obviously, your data belongs in a DBMS. However, by extension, so does the state of your application. Keeping application state in the DBMS lets you provide once-and-only-once execution semantics for your workflow. You can also use the database concept of “sagas” to ensure that multi-transaction applications run to completion or not at all.
Furthermore, to go an order of magnitude faster than AWS Lambda, you need to collocate your application and the DBMS. The fastest alternative is to run your application inside the DBMS using stored procedures (SPs). However, it is imperative to overcome SP weaknesses, specifically the requirement of a different language (e.g., PL/SQL) and the absence of a debugging environment. The latter can be accomplished by persisting the database log and allowing “time travel debugging” for SPs. The former can be supported by coding SPs in a conventional language such as TypeScript.
Extending this idea to the operating environment, one can time travel the entire system, thereby allowing recovery to a previous point in time when disasters happen (errant programs, adversary intrusions, ransomware, etc.).
I will discuss one such platform (DBOS) with all of the above features. In my opinion, this is an example of why “you are doing it all wrong”.
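If the “once-and-only-once execution” claim above feels abstract, here is a minimal sketch of the underlying trick, illustrated with SQLite through the rusqlite crate rather than any particular platform. Each workflow step commits its effects and a “step completed” marker in the same database transaction, so a retry after a crash can see that the step already ran. The step_log table and the step names are hypothetical.

```rust
use rusqlite::{params, Connection, Result};

// Run one workflow step at most once per workflow execution: the step's
// effects and its completion marker commit in the same transaction, so a
// retried step either sees the marker or redoes all of its work.
fn run_step_once(conn: &mut Connection, workflow_id: &str, step: &str) -> Result<()> {
    let tx = conn.transaction()?;

    let already_done: i64 = tx.query_row(
        "SELECT COUNT(*) FROM step_log WHERE workflow_id = ?1 AND step = ?2",
        params![workflow_id, step],
        |row| row.get(0),
    )?;
    if already_done > 0 {
        return Ok(()); // step already committed earlier; nothing to redo
    }

    // ... the step's real work against application tables goes here,
    //     inside the same transaction ...

    tx.execute(
        "INSERT INTO step_log (workflow_id, step) VALUES (?1, ?2)",
        params![workflow_id, step],
    )?;
    tx.commit()
}

fn main() -> Result<()> {
    let mut conn = Connection::open_in_memory()?;
    conn.execute("CREATE TABLE step_log (workflow_id TEXT, step TEXT)", [])?;

    run_step_once(&mut conn, "order-42", "charge-card")?;
    run_step_once(&mut conn, "order-42", "charge-card")?; // second call is a no-op
    Ok(())
}
```

A saga extends the same bookkeeping across multiple transactions by also recording compensating actions to run if a later step fails.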
***
Don’t miss the spicy debate when the conference kicks off on October 23 and 24!