Rust has been a hot topic at P99 CONF since day 1 – literally. From Brian Martin opening the first-ever P99 CONF with Whoops! I Wrote It in Rust!, to Glauber Costa’s enlightening session Rust Is Safe. But Is It Fast?, to Bryan Cantrill’s turbocharged take on Rust, Wright’s Law, and the Future of Low-Latency Systems, Rust has earned and defended its position as a top topic. And given the conference’s focus on low-latency engineering strategies, that’s not surprising.
But what is surprising is the amazing lineup of Rust speakers and topics we’ll be sharing with the community at P99 CONF 2024. In case you’re new to P99 CONF, it’s a free 2-day community event for engineers obsessed with low-latency engineering strategies and performance optimization. It’s intentionally virtual, highly interactive, and purely technical.
Here’s a sneak peek into some of the Rust-focused talks we’ll be featuring – as well as several other (Zig, C++, Go, Java…) talks that might be interesting to even the most resolute Rustaceans.
Rust Tech Talks
There’s a rather wide array of Rust talks on this year’s agenda…
Rust: A productive language for writing database applications
Carl Lerche, Principal Engineer at AWS [and Tokio developer]
When you think about Rust, you might think of performance, safety, and reliability, but what about productivity? Last year, I recommended considering Rust for developing high-level applications. Rust showed great promise, but its library ecosystem needed to mature. What has changed since then? Many higher-level applications sit on top of a database. In this talk, I will explore the current state of Rust libraries for database access, focusing on ergonomics and ease of use—two crucial factors in high-level database application development.
Rust + io_uring + ktls: How Fast Can We Make HTTP?
Amos Wenger, Writer & Video Maker aka @fasterthanlime
I’ve been working on loona, an open-source HTTP/1+2 implementation in async Rust using io_uring and ktls, sponsored by companies like fly.io and Shopify. All existing Rust HTTP implementations have a hard time adopting io_uring, because their IO types and buffer management strategies are fundamentally incompatible with it: their only path forward (besides a comprehensive rewrite) is a series of compromises that would negate io_uring’s performance improvements.
loona is written from the ground up to take advantage of everything io_uring has to offer, and can be run within a tokio current_thread executor. To further reduce system calls (which have gotten more expensive over time, with each hardware-level security mitigation), loona integrates with kTLS (TLS done in-kernel).
In the future, I expect loona to be a great choice for proxy software (including Kubernetes ingress), and possibly even application software if a sufficiently friendly and stable API design emerges. At the time of this submission, loona implements much of HTTP/1 and HTTP/2 correctly, and I’m eager to report the results of performance testing at P99 CONF.
Writing a Kernel in Rust: Code Quality and Performance
Luc Lenôtre, Site Reliability Engineer at Clever Cloud
Maestro is a kernel that started as a small school project. Initially written in C, the project then switched to Rust to improve code quality. The project is currently in a clean-up and performance improvement phase, and this talk summarizes the lessons learned from it.
Latency, Throughput & Fault Tolerance: Designing the Arroyo Streaming Engine
Micah Wylde, Co-founder at Arroyo
Arroyo is a distributed, stateful stream processing engine written in Rust. It combines predictable millisecond-latency processing with the throughput of a high-performance batch query engine—on top of a distributed checkpointing implementation that provides fault tolerance and exactly-once processing.
These design goals are often in tension: increasing throughput generally comes at the expense of latency, and consistent checkpointing can introduce periodic latency spikes while we wait for alignment and IO.
In this talk, I will cover the distributed architecture and implementation of Arroyo including the core Arrow-based dataflow engine, algorithms for stateful windowing and aggregates, and the Chandy-Lamport inspired distributed checkpointing system.
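To make the checkpointing/latency tension above concrete, here is a minimal, illustrative sketch of barrier alignment in Chandy-Lamport-style checkpointing (all names are ours, not Arroyo’s actual API). A two-input operator must buffer records from an input whose barrier has already arrived until the barrier shows up on the other input, then snapshot its state – that buffering wait is exactly the periodic latency spike the abstract mentions.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone)]
enum Event {
    Record(i64),
    Barrier(u32), // checkpoint epoch
}

struct Operator {
    state: i64,              // running sum: the state we snapshot
    blocked: [bool; 2],      // which inputs are already past the barrier
    buffered: VecDeque<i64>, // records held back during alignment
    snapshots: Vec<i64>,
}

impl Operator {
    fn new() -> Self {
        Operator {
            state: 0,
            blocked: [false; 2],
            buffered: VecDeque::new(),
            snapshots: Vec::new(),
        }
    }

    fn on_event(&mut self, input: usize, ev: Event) {
        match ev {
            Event::Record(v) => {
                if self.blocked[input] {
                    // This input is past the barrier: hold the record so
                    // it counts toward the *next* checkpoint's state.
                    self.buffered.push_back(v);
                } else {
                    self.state += v;
                }
            }
            Event::Barrier(_epoch) => {
                self.blocked[input] = true;
                if self.blocked.iter().all(|b| *b) {
                    // All inputs aligned: snapshot, then drain the buffer.
                    self.snapshots.push(self.state);
                    self.blocked = [false; 2];
                    while let Some(v) = self.buffered.pop_front() {
                        self.state += v;
                    }
                }
            }
        }
    }
}

fn main() {
    let mut op = Operator::new();
    op.on_event(0, Event::Record(1));
    op.on_event(1, Event::Record(2));
    op.on_event(0, Event::Barrier(1));
    op.on_event(0, Event::Record(10)); // buffered: input 0 is past the barrier
    op.on_event(1, Event::Barrier(1)); // alignment complete -> snapshot
    println!("snapshots: {:?}, state: {}", op.snapshots, op.state);
    // snapshots: [3], state: 13 -- the snapshot excludes the buffered 10
}
```

The longer input 1’s barrier takes to arrive, the more records from input 0 pile up in the buffer – which is why consistent checkpointing and low tail latency pull in opposite directions.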
The Performance Engineer’s Toolkit: A Case Study on Data Analytics with Rust
Will Crichton, Assistant Professor at Brown University
I optimized a Python data analytics pipeline in my research to make it 180,000x faster using Rust. This speedup spanned the gamut of performance techniques: compiler optimizations, data structure selection, vectorization, parallelization, and more. In this talk, I will use this case study to explain each technique, and give you a better sense of the tools in a performance engineer’s toolkit.
Performance Pitfalls of Rust Async Function Pointers (And Why It Might Not Matter)
Byron Wasti, Founder of Balter Load Testing
An in-depth analysis of asynchronous function pointers in Rust: why they aren’t a real thing (compared to normal function pointers), and a performance analysis of each way of constructing them – from boxed async functions, to enum dispatch, to StackFutures.
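For readers new to the problem: each `async fn` returns its own anonymous `Future` type, so two async functions can’t share one plain `fn` pointer type the way ordinary functions can. Below is a minimal sketch (our own names, not the talk’s code) of two of the workarounds the abstract mentions – boxing the future behind a uniform trait-object type, and enum dispatch – with a tiny one-shot poller standing in for a real executor:

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

// Each `async fn` returns a distinct opaque Future type, so these two
// cannot directly share a plain `fn() -> F` pointer type.
async fn fetch_a() -> u32 { 1 }
async fn fetch_b() -> u32 { 2 }

// Strategy 1: boxing. A uniform callable type, at the cost of a heap
// allocation and dynamic dispatch per call.
type BoxedAsyncFn = fn() -> Pin<Box<dyn Future<Output = u32>>>;

fn boxed_a() -> Pin<Box<dyn Future<Output = u32>>> { Box::pin(fetch_a()) }
fn boxed_b() -> Pin<Box<dyn Future<Output = u32>>> { Box::pin(fetch_b()) }

// Strategy 2: enum dispatch. No allocation; the enum is as large as its
// largest variant, and selection is a `match` rather than an indirect call.
enum Job { A, B }

impl Job {
    async fn run(self) -> u32 {
        match self {
            Job::A => fetch_a().await,
            Job::B => fetch_b().await,
        }
    }
}

// Minimal helper that polls a future once with a no-op waker; enough for
// these immediately-ready futures (a stand-in for a real executor).
fn poll_once<F: Future>(fut: F) -> F::Output {
    fn clone(p: *const ()) -> RawWaker { RawWaker::new(p, &VTABLE) }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    let waker = unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = std::pin::pin!(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(v) => v,
        Poll::Pending => panic!("future was not immediately ready"),
    }
}

fn main() {
    // The boxed variants fit in one homogeneous table of function pointers.
    let table: [BoxedAsyncFn; 2] = [boxed_a, boxed_b];
    let results: Vec<u32> = table.iter().map(|f| poll_once(f())).collect();
    println!("{:?}", results); // [1, 2]
    println!("{}", poll_once(Job::B.run())); // 2
}
```

The trade-off the talk benchmarks is visible even in this sketch: boxing gives an open-ended, uniform type but allocates per call, while enum dispatch is allocation-free but closed over a fixed set of variants.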
Low-Latency Mesh Services Using Actors
Nikita Lapkov, Senior Rust Engineer
The talk will be about how we transformed our actor system called elfo (https://github.com/elfo-rs/elfo) into a distributed mesh of services.
Elfo started out as an async Rust actor system, where all actors lived on a single node. It was created to serve the extremely I/O-heavy workloads of the high-frequency trading industry, with a focus on developer ergonomics and performance. As the trading business grew, a single-node deployment no longer satisfied the latency requirements when connecting to different exchanges. From that, the need for distributed deployment arose.
The way messages are delivered is opaque to actors, since they use the API provided by elfo for that. All messages are also defined as Rust structs, which we have complete control over. This means that if we “just” implemented a network transport for delivering messages, two actors living on different nodes could talk to each other as if they were on the same node.
The reality is, of course, not so simple. The talk will dive deep into how we chose multiple formats for message serialisation, implemented message compression while balancing compression ratio against latency, and implemented back-pressure to keep fast actors from overwhelming slow ones. The talk will also cover how we leverage total control of the transport to make everything observable and debuggable.
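The location-transparency idea above can be sketched in a few lines of std-only Rust (all names here are illustrative, not elfo’s actual API): an actor’s “address” hides the delivery mechanism, so a networked transport could later be swapped in behind the same interface without the actors noticing.

```rust
use std::sync::mpsc;
use std::thread;

// Messages are plain Rust structs/enums the system fully controls, so the
// same types could later be serialised onto a network transport.
#[derive(Debug)]
enum Msg {
    Ping(u64),
    Shutdown,
}

// An "address" hides whether the peer is local or remote: here it wraps a
// channel sender, but a socket could sit behind the same interface.
#[derive(Clone)]
struct Addr {
    tx: mpsc::Sender<Msg>,
}

impl Addr {
    fn send(&self, msg: Msg) {
        // A real transport would apply back-pressure here instead of
        // buffering without bound.
        self.tx.send(msg).expect("actor mailbox closed");
    }
}

// Spawn an actor that sums the Pings it receives until told to shut down.
fn spawn_actor() -> (Addr, thread::JoinHandle<u64>) {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        let mut pings = 0;
        for msg in rx {
            match msg {
                Msg::Ping(n) => pings += n,
                Msg::Shutdown => break,
            }
        }
        pings
    });
    (Addr { tx }, handle)
}

fn main() {
    let (addr, handle) = spawn_actor();
    addr.send(Msg::Ping(1));
    addr.send(Msg::Ping(2));
    addr.send(Msg::Shutdown);
    println!("{}", handle.join().unwrap()); // 3
}
```

The hard parts the talk covers – serialisation formats, compression-versus-latency trade-offs, back-pressure – all live behind that `send` call, invisible to the actor code itself.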
WebAssembly on the Edge: Sandboxing AND Performance
Brian Sletten (Consultant and Author of WebAssembly: the Definitive Guide) and Ramnivas Laddad (Co-founder, Exograph and Author of AspectJ in Action)
Moving applications to the Edge often complicates conventional performance techniques due to security constraints. Based on actual experiences moving Exograph-based (https://exograph.dev) applications into edge computing environments, we will highlight some strategies for improving performance within the limitations of sandboxed WebAssembly-based environments. This will include discussion of how WASI advancements and the new component model can assist in balancing these two goals, which are often at odds.
But Maybe Rust Isn’t Always the Answer?
Of course, Rust isn’t the only option for low-latency programming. P99 CONF will also feature sessions on rising Rust contender, Zig, as well as low-latency C++, Go, and even Java. For example:
- 1BRC – Nerd Sniping the Java Community (Gunnar Morling): Some of the tricks employed by the fastest solutions to the One Billion Row Challenge (#1BRC) that went viral earlier this year
- One Billion Row Challenge in Golang (Shraddha Agrawal): Solving the #1BRC in Go – using Go’s performance tools to reduce the execution time of processing a 16GB file from 6 mins to ~12 sec
- Just In Time LSM Compaction (Aleksei Kladov): A deep dive into (Zig-based) TigerBeetle’s compaction algorithm – “garbage collection” for LSM trees
- Speed by the Numbers: Text Encoding in C and C++ (JeanHeyd Meneide): Let’s see how we can have a generic, powerful text encoding library in languages like C and C++, WITHOUT losing the performance of a highly specialized library.
Even crabby Rustaceans might want to attend these sessions for a bit of discovery, not to mention friendly debate. 😉