P99 CONF 2023 is now a wrap! You can (re)watch all videos and access the decks now.
P99 CONF is a (free + online) highly technical conference for engineers who obsess over P99 percentiles and long-tail latencies. The open source, community-focused event is hosted by ScyllaDB, the company behind the monstrously fast and scalable NoSQL database (and the adorable one-eyed sea monster).
Since database performance is so near and dear to us at ScyllaDB, we quite eagerly reached out to our friends and colleagues across the community to ensure a wide spectrum of distributed data systems, approaches, and challenges would be represented at P99 CONF. This year’s agenda covers SQL and NoSQL, ORMs, tuning, infrastructure, event-driven architectures, edge DBs, AI/ML feature stores, drivers, benchmarking, tracing, Raft, tablets, and much more.
If you share our obsession with high-performance low-latency data systems, here’s a rundown of sessions to consider watching at P99 CONF 2023.
A Deterministic Walk Down TigerBeetle’s main() Street
Aleksei Kladov (TigerBeetle)
Dive into how TigerBeetle used Zig to implement a fully deterministic distributed system that will never fail with an out of memory error, for predictable performance and 700x faster tests!
Ingesting in Rust
Armin Ronacher (Sentry)
Hear about building a Rust-based ingestion service that handles hundreds of thousands of events per second with low latency globally.
Taming P99 Latencies at Lyft: Tuning Low-Latency Online Feature Stores
Bhanu Renukuntla (Lyft)
Explore the challenges and strategies of tuning low-latency online feature stores to tame P99 latencies, shedding light on the importance of choosing the right data model.
The History of Tracing Oracle
Cary Millsap (Method R Corporation)
Delve into the history of tracing Oracle, why it has been overlooked despite its usefulness, and examples of how Oracle traces can help improve performance across your whole technology stack.
Building Low Latency ML Systems for Real-Time Model Predictions at Xandr
Chinmay Abhay Nerurkar and Moussa Taifi (Microsoft)
Learn about the challenges of building an ML system with the low latency required to support the high-volume, high-throughput demands of ad serving.
Cost-Effective Burst Scaling For Distributed Query Execution
Dan Harris (Coralogix)
A case study in building a distributed execution model that can dynamically execute across both AWS Lambda and EC2 resources – shedding excess load to Lambda functions to preserve low latency while scaling EC2 capacity to manage costs.
Cache Me If You Can: How Grafana Labs Scaled Up Their Memcached 42x & Cut Costs Too
Danny Kopping (Grafana Labs)
How Grafana Labs managed to increase their cache size by 42x and reduce costs by using a little-known feature of memcached called “extstore”.
Segment-Based Storage vs. Partition-Based Storage: Which is Better for Real-Time Data Streaming?
David Kjerrumgaard (StreamNative)
Explore key differences between segment-based and partition-based storage models (including how data is organized, stored, and accessed) with an eye toward what’s best for real-time data streaming system performance, scalability, and resiliency.
Demanding the Impossible: Rigorous Database Benchmarking
Dmitrii Dolgov (Red Hat)
An analysis of how to design an effective database benchmark, including selecting a mode, overcoming technical challenges, and analyzing the results (using PostgreSQL as an example).
Quantifying the Performance Impact of Shard-per-core Architecture
Dor Laor (ScyllaDB)
Most software isn’t architected to take advantage of modern hardware. How does a shard-per-core and shared-nothing architecture help – and exactly what impact can it make? Dor will examine technical opportunities and tradeoffs, as well as disclose the results of a new benchmark study.
Writing Low Latency Database Applications Even if Your Code Sucks
Glauber Costa (Turso)
How – by putting data close to its users – you can save hundreds of milliseconds and still be faster than the most optimized code … even if your code sucks.
ORM is Bad, But is There an Alternative?
Henrietta Dombrovskaya (DRW)
Why optimizing the application/database interaction is important and how the No-ORM framework provides an escape from common ORM pitfalls while maintaining the ease of use of ORMs.
The Art of Event Driven Observability with OpenTelemetry
Henrik Rexed (Dynatrace)
Explore the various components of OpenTelemetry, examples of unhelpful traces from event-driven architectures, and the purpose and usage of span links in event-driven architectures.
From 1M to 1B Features Per Second: Scaling ShareChat’s ML Feature Store
Ivan Burmistrov and Andrei Manakov (ShareChat)
A case study in building a low-latency ML feature store (using ScyllaDB, Golang, and Flink) that handles 1B features per second, including caching strategies and data modeling tips for performance and scalability.
Distributed System Performance Troubleshooting Like You’ve Been Doing It for 20 Years
Jon Haddad (Rustyrazorblade Consulting)
Discover how to go about diagnosing performance problems in complex distributed systems and learn the tools and processes for getting to the bottom of any issue, quickly – even when it’s one of the biggest distributed database deployments on the planet.
Low-Latency Data Access: The Required Synergy Between Memory & Disk
Kriti Kathuria (University of Waterloo)
Learn about the synergy required between memory and disk to achieve efficient data processing and the general techniques that databases use for efficient data storage and retrieval.
Square’s Lessons Learned from Implementing a Key-Value Store with Raft
Omar Elgabry (Square)
The micro-lessons engineers can learn from Square’s experience building fault-tolerant, strongly consistent distributed systems using Raft.
MySQL Performance on Modern CPUs: Intel vs AMD vs ARM
Peter Zaitsev (Percona)
Look into the current CPU choices through a MySQL lens: which CPUs provide the best performance for single-threaded and high-concurrency workloads and which help to achieve the best price/performance.
Conquering Load Balancing: Experiences from ScyllaDB Drivers
Piotr Grabowski (ScyllaDB)
Get insight into the intricacies of load balancing within ScyllaDB drivers, as Piotr shares how we employed the Power of Two Choices algorithm, optimized the load balancing implementation in the Rust driver, and more.
5 Hours to 7.7 Seconds: How Database Tricks Sped up Rust Linting Over 2000X
Predrag Gruevski (Trustfall)
A case study about using database ideas to build a linter that looks for breaking changes in Rust library APIs.
Adventures in Thread-per-Core Async with Redpanda and Seastar
Travis Downs (Redpanda)
A look at the practical experience of building high-performance systems with C++20 in an asynchronous runtime, the unexpected simplicity that can come from strictly mapping data to cores, and the challenges & tradeoffs in adopting a thread-per-core architecture.
Automatically Sharding and Scaling-out Databases on Kubernetes
Trista Pan (SphereEx)
New ways to create a distributed, sharded database system based on your existing monolithic databases – without exacerbating data management, auto-scaling, and query performance issues.
Mitigating the Impact of State Management in Cloud Stream Processing Systems
Yingjun Wu (RisingWave Labs)
How RisingWave Labs is addressing the high-latency issues associated with S3 storage in stream processing systems that employ a decoupled compute and storage architecture.
Unconventional Methods to Identify Bottlenecks in Low-Latency and High-Throughput Data Pipelines
Zamir Paltiel (Hyperspace)
Standard profiling and monitoring methods may fall short in identifying bottlenecks in low-latency data ingestion workflows; discover unconventional techniques to apply instead.