Low-Latency Distributed Data Strategies Featured at P99 CONF 23

P99 CONF 2023 is a wrap! You can now (re)watch all videos and access the decks.

P99 CONF is a (free + online) highly technical conference for engineers who obsess over P99 percentiles and long-tail latencies. The open source, community-focused event is hosted by ScyllaDB, the company behind the monstrously fast and scalable NoSQL database (and the adorable one-eyed sea monster).

Since database performance is so near and dear to us at ScyllaDB, we quite eagerly reached out to our friends and colleagues across the community to ensure a wide spectrum of distributed data systems, approaches, and challenges would be represented at P99 CONF. This year’s agenda covers SQL and NoSQL, ORMs, tuning, infrastructure, event-driven architectures, edge DBs, AI/ML feature stores, drivers, benchmarking, tracing, Raft, tablets, and much more.

If you share our obsession with high-performance low-latency data systems, here’s a rundown of sessions to consider watching at P99 CONF 2023.

A Deterministic Walk Down TigerBeetle’s main() Street

Aleksei Kladov (TigerBeetle)

Dive into how TigerBeetle used Zig to implement a fully deterministic distributed system that will never fail with an out-of-memory error, for predictable performance and 700x faster tests!

Ingesting in Rust

Armin Ronacher (Sentry)

Hear about building a Rust-based ingestion service that handles hundreds of thousands of events per second with low latency globally.

Taming P99 Latencies at Lyft: Tuning Low-Latency Online Feature Stores

Bhanu Renukuntla (Lyft)

Explore the challenges and strategies of tuning low-latency online feature stores to tame P99 latencies, shedding light on the importance of choosing the right data model.

The History of Tracing Oracle

Cary Millsap (Method R Corporation)

Delve into the history of tracing Oracle, why it has been overlooked despite its usefulness, and examples of how Oracle traces can help improve performance across your whole technology stack.

Building Low Latency ML Systems for Real-Time Model Predictions at Xandr

Chinmay Abhay Nerurkar and Moussa Taifi (Microsoft)

Learn about the challenges of building an ML system with the low latency required to support the high volume and high throughput demands of ad serving.

Cost-Effective Burst Scaling For Distributed Query Execution

Dan Harris (Coralogix)

A case study in building a distributed execution model that can dynamically execute across both AWS Lambda and EC2 resources – shedding excess load to Lambda functions to preserve low latency while scaling EC2 capacity to manage costs.

Cache Me If You Can: How Grafana Labs Scaled Up Their Memcached 42x & Cut Costs Too

Danny Kopping (Grafana Labs)

How Grafana Labs managed to increase their cache size by 42x and reduce costs by using a little-known feature of memcached called “extstore”.

Segment-Based Storage vs. Partition-Based Storage: Which is Better for Real-Time Data Streaming?

David Kjerrumgaard (StreamNative)

Explore key differences between segment-based and partition-based storage models (including how data is organized, stored, and accessed) with an eye toward what’s best for real-time data streaming system performance, scalability, and resiliency.

Demanding the Impossible: Rigorous Database Benchmarking

Dmitrii Dolgov (Red Hat)

An analysis of how to design an effective database benchmark, including selecting a mode, overcoming technical challenges, and analyzing the results (using PostgreSQL as an example).

Quantifying the Performance Impact of Shard-per-core Architecture

Dor Laor (ScyllaDB)

Most software isn’t architected to take advantage of modern hardware. How does a shard-per-core and shared-nothing architecture help – and exactly what impact can it make? Dor will examine technical opportunities and tradeoffs, as well as disclose the results of a new benchmark study.

Writing Low Latency Database Applications Even if Your Code Sucks

Glauber Costa (Turso)

How – by putting data close to its users – you can save hundreds of milliseconds and still be faster than the most optimized code … even if your code sucks.

ORM is Bad, But is There an Alternative?

Henrietta Dombrovskaya (DRW)

Why optimizing the application/database interaction is important and how the No-ORM framework provides an escape from common ORM pitfalls while maintaining the ease of use of an ORM.

The Art of Event Driven Observability with OpenTelemetry

Henrik Rexed (Dynatrace)

Explore the various components of OpenTelemetry, examples of unhelpful traces from event-driven architectures, and the purpose and usage of span links in event-driven architectures.

From 1M to 1B Features Per Second: Scaling ShareChat’s ML Feature Store

Ivan Burmistrov and Andrei Manakov (ShareChat)

A case study in building a low-latency ML feature store (using ScyllaDB, Golang, and Flink) that handles 1B features per second, including caching strategies and data modeling tips for performance and scalability.

Distributed System Performance Troubleshooting Like You’ve Been Doing It for 20 Years

Jon Haddad (Rustyrazorblade Consulting)

Discover how to go about diagnosing performance problems in complex distributed systems and learn the tools and processes for getting to the bottom of any issue, quickly – even when it’s one of the biggest distributed database deployments on the planet.

Low-Latency Data Access: The Required Synergy Between Memory & Disk

Kriti Kathuria (University of Waterloo)

Learn about the synergy required between memory and disk to achieve efficient data processing and the general techniques that databases use for efficient data storage and retrieval.

Square’s Lessons Learned from Implementing a Key-Value Store with Raft

Omar Elgabry (Square)

The micro-lessons engineers can learn from Square’s experience building fault-tolerant, strongly consistent distributed systems using Raft.

MySQL Performance on Modern CPUs: Intel vs AMD vs ARM

Peter Zaitsev (Percona)

Look into the current CPU choices through a MySQL lens: which CPUs provide the best performance for single-threaded and high-concurrency workloads and which help to achieve the best price/performance.

Conquering Load Balancing: Experiences from ScyllaDB Drivers

Piotr Grabowski (ScyllaDB)

Get insight into the intricacies of load balancing within ScyllaDB drivers, with Piotr sharing how we employed the Power of Two Choices algorithm, optimized the implementation of load balancing in the Rust driver, and more.
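
For readers who haven't met the Power of Two Choices before, here is a minimal Python sketch of the general idea: sample two candidate nodes at random and send the request to the less loaded one. This is not the ScyllaDB driver implementation. The names `pick_node` and `simulate`, the node labels, and the use of in-flight request counts as the load signal are all assumptions made for illustration.

```python
import random


def pick_node(in_flight, rng=random):
    """Pick a node using the Power of Two Choices.

    `in_flight` maps a (hypothetical) node name to its number of
    outstanding requests. Two distinct nodes are sampled at random
    and the one with fewer outstanding requests is returned.
    """
    if len(in_flight) < 2:
        # With a single node there is nothing to compare.
        return next(iter(in_flight))
    a, b = rng.sample(sorted(in_flight), 2)
    return a if in_flight[a] <= in_flight[b] else b


def simulate(num_nodes=10, num_requests=10_000, seed=0):
    """Assign requests one by one and return the final per-node load.

    Requests never complete in this toy model, so the counters only
    grow; that is enough to see how evenly the load spreads.
    """
    rng = random.Random(seed)
    in_flight = {f"node-{i}": 0 for i in range(num_nodes)}
    for _ in range(num_requests):
        in_flight[pick_node(in_flight, rng)] += 1
    return in_flight


if __name__ == "__main__":
    loads = simulate()
    print("min load:", min(loads.values()), "max load:", max(loads.values()))
```

Comparing just two random candidates avoids scanning every node on each request, yet it tends to keep the gap between the busiest and idlest node far smaller than purely random assignment would.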

5 Hours to 7.7 Seconds: How Database Tricks Sped up Rust Linting Over 2000X

Predrag Gruevski (Trustfall)

A case study about using database ideas to build a linter that looks for breaking changes in Rust library APIs.

Adventures in Thread-per-Core Async with Redpanda and Seastar

Travis Downs (Redpanda)

A look at the practical experience of building high-performance systems with C++20 in an asynchronous runtime, the unexpected simplicity that can come from strictly mapping data to cores, and the challenges and tradeoffs in adopting a thread-per-core architecture.

Automatically Sharding and Scaling-out Databases on Kubernetes

Trista Pan (SphereEx)

New ways to create a distributed, sharded database system based on your existing monolithic databases – without exacerbating data management, auto-scaling, and query performance issues.

Mitigating the Impact of State Management in Cloud Stream Processing Systems

Yingjun Wu (RisingWave Labs)

How RisingWave Labs is addressing the high-latency issues associated with S3 storage in stream processing systems that employ a decoupled compute and storage architecture.

Unconventional Methods to Identify Bottlenecks in Low-Latency and High-Throughput Data Pipelines

Zamir Paltiel (Hyperspace)

Standard profiling and monitoring methods may fall short in identifying bottlenecks in low-latency data ingestion workflows; discover unconventional techniques to apply instead.
