SESSION ON-DEMAND

All Things P99

The event for developers who care about P99 percentiles and high-performance, low-latency applications

Scaling to 6.6M Read OPS with ScyllaDB on Kubernetes: Achieving Sub-2ms Latency and Robust Recovery

Learn how we achieved 6.6M read OPS with sub-2ms latency on a single ScyllaDB cluster in Kubernetes, optimizing machine types, shard-aware routing, and backup/recovery. We’ll cover how shard-aware drivers and ScyllaDB’s shard-per-core model cut latency to ~900 µs, and how we tuned machine types across Intel, AMD, and Google Axion hardware. The talk also details our GKE deployment with CPU pinning, host networking, and NVMe storage.
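To make the shard-aware idea concrete, here is a minimal Python sketch of how a client might map a partition token to the CPU core (shard) that owns it, so a request can go straight to that core's connection instead of being forwarded between cores. It is modelled on the token-to-shard scheme described in ScyllaDB's protocol extension documentation, not taken from the talk. The function names, the `ignore_msb` default of 12, and the `connections_by_shard` list are assumptions made for illustration; a real driver reads the shard count and the ignored-bits value from the node when it connects.

```python
# Illustrative sketch only, not the speaker's code and not a driver API.
# Assumes `token` is a signed 64-bit Murmur3 partition token.
# `nr_shards` and `ignore_msb` are hypothetical inputs standing in for the
# values a node would advertise to a shard-aware driver at connect time.

MASK64 = (1 << 64) - 1  # emulate unsigned 64-bit arithmetic


def shard_for_token(token: int, nr_shards: int, ignore_msb: int = 12) -> int:
    """Return the index of the shard (CPU core) that owns `token`."""
    biased = (token + (1 << 63)) & MASK64     # move signed range to unsigned
    biased = (biased << ignore_msb) & MASK64  # discard the top ignore_msb bits
    return (biased * nr_shards) >> 64         # scale into [0, nr_shards)


def pick_connection(token: int, connections_by_shard: list):
    """Choose the connection bound to the shard that owns `token`.

    `connections_by_shard` is a hypothetical list with one entry per shard.
    """
    shard = shard_for_token(token, len(connections_by_shard))
    return connections_by_shard[shard]
```

With one connection per shard, `pick_connection` lets the client send each read to the core that holds the data, which is the mechanism the abstract credits for the latency reduction.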

22 minutes
Register for access to all 60+ sessions available on demand.
Fill out the form to watch this session from the P99 CONF 2025 livestream. You’ll also get access to all available recordings.

Shubham Sharma, Senior Systems Engineer at Verve Group

I’m a Cloud and DevOps Architect with over 11 years of experience crafting low-latency, high-performance distributed systems. Certified as a GCP Professional Architect, AWS Solutions Architect, and CKA, I specialise in AWS, GCP, NoSQL databases (ScyllaDB, Cassandra, MongoDB, Redis), and Kafka streaming. I’m thrilled about building tailored, real-time data streaming applications that deliver sub-millisecond performance. Currently, I’m working on ScyllaDB, Kafka, and data streaming solutions, where I find immense joy in optimising for speed and scale. My expertise includes Kubernetes, Docker, Helm, Terraform, Ansible, and monitoring with Prometheus, Grafana, and ELK, alongside CI/CD using Jenkins and ArgoCD. With Python and SQL, I design efficient streaming pipelines. In this session, I’ll share practical insights on building low-latency, resilient systems, drawing from my passion for high-performance architectures.