Learn how we achieved 6.6M read OPS with sub-2ms latency on a Single ScyllaDB cluster in Kubernetes, optimizing machine types, shard-aware porting, and backup/recovery. We’ll cover how shard-aware drivers and ScyllaDB’s shard-per-core model cut latency to ~900 µs, and how we tuned machine types across Intel, AMD, and Google Axion hardware. The talk also details our GKE deployment with CPU pinning, host networking, and NVMe storage.
