Cache Me If You Can: How Grafana Labs Scaled Up Their Memcached 42x & Cut Costs Too
Danny Kopping
Senior Software Engineer at Grafana Labs
Our cloud database stores billions of files in object storage. With petabytes of data being queried every day, we started…
High Performance on a Low Budget
Gwen Shapira
Co-founder & CPO of Nile
It is one thing to solve performance challenges when you have plenty of time, money, and expertise available. Many performance…
From 1M to 1B Features Per Second: Scaling ShareChat’s ML Feature Store
Andrei Manakov
Staff Software Engineer at ShareChat
Ivan Burmistrov
Senior Staff Software Engineer at ShareChat
ShareChat’s Ivan Burmistrov and Andrei Manakov walk through how they built a low latency ML Feature Store based on ScyllaDB which…
Corporate Open Source Anti-Patterns: A Decade Later
Bryan Cantrill
CTO of Oxide Computer Company
A little over a decade ago, I gave a talk on corporate open source anti-patterns, vowing that I would return…
Quantifying the Performance Impact of Shard-per-core Architecture
Dor Laor
CEO of ScyllaDB
Most software isn’t architected to take advantage of modern hardware. How does a shard-per-core and shared-nothing architecture help – and…
How Netflix Builds High Performance Applications at Global Scale
Prasanna Vijayanathan
Senior Software Engineer at Netflix
We all want to build applications that are blazingly fast. We also want to scale them to users all over…
eBPF vs Sidecars
Liz Rice
Chief Open Source Officer at Isovalent
From its vantage point in the kernel, eBPF provides a platform for building a new generation of infrastructure tools for…
Taming P99 Latencies at Lyft: Tuning Low-Latency Online Feature Stores
Bhanu Renukuntla
Senior Software Engineer at Lyft
In this talk, we will explore the challenges and strategies of tuning low latency online feature stores to tame the…
Running a Go App in Kubernetes: CPU Impacts
Teiva Harsanyi
Senior Software Engineer at Google
Understanding the impacts of running a containerized Go application inside Kubernetes with a focus on the CPU.
Expanding Horizons: A Case for Rust Higher Up the Stack
Carl Lerche
Principal Engineer at AWS
Historically associated with systems programming due to its roots in Mozilla, Rust’s promise of safety, speed, and concurrency has led…
How to Improve Your Ability to Solve Complex Performance Problems
Kerry Osborne
Google Database Black Belt Team Lead at Google
This talk is really about problem solving. It’s about how we think about problems and how we resolve those problems…
Square’s Lessons Learned from Implementing a Key-Value Store with Raft
Omar Elgabry
Software Engineer at Square
To put it simply, Raft is used to make a use case (e.g., key-value store, indexing system) more fault tolerant…
Performance Budgets for the Real World
Tammy Everts
Chief Experience Officer at SpeedCurve
Performance budgets have been around for more than ten years. Over those years, we’ve learned a lot about what works,…
A Deterministic Walk Down TigerBeetle’s main() Street
Aleksei Kladov
Staff Software Engineer at TigerBeetle
Learn how to use Zig to implement a fully deterministic distributed system which will never fail with an out of…
VM Performance: The Differences Between Static Partitioning or Automatic Tuning
Dario Faggioli
Virtualization Software Engineer at SUSE
Virtualized workloads are known to require carefully crafted configuration and tuning, both at the host and at the guest level,…
Measuring the Impact of Network Latency at Twitter
Widya Salim
Data Scientist at SEEK
Victor Ma
Senior Data Scientist at Airwallex
Zhen Li
Data Scientist at TikTok
Widya Salim, Victor Ma, and Zhen Li will outline the causal impact analysis, framework, and key learnings used to quantify…
Conquering Load Balancing: Experiences from ScyllaDB Drivers
Piotr Grabowski
Software Team Leader at ScyllaDB
Load balancing seems simple on the surface, with algorithms like round-robin, but the real world loves throwing curveballs. Join me…
Low-Latency Data Access: The Required Synergy Between Memory & Disk
Kriti Kathuria
Graduate Researcher at the University of Waterloo
Analytics has moved from internal dashboards to a dashboard inside the product, providing a personalized experience for each user, be…
Distributed System Performance Troubleshooting Like You’ve Been Doing it for Twenty Years
Jon Haddad
Founder at Rustyrazorblade Consulting
Troubleshooting performance issues across distributed systems can be intimidating if you don’t know where to start, and it’s even harder…
Writing Low Latency Database Applications Even If Your Code Sucks
Glauber Costa
Founder & CEO of Turso
All latency lovers are used to the mystical experience of joy coming from aligning data to a cache line size…
Using Libtracecmd to Analyze Your Latency and Performance Troubles
Steven Rostedt
Software Engineer at Google
Trying to figure out why your application is responding late can be difficult, especially if it is because of interference…
Building Low Latency ML Systems for Real-Time Model Predictions at Xandr
Chinmay Abhay Nerurkar
Principal Engineer at Microsoft
Xandr’s Ad-server handles over 400 billion daily ad requests from across the world wide web. Operating under a stringent Service…
ORM is Bad, But is There an Alternative?
Henrietta Dombrovskaya
Database Architect at DRW
It’s a well-known fact that although database performance is great, and each query is executed in milliseconds, the overall…
P99 Publish Performance in a Multi-Cloud NATS.io System
Derek Collison
Founder & CEO of Synadia
This talk will walk through the strategies and improvements made to the NATS server to accomplish P99 goals for persistent…
Making Python 100x Faster with Less Than 100 Lines of Rust
Ohad Ravid
Team Lead at Trigo
Python isn’t known as a low-latency language. Can we bridge the performance gap using a bit of Rust and some…
Zero Downtime Critical Traffic Migration @Netflix Scale
Abhishek Pandey
Senior Software Engineer at Meta
Picture yourself enthralled by the latest episode of your beloved Netflix series, delighting in an uninterrupted, high-definition streaming experience. Behind…
The History of Tracing Oracle
Cary Millsap
Owner and President of Method R Corporation
In this presentation, I will explore the history of tracing Oracle and why it has been overlooked despite its usefulness.…
Always-on Profiling of All Linux Threads, On-CPU and Off-CPU, with eBPF & Context Enrichment
Tanel Poder
Owner at Poder Consulting
In this session, Tanel introduces a new open source eBPF tool for efficiently sampling both on-CPU events and off-CPU events…
Cost-Effective Burst Scaling For Distributed Query Execution
Dan Harris
Principal Software Engineer at Coralogix
Building a query engine that scales efficiently is a difficult task. Queries over big datasets stored in Object Storage require…
Unconventional Methods to Identify Bottlenecks in Low-Latency and High-Throughput Data Pipelines
Zamir Paltiel
Head of Engineering at Hyperspace
In this presentation, we explore how standard profiling and monitoring methods may fall short in identifying bottlenecks in low-latency data…
Mitigating the Impact of State Management in Cloud Stream Processing Systems
Yingjun Wu
CEO of RisingWave Labs
Stream processing is a crucial component of modern data infrastructure, but constructing an efficient and scalable stream processing system can…
Practical Go Memory Profiling
William Kennedy
Managing Partner at Ardan Labs
In this talk, Bill will show you how to use benchmark profiling and compiler directives in Go to find…
Adventures in Thread-per-Core Async with Redpanda and Seastar
Travis Downs
Software Engineer at Redpanda
Thread-per-core programming models are well known in software domains where latency is important. Pinning application threads to physical cores and…
Architecting a High-Performance (Open Source) Distributed Message Queuing System in C++
Vitaly Dzhitenov
Senior Software Engineer at Bloomberg
BlazingMQ is a new open source distributed message queuing system developed at and published by Bloomberg. It provides highly-performant queues…
Noise Canceling RUM
Tim Vereecke
Web Performance Architect at Akamai
Noisy Real User Monitoring (RUM) data can ruin your P99! We introduce a fresh concept called “Human Visible Navigations” (HVN)…
Less Wasm
Piotr Sarna
Staff Software Engineer at Turso
The presentation explains why getting rid of WebAssembly is good for your latency. More specifically, it’s a short case study…
Reducing P99 Latencies with Generational ZGC
Stefan Johansson
Principal Member of Technical Staff at Oracle
With the low-latency garbage collector ZGC, GC pause times are no longer a big problem in Java. With sub-millisecond pause…
5 Hours to 7.7 Seconds: How Database Tricks Sped up Rust Linting Over 2000X
Predrag Gruevski
Independent Software Researcher at Trustfall
Linters are a type of database! They are a collection of lint rules — queries that look for rule violations…
Interaction Latency: Square’s User-Centric Mobile Performance Metric
Pierre-Yves Ricau
Android Distinguished Engineer at Block
Mobile performance metrics often take inspiration from the backend world and measure resource usage (CPU usage, memory usage, etc) and…
Chihuahua-Sized Load Tests!
Leandro Melendez
DevRel Performance Advocate at Grafana k6
Because bigger isn’t always better. Especially nowadays. Do your teams need help accommodating those humongous load tests in your agile &…
How to Avoid Learning the Linux-Kernel Memory Model
Paul McKenney
Software Engineer at Meta
The Linux-kernel memory model (LKMM) is a powerful tool for developing highly concurrent Linux-kernel code, but it also has a…
MySQL Performance on Modern CPUs: Intel vs AMD vs ARM
Peter Zaitsev
Founder of Percona
For years, CPU choice for MySQL was pretty boring – just choose which Intel-made CPU you want. In recent…
How We Reduced the Startup Time for Turo’s Android App by 77%
Pavlo Stavytskyi
Sr. Staff Software Engineer at Turo
The startup time of a mobile app is one of the most important indicators of its performance and has a…
99.99% of Your Traces are Trash
Paige Cruz
Senior Developer Advocate at Chronosphere
Distributed tracing is still finding its footing in many organizations today; one challenge to overcome is the data volume –…
High-Level Rust for Backend Programming
Adam Chalmers
Systems Engineer at KittyCAD, Inc.
Some people say you should only use Rust where you can’t afford to use garbage collection. I disagree — Rust…
A Deep Dive Into Concurrent React
Matheus Albuquerque
Senior Software Engineer, Front-End at Medallia
Writing fluid user interfaces becomes more and more challenging as the application complexity increases. In this talk, we’ll explore how…
Ingesting in Rust
Armin Ronacher
Creator of Flask and Principal Architect at Sentry
At Sentry we handle hundreds of thousands of events a second — from tiny metrics to huge memory dumps. What…
The Latency Stack: Discovering Surprising Sources of Latency
Mark Gritter
Principal Engineer at Postman
Usually, when an API call is slow, developers blame ourselves and our code. We held a lock too long, or…
Building a 10x More Efficient Edge Platform
Felipe Huici
CEO and Co-Founder of Unikraft UG
Painful cold boots, terrible auto-scale times, minutes-long waits for compute nodes to be up: these are standard headaches that cloud…
Beyond Availability: The Seven Dimensions for Data Product SLOs
Emily Gorcenski
Principal Data Scientist at Thoughtworks
In the software world, we’re used to SLOs built around latency and availability. But in the data engineering universe, there…
Peak Performance at the Edge: Running Razorpay’s High-Scale API Gateway
Jay Pathak
Software Development Engineer at Razorpay
Razorpay caters to millions of API requests every day that are non-uniform in nature. As a key provider of financial…
Segment-Based Storage vs. Partition-Based Storage: Which is Better for Real-Time Data Streaming?
David Kjerrumgaard
Developer Advocate at StreamNative
Storage is a critical component of any real-time data streaming system, and the choice of storage model can significantly affect…
HTTP 3: Moving on From TCP
Brian Sletten
President at Bosatsu Consulting, Inc.
Any network class you have taken in the last thirty years will have highlighted that the application layer depends on…
Demanding the Impossible: Rigorous Database Benchmarking
Dmitrii Dolgov
Senior Software Engineer at Red Hat
It’s easy to conduct a misleading benchmark, and notoriously hard to design a correct and rigorous enough one. Have you…
The Art of Macro Benchmarking: Evaluating Cloud Native Services Efficiency
Bartłomiej Płotka
Senior Software Engineer at Google
Benchmarking is hard, especially on a macro level that integrates multiple code components into one or multiple microservices. It’s challenging…
The Art of Event Driven Observability with OpenTelemetry
Henrik Rexed
Cloud Native Advocate at Dynatrace
Explore the various components of OpenTelemetry, examples of unuseful traces from event driven architecture, and the purpose/usage of span links…