Register now and enter to win an exclusive pre-show swag pack!

Virtual Event | OCTOBER 19-20, 2022

Call For Speakers Open

Submit a talk for P99 CONF, the event for developers who care about high-performance, low-latency applications.

About P99 CONF

P99 CONF is a cross-industry virtual event for engineers, by engineers. The event centers on low-latency, high-performance design, spanning operating systems (kernel, eBPF, io_uring), CPUs (Arm, Intel, OpenRISC), middleware and languages (Go, Rust, JVM, DPDK), databases, and observability methods.

Find Your Inspiration

Discover the latest methods in systems development and operational best practices for high-performance computing.

Share Your Team’s Ingenuity

Showcase your team’s success at achieving massive scale while maintaining the lowest latencies. Compare notes with your industry peers.

Join the Webscale™ Revolution

Take lessons learned back to your organization and be part of the movement for ever-faster computing and big data solutions.

Overheard at P99 CONF

Follow us on Twitter @p99conf for the latest updates.

The full agenda will be announced this summer.

2021 Sessions On Demand

Keynotes

Steven Rostedt, Open Source Engineer at VMware

New Ways to Find Latency in Linux Using Tracing

Ftrace is the official tracer of the Linux kernel. It originated from the real-time patch (now known as PREEMPT_RT), as developing an operating system for real-time use requires deep insight into and transparency of the happenings of the kernel. Not only was tracing useful for debugging, it was critical for finding areas in the kernel that were causing unbounded latency. It's no wonder the ftrace infrastructure has a lot of tooling for seeking out latency. Ftrace was introduced into mainline Linux in 2008, and several talks have been given on how to utilize its tracing features. But a lot has happened in the past few years that makes the tooling for finding latency much simpler. Other talks at P99 will discuss the new ftrace tracers "osnoise" and "timerlat", but this talk will focus on the new flexible and dynamic aspects of ftrace that facilitate finding latency issues specific to your needs. Some of this work may still be in a proof-of-concept stage, but this talk will give you the advantage of knowing what tools will be available to you in the coming year.
Watch now »

Marc Richards, Performance Engineer at Talawah Solutions

Extreme HTTP Performance Tuning: 1.2M API req/s on a 4 vCPU EC2 Instance

In this talk I will walk you through the performance tuning steps that I took to serve 1.2M JSON requests per second from a 4 vCPU c5 instance, using a simple API server written in C. At the start of the journey the server was capable of a very respectable 224k req/s with the default configuration. Along the way I made extensive use of tools like FlameGraph and bpftrace to measure, analyze, and optimize the entire stack, from the application framework, to the network driver, all the way down to the kernel. I began this wild adventure without any prior low-level performance optimization experience; but once I started going down the performance tuning rabbit hole, there was no turning back. Fueled by my curiosity, willingness to learn, and relentless persistence, I was able to boost performance by over 400% and reduce p99 latency by almost 80%.
Watch now »

Avi Kivity, CTO & Co-Founder at ScyllaDB

Keeping Latency Low and Throughput High with Application-level Priority Management

Throughput and latency are at a constant tension. ScyllaDB CTO and co-founder Avi Kivity will show how high throughput and low latency can both be achieved in a single application by using application-level priority scheduling.
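The specifics are ScyllaDB-internal, but the core idea of application-level priority scheduling — give each class of work a share of the CPU and always run the most underserved class — can be sketched in Python. This is an illustrative stride-style scheduler, not ScyllaDB's actual API; the queue names and share values are made up:

```python
from fractions import Fraction

class PriorityScheduler:
    """Stride-style scheduler: each queue receives CPU slots in
    proportion to its shares, so latency-sensitive work is never
    starved by throughput-oriented background work."""

    def __init__(self, shares):
        self.shares = shares                            # e.g. {"query": 3, "compaction": 1}
        self.passes = {q: Fraction(0) for q in shares}  # weighted time consumed so far

    def pick(self):
        # Run the queue that has consumed the least weighted time.
        q = min(self.passes, key=lambda name: (self.passes[name], name))
        # Advance its "pass" by 1/shares: high-share queues advance slowly,
        # so they get picked more often.
        self.passes[q] += Fraction(1, self.shares[q])
        return q

sched = PriorityScheduler({"query": 3, "compaction": 1})
slots = [sched.pick() for _ in range(8)]
# Queries receive ~3x the slots of compaction, keeping their latency low
# while compaction still makes steady progress.
```

Using `Fraction` keeps the bookkeeping exact, so the 3:1 ratio holds precisely over any window rather than drifting with floating-point error.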
Watch now »

Bryan Cantrill, CTO of Oxide Computer Company

Rust, Wright’s Law, and the Future of Low-Latency Systems

The coming decade will see two important changes with profound ramifications for low-latency systems: the rise of Rust-based systems, and the ceding of Moore's Law to Wright's Law. In this talk, we will discuss these two trends, and (especially) their confluence -- and explain why we believe that the future of low-latency systems will include Rust programs in some surprising places.
Watch now »

Brian Martin, Software Engineer at Twitter

Whoops! I Rewrote It in Rust

Three engineers, at various points, each took their own approach to adding Rust to a C codebase, each more ambitious than the last. I initially just wanted to replace the server’s networking and event loop with an equally fast Rust implementation. We’d reuse many core components that were in C and just call into them from Rust. Surely it wouldn’t be that much code… Pelikan is Twitter’s open source, modular framework for in-memory caching, allowing us to replace Memcached and Redis forks with a single codebase and achieve better performance. At Twitter, we operate hundreds of cache clusters storing hundreds of terabytes of small objects in memory. In-memory caching is critical, and demands performance, reliability, and efficiency. In this talk, I’ll share my adventures working on Pelikan and how rewriting it in Rust can be more than just a meme.
Watch now »

Engineering Talks

Gunnar Morling, Principal Software Engineer at Red Hat

Continuous Performance Regression Testing with JfrUnit

Functional unit and integration tests are a common practice for detecting and preventing regressions in a software component or application's behavior. Things look different, though, when it comes to performance-related aspects: how do you identify that an application is slower than it used to be? How do you spot higher memory consumption than before? How do you find out about sub-optimal SQL queries that sneaked in? Any performance tests based on metrics like wall-clock time or throughput are not portable; they are tied to a specific execution environment such as a developer laptop, CI, or a production-like environment. Enter JfrUnit: based on the JDK Flight Recorder (JFR), it allows you to implement assertions based on all kinds of JFR events emitted by the JVM or your application. JfrUnit makes it very easy to identify potential performance issues by asserting on metrics that may impact your application's performance, like an increased object allocation rate, retrieval of redundant data from the database, loading of unneeded classes, and much more. Come and join us for this code-centric session to learn about:
  • Using JDK Flight Recorder and JfrUnit for implementing performance regression tests
  • Emitting JFR events from 3rd-party libraries using the JMC Agent
  • Analyzing performance regressions in JDK Mission Control
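JfrUnit itself is Java and JFR-based, but the underlying trick — assert on counted runtime events such as allocations, which are machine-independent, rather than on wall-clock time, which is not — can be illustrated with Python's standard-library `tracemalloc`. This is an analogy to the technique, not JfrUnit's API; the function names and thresholds are made up:

```python
import tracemalloc

def build_report(n):
    # Function under test: allocates one list plus n small strings.
    return ["item-%d" % i for i in range(n)]

def allocated_bytes(fn, *args):
    """Measure peak allocation of fn: a portable proxy for the
    allocation-rate assertions JfrUnit expresses over JFR events."""
    tracemalloc.start()
    fn(*args)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak

baseline = allocated_bytes(build_report, 1_000)
# A regression test asserts the workload stays within budget, e.g.
# 2x headroom over a recorded baseline -- stable on any machine,
# unlike a wall-clock threshold.
assert allocated_bytes(build_report, 1_000) < 2 * baseline
```

The same test passes on a laptop and in CI because it counts bytes allocated, not seconds elapsed — exactly the portability argument the talk makes for event-based assertions.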
Watch now »

Dhruba Borthakur, CTO of Rockset

Realtime Indexing for Fast Queries on Massive Semi-Structured Data

Rockset is a realtime indexing database that powers fast SQL over semi-structured data such as JSON, Parquet, or XML without requiring any schematization. All data loaded into Rockset is automatically indexed, and a fully featured SQL engine powers fast queries over semi-structured data without requiring any database tuning. Rockset exploits the hardware fluidity available in the cloud and automatically grows and shrinks the cluster footprint based on demand. Available as a serverless cloud service, Rockset is used by developers to build data-driven applications and microservices.

In this talk, we discuss some of the key design aspects of Rockset, such as Smart Schema and Converged Index. We describe Rockset's Aggregator Leaf Tailer (ALT) architecture, which provides low-latency queries on large datasets. Then we describe how you can combine lightweight transactions in ScyllaDB with realtime analytics on Rockset to power a user-facing application.

Watch now »

Daniel Bristot de Oliveira, Principal Software Engineer at Red Hat

OSNoise Tracer: Who Is Stealing My CPU Time?

In the context of high-performance computing (HPC), Operating System Noise (osnoise) refers to the interference an application experiences due to activities inside the operating system. On Linux, NMIs, IRQs, softirqs, and any other system thread can cause noise for the application. Moreover, hardware-related jobs can also cause noise, for example via SMIs. HPC users and developers who care about every microsecond stolen by the OS need not only a precise way to measure osnoise, but above all a way to figure out who is stealing CPU time, so that they can pursue the perfect tuning of the system. These users and developers are the inspiration for Linux's osnoise tracer. The osnoise tracer runs an in-kernel loop measuring how much time is available. It does so with preemption, softirqs, and IRQs enabled, thus allowing all sources of osnoise during its execution. The osnoise tracer takes note of the entry and exit points of any source of interference. When noise happens without any interference at the operating system level, the tracer can safely attribute it to hardware. In this way, osnoise can account for any source of interference. The osnoise tracer also adds new kernel tracepoints that help the user pinpoint the culprits of the noise in a precise and intuitive way. At the end of a period, the osnoise tracer prints the sum of all noise, the max single noise, the percentage of CPU available for the thread, and counters for the noise sources, serving as a benchmark tool.
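The tracer itself runs in-kernel, but its accounting model can be mimicked in user space: sample a timestamp in a tight loop and treat any gap noticeably larger than the loop's own cost as stolen time. A simplified, testable Python sketch of that bookkeeping (the threshold and field names are illustrative, not the tracer's actual output format):

```python
def account_noise(timestamps_us, threshold_us):
    """Given timestamps from a tight sampling loop, classify any gap
    above threshold_us as OS/hardware noise, in the spirit of the
    osnoise tracer: report total noise, worst single noise, and the
    percentage of CPU that remained available to the thread."""
    noise_sum = 0
    max_noise = 0
    for prev, cur in zip(timestamps_us, timestamps_us[1:]):
        gap = cur - prev
        if gap > threshold_us:
            noise = gap - threshold_us        # time beyond the loop's own cost
            noise_sum += noise
            max_noise = max(max_noise, noise)
    total = timestamps_us[-1] - timestamps_us[0]
    return {
        "noise_sum": noise_sum,
        "max_noise": max_noise,
        "available_pct": 100.0 * (total - noise_sum) / total,
    }

# The loop normally ticks every ~1us; one 9us gap means ~8us was
# stolen by an interrupt or other interference.
stats = account_noise([0, 1, 2, 11, 12], threshold_us=1)
```

The real tracer goes further by correlating each gap with IRQ/softirq/NMI entry and exit tracepoints, which is what lets it separate OS-level noise from hardware noise.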
Watch now »

Waldek Kozaczuk, OSv Committer

OSv Unikernel — Optimizing Guest OS to Run Stateless and Serverless Apps in the Cloud

Unikernels have been demonstrated to deliver excellent performance in terms of throughput and latency, while providing high isolation. However, they have also been shown to underperform in some types of workloads when compared to a generic OS like Linux. In this presentation, we demonstrate that certain types of workloads - web servers, microservices, and other stateless and/or serverless apps - can greatly benefit from OSv's optimized networking stack and other features. We describe a number of experiments where OSv outperforms a Linux guest: most notably, 1.6x the throughput (req/s) at 0.6x the latency (at the p99 percentile) when running nginx, and 1.7x the throughput at 0.6x the latency when running a simple microservice implemented in Go.
We also show that OSv's small kernel, low boot time, and low memory consumption allow for very high density when running serverless workloads. The experiment described in this presentation shows we can boot 1,800 OSv microVMs per second on an AWS c5n.metal machine with 72 CPUs (25 boots/sec on a single CPU), with guest boot times as low as 8.98ms at p50 and 31.49ms at p99.
Lastly, we demonstrate how to automate the build process of an OSv kernel tailored exactly to a specific app and/or VMM, so that only the code and symbols needed are part of the kernel and nothing more. OSv is an open source project and can be found at https://github.com/cloudius-systems/osv.
Watch now »

Heinrich Hartmann, Principal Engineer at Zalando

How to Measure Latency

Measuring latency for monitoring and benchmarking purposes is notoriously difficult. There are a lot of pitfalls in collecting, aggregating, and analyzing latency data. In this talk, we will visit the topic from a top-down perspective and compile known complications and best-practice approaches for avoiding them. This will include:
  • Measurement Overhead
  • Queuing effects - Coordinated omission
  • Histograms for Aggregation and Visualization
  • Percentile aggregation
  • Latency bands and burn-down charts
  • Latency comparison methods (QQ Plots, KS-Distance)
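One pitfall from the list above, coordinated omission, deserves a concrete illustration: when a stalled request delays a load generator's next sends, the samples that should have been taken during the stall are silently missing, and naive percentiles come out far too optimistic. The HdrHistogram-style correction backfills them (a simplified sketch of the standard technique):

```python
def correct_coordinated_omission(latencies_ms, interval_ms):
    """For each recorded latency longer than the intended send interval,
    add back the samples the stalled load generator failed to take:
    lat - interval, lat - 2*interval, ... down to the interval."""
    corrected = []
    for lat in latencies_ms:
        corrected.append(lat)
        missed = lat - interval_ms
        while missed > 0:
            corrected.append(missed)   # a request sent on schedule would
            missed -= interval_ms      # have waited at least this long
    return corrected

# One 250ms stall with a 100ms send interval hides two extra bad samples:
samples = correct_coordinated_omission([250, 50], interval_ms=100)
# -> [250, 150, 50, 50] instead of the misleading raw [250, 50]
```

Without the correction, the stall looks like a single outlier; with it, the distribution reflects what clients sending on schedule would actually have experienced.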
Watch now »

Glauber Costa, Staff Engineer at DataDog

Rust Is Safe. But Is It Fast?

Rust promises developers the execution speed of non-managed languages like C++, with the safety guarantees of managed languages like Go. Its fast rise in popularity shows this promise has been largely upheld. However, the situation is a bit muddier for the newer asynchronous extensions. This talk will explore some of the pitfalls that users may face while developing asynchronous Rust applications that have direct consequences in their ability to hit that sweet low p99. We will see how the Glommio asynchronous executor tries to deal with some of those problems, and what the future holds.
Watch now »

Stefan Johansson, OpenJDK GC Engineer at Oracle

G1: To Infinity and Beyond

G1 has been around for quite some time now and since JDK 9 it is the default garbage collector in OpenJDK. The community working on G1 is big and the contributions over the last few years have made a significant impact on the overall performance. This talk will focus on some of these features and how they have improved G1 in various ways, including smaller memory footprint and shorter P99 pause times. We will also take a brief look at what features we have lined up for the future.
Watch now »

Yarden Shafir, Software Engineer at Crowdstrike

I/O Rings and You — Optimizing I/O on Windows

Windows has very recently followed the same path as Linux and implemented I/O rings - a way to queue multiple I/O operations at a time instead of one by one. This change is expected to have a major impact on the performance and efficiency of high-I/O applications, keeping Windows servers competitive. This session will present the new feature and its implementation, demonstrate how it should be used, and discuss potential future additions that could further improve the handling of I/O by complex systems.
Watch now »

Filipe Oliveira, Performance Engineer at Redis

Data Structures for High Resolution, Real-time Telemetry at Scale

The challenge with telemetry in real-time systems is that you need as many sources of telemetry as possible (throughput, latency, errors, CPU, and many more...) but you can't pay for extra overhead when your users expect sub-ms operations that scale to millions of transactions per second. In this talk, we'll describe how we're using and improving several OSS data structures to incorporate telemetry features at scale, and showcase why they matter in scenarios where we face performance, security, or ops issues.
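The general pattern behind such data structures — constant overhead on the hot path, approximate answers on the query path — can be shown with a log-bucketed latency histogram: recording is a single counter increment, and percentiles come from the bucket counts without storing raw samples. This is a generic sketch of the technique, not Redis's implementation:

```python
import math
from collections import defaultdict

class LogHistogram:
    """Power-of-two bucketed latency histogram: one counter increment
    per sample, so telemetry adds near-zero overhead to sub-ms ops."""

    def __init__(self):
        self.buckets = defaultdict(int)
        self.count = 0

    def record(self, value_us):
        # Bucket index is floor(log2(value)) + 1, computed in O(1).
        self.buckets[value_us.bit_length()] += 1
        self.count += 1

    def percentile(self, p):
        # Walk buckets in order until p% of samples are covered;
        # return the bucket's upper bound (an approximation).
        remaining = math.ceil(self.count * p / 100)
        for exp in sorted(self.buckets):
            remaining -= self.buckets[exp]
            if remaining <= 0:
                return 1 << exp

hist = LogHistogram()
for v in range(1, 101):        # latencies 1..100 microseconds
    hist.record(v)
p50 = hist.percentile(50)      # upper bound of the median's bucket
```

The trade-off is bounded relative error (here, up to 2x per bucket) in exchange for fixed memory and no per-sample allocation; production structures like t-digest or HDR histograms refine the same idea with finer buckets.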

Watch now »

Karthik Ramasamy, Senior Director of Engineering at Splunk

Scaling Apache Pulsar to 10 Petabytes/Day

Pulsar is used by a portfolio of products at Splunk for stream processing of different types of data, including metrics and logs. In this talk, Karthik Ramasamy will share how Splunk helped a flagship customer scale a Pulsar deployment to handle 10 PB/day in a single cluster. He will talk about the journey, the challenges faced, and the trade-offs made to scale Pulsar and operate it reliably and stably in Google Cloud Platform (GCP).
Watch now »

Kathy Giori, Ecosystem Engagement Lead at ZEDEDA

Roman Shaposhnik, Co-Founder of ZEDEDA Inc.

RISC-V on Edge: Porting EVE and Alpine Linux to RISC-V

What is special about community-industry collaboration around an open architecture like RISC-V is that the community can join development around key kernel subsystems in parallel with the semiconductor companies and manufacturers as they produce the new chipsets and boards. Open silicon design, open board design, and open kernel software speed the bring-up of new boards and foster greater innovation due to the diversity of talent contributing to the technology. For many kernel developers, tackling RISC-V is becoming harder to resist now that commercially viable and scalable hardware has begun to enter the market. A private beta program launched by the BeagleBoard Foundation in March 2021 placed an affordable, yet powerful prototype RISC-V board into the hands of Roman Shaposhnik. What did Roman do? He joined a diverse team of software and hardware hackers to tackle bringing up Linux on the new board. In this talk, Roman will tell you about his experience porting Alpine Linux and LF Edge EVE-OS to the new RISC-V architecture.
Watch now »

Denis Rystsov, Staff Engineer at Vectorized

Is It Faster to Go with Redpanda Transactions than Without Them?!

We all know that distributed transactions are expensive, with higher latency and lower throughput compared to a non-transactional workload. It's just common sense that when we ask a system to maintain transactional guarantees it should spend more time on coordination and thus have poorer performance, right? Well, it's true that we can't get rid of this overhead. But at the same time, each transaction defines a unit of work, so the system stops dealing with individual requests and becomes more aware of the whole workload. Basically, it gets more information and may use it for new kinds of optimizations that compensate for the overhead. In this talk I'll describe how Redpanda optimized the Kafka API and pushed the throughput of distributed transactions up to eight times beyond an equivalent non-transactional workload while preserving sane latency.
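The actual Redpanda optimizations are internal, but the basic intuition — a transaction turns many per-request coordination round trips into one per batch, amortizing the fixed cost — can be sketched with a toy cost model. All numbers here are illustrative, not Redpanda measurements:

```python
def total_cost_us(n_requests, batch_size, work_us, coord_us):
    """Per-request work plus one coordination round per batch.
    batch_size=1 models a non-transactional request-at-a-time workload;
    larger batches model a transaction spanning many requests."""
    batches = -(-n_requests // batch_size)      # ceiling division
    return n_requests * work_us + batches * coord_us

# Illustrative costs: 10us of work per request, 100us of coordination
# (fsync/replication round) per batch.
naive = total_cost_us(1_000, batch_size=1, work_us=10, coord_us=100)
batched = total_cost_us(1_000, batch_size=100, work_us=10, coord_us=100)
speedup = naive / batched
# Batching 100 requests per transaction makes the workload ~10x cheaper,
# because coordination dominates the per-request path.
```

The model also shows the limit of the effect: once coordination is amortized away, throughput is bounded by the per-request work term, which is why the observed gains taper off rather than growing without bound.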
Watch now »

Orit Wasserman, Architect at Red Hat

Crimson: Ceph for the Age of NVMe and Persistent Memory

Ceph is a mature open source software-defined storage solution that was created over a decade ago. During that time, new, faster storage technologies have emerged, including NVMe and persistent memory. The Crimson project's aim is to create a better Ceph OSD that is well suited to those faster devices. The Crimson OSD is built on the Seastar C++ framework and can leverage these devices by minimizing latency, CPU overhead, and cross-core communication. This talk will discuss the project design, our current status, and our future plans.
Watch now »

Peter Zaitsev, CEO and Co-Founder of Percona

Performance Analysis and Troubleshooting Methodologies for Databases

Have you heard about the USE Method (Utilization - Saturation - Errors), RED (Rate - Errors - Duration), or Golden Signals (Latency - Traffic - Errors - Saturation)? In this presentation, we will talk briefly about these different but similar "focuses" and discuss how we can apply them to data infrastructure performance analysis, troubleshooting, and monitoring. We will use MySQL as an example, but most of the talk applies to other database technologies as well. Outline:
  • Introduce the challenge of troubleshooting by random Googling (1 min)
  • Introduce the USE Method and how it applies to databases (5 min)
  • Introduce the RED Method and how it applies to databases (5 min)
  • Introduce Golden Signals (4 min)
  • Provide a high-level comparison of the methods as a takeaway (4 min)
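The USE checklist can be mechanized: for each resource, derive the three signals from raw counters and alert on them. A minimal sketch of that derivation (field names and thresholds are generic illustrations, not Percona tooling):

```python
def use_signals(busy_ms, window_ms, queue_depth, error_count):
    """USE Method for one resource: Utilization (fraction of the window
    the resource was busy), Saturation (work waiting beyond what the
    resource can service, e.g. queued I/Os), and Errors."""
    return {
        "utilization": busy_ms / window_ms,   # 1.0 means fully busy
        "saturation": queue_depth,            # >0 means requests are waiting
        "errors": error_count,
    }

# A disk busy 900ms out of the last second with 3 queued I/Os is
# highly utilized and already saturating, even with zero errors --
# exactly the situation a latency investigation should start from.
disk = use_signals(busy_ms=900, window_ms=1000, queue_depth=3, error_count=0)
```

The same shape works for CPUs (run-queue length as saturation) or connection pools (waiters as saturation), which is what makes USE a checklist rather than a tool.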
Watch now »

Sam Just, Senior Principal Software Engineer at Red Hat

Seastore: Next Generation Backing Store for Ceph

Ceph is an open source distributed file system addressing file, block, and object storage use cases. Next-generation storage devices require a change in strategy, so the community has been developing crimson-osd, an eventual replacement for ceph-osd intended to minimize CPU overhead and improve throughput and latency. Seastore is a new backing store for crimson-osd targeted at emerging storage technologies, including persistent memory and ZNS devices.
Watch now »

Tejas Chopra, Senior Software Engineer at Netflix

Object Compaction in Cloud for High Yield

In file systems, large sequential writes are more beneficial than small random writes, which is why many storage systems implement a log-structured file system. In the same way, the cloud favors large objects over small objects. Cloud providers place throttling limits on PUTs and GETs, so it takes significantly longer to upload a bunch of small objects than a single large object of the aggregate size. Moreover, there are per-PUT costs associated with uploading smaller objects. At Netflix, a lot of media assets and their relevant metadata are generated and pushed to the cloud. We propose a strategy to compact these small objects into larger blobs before uploading them to the cloud. We will discuss how to select relevant smaller objects and manage the indexing of these objects within the blob, along with the modifications to reads, overwrites, and deletes. Finally, we will showcase the potential impact of such a strategy on Netflix assets in terms of cost and performance.
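The core mechanism — pack small objects into one blob and keep an index of offsets so reads become byte-range requests into it — can be sketched directly. The function names are illustrative, not Netflix's actual pipeline:

```python
def compact(objects):
    """Pack small byte objects into a single blob for one large PUT,
    recording (offset, length) per key so reads can address into it."""
    blob = bytearray()
    index = {}
    for key, data in objects.items():
        index[key] = (len(blob), len(data))
        blob.extend(data)
    return bytes(blob), index

def read(blob, index, key):
    """Serve a small-object GET as a ranged read into the blob."""
    offset, length = index[key]
    return blob[offset:offset + length]

blob, index = compact({"a": b"hello", "b": b"p99", "c": b"conf"})
# One PUT instead of three; each GET becomes a byte-range read,
# sidestepping per-PUT costs and request-rate throttling.
```

Overwrites and deletes are where it gets interesting: as in a log-structured file system, they leave dead bytes in old blobs, so a background pass must eventually rewrite blobs to reclaim the space — the trade-off the talk explores.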
Watch now »

Thomas Dullien, CEO of optimyze.cloud Inc.

Where Did All These Cycles Go?

Modern systems are large, and complicated, and it is often difficult to account precisely where CPU cycles are spent in production. Once you begin measuring, you will find all sorts of strange surprises - like cleaning out strange objects from an attic that has accumulated stuff for decades. This talk discusses surprising places where we found CPU waste in real-world production environments: From Kubelet consuming multiple percent of whole-cluster CPU, via popular machine learning libraries spending their time juggling exceptions instead of classifying, to EC2 time sources being much slower than necessary. CPU cycles are being lost in surprising places, and often it isn't in your own code.
Watch now »

Simon Ritter, Deputy CTO at Azul Systems

Get Lower Latency and Higher Throughput for Java Applications

Getting the best performance out of your Java applications can often be a challenge due to the managed environment nature of the Java Virtual Machine and the non-deterministic behaviour that this introduces. Automatic garbage collection (GC) can seriously affect the ability to hit SLAs for the 99th percentile and above. This session will start by looking at what we mean by speed and how the JVM, whilst extremely powerful, means we don’t always get the performance characteristics we want. We’ll then move on to discuss some critical features and tools that address these issues, i.e. garbage collection, JIT compilers, etc. At the end of the session, attendees will have a clear understanding of the challenges and solutions for low-latency Java.
Watch now »

Pavel Emelyanov, Developer at ScyllaDB

What We Need to Unlearn about Persistent Storage

System software engineers have long been taught that disks are slow and sequential I/O is key to performance. With SSD drives, I/O really did get much faster, but not simpler. In this brave new world of rocket-speed throughput, an engineer has to distinguish sustained workloads from bursts, (still) take care with I/O buffer sizes, account for disks' internal parallelism, and study mixed I/O characteristics in advance. In this talk we will share some key performance measurements of modern hardware we're taking at ScyllaDB, and our opinion about the implications for database and system software design.
Watch now »

Konstantin Osipov, Director of Software Engineering at ScyllaDB

Avoiding Data Hotspots at Scale

There are two key choices when scaling a NoSQL data store: choosing between a hash or a range based sharding and choosing the right sharding key. Any choice is a trade-off between scalability of read, append, and update workloads. In this talk I will present the standard scaling techniques, some non-universal sharding tricks, less obvious reasons for hotspots, as well as techniques to avoid them.
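One of the less obvious hotspot causes the talk covers can be demonstrated in a few lines: monotonically increasing keys (timestamps, auto-increment IDs) all land in the last range of a range-sharded table, while hash sharding spreads the same inserts across shards. A small Python demonstration (the shard count and key boundaries are illustrative, not from any particular database):

```python
import zlib
from bisect import bisect_right
from collections import Counter

def hash_shard(key, n_shards):
    # A stable hash of the key picks the shard; neighbors scatter.
    return zlib.crc32(str(key).encode()) % n_shards

def range_shard(key, boundaries):
    # boundaries [250, 500, 750] split the key space into 4 ranges;
    # the shard is the range the key falls into.
    return bisect_right(boundaries, key)

new_keys = range(1000, 1100)            # monotonically increasing inserts
hash_counts = Counter(hash_shard(k, 4) for k in new_keys)
range_counts = Counter(range_shard(k, [250, 500, 750]) for k in new_keys)
# range_counts: every new insert hits the last shard -- a write hotspot.
# hash_counts: the same inserts spread across the shards.
```

The flip side of the trade-off is that hash sharding destroys key locality, so range scans touch every shard — which is why the choice of sharding scheme and key is workload-dependent rather than universal.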
Watch now »

Henrik Rexed, Cloud Native Advocate at Dynatrace

Using eBPF to Measure the k8s Cluster Health

As a k8s cluster admin, your app teams expect your cluster to be available for deploying services at any time without problems. While there is no shortage of metrics in k8s, it's important to have the right metrics to alert on issues and give you enough data to react to potential availability issues. Prometheus has become a standard and sheds light on the inner behaviour of Kubernetes clusters and workloads. Lots of KPIs (CPU, I/O, network, etc.) that are precise in an on-premise environment become less precise when we start to work in a cloud environment. eBPF is the perfect technology to fulfill that requirement, as it gives us information down to the kernel level. In 2018, Cloudflare shared an open source project to expose custom eBPF metrics in Prometheus. Join this session and learn about:
  • What is eBPF?
  • What types of metrics can we collect?
  • How to expose those metrics in a k8s environment
This session will deliver a step-by-step guide on how to take advantage of the eBPF exporter.
Watch now »

Felix Geisendörfer, Staff Engineer at Datadog

Continuous Go Profiling & Observability

This presentation is for Go developers and operators of Go applications who are interested in reducing costs and latency, or debugging problems such as memory leaks, infinite loops, performance regressions, etc. of such applications. We'll start with a brief description of the unique aspects of the Go runtime, and then take a look at the builtin profilers as well as Go's execution tracer. Additionally we'll look at the interoperability with popular observability tools such as Linux perf and bpftrace. After this presentation you should have a good idea of the various tools you can use, and which ones might be the most useful to you in a production environment.
Watch now »

Felipe Huici, Chief Researcher at NEC Europe Laboratories GmbH

Unikraft: Fast, Specialized Unikernels the Easy Way

Unikernels are famous for providing excellent performance in terms of boot times, throughput and memory consumption, to name a few metrics. However, they are infamous for making it hard and extremely time consuming to extract such performance, and for needing significant engineering effort in order to port applications to them. We introduce Unikraft, a novel micro-library OS that (1) fully modularizes OS primitives so that it is easy to customize the unikernel and include only relevant components and (2) exposes a set of composable, performance-oriented APIs in order to make it easy for developers to obtain high performance. Our evaluation using off-the-shelf applications such as nginx, SQLite, and Redis shows that running them on Unikraft results in a 1.7x-2.7x performance improvement compared to Linux guests. In addition, Unikraft images for these apps are around 1MB, require less than 10MB of RAM to run, and boot in around 1ms on top of the VMM time (total boot time 3ms-40ms). Unikraft is a Linux Foundation open source project and can be found at www.unikraft.org.
Watch now »

Pere Urbón-Bayes, Senior Solutions Architect at Confluent

Understanding Apache Kafka P99 Latency at Scale

Apache Kafka is a highly popular distributed system used by many organizations to connect systems, build microservices, create a data mesh, etc. However, as a distributed system, understanding its performance can be a challenge, as so many moving parts exist. In this talk, we are going to review the key moving parts (producers, consumers, replication, network, etc.), a strategy to measure and interpret performance results for consumers and producers, and general guidelines for making decisions about performance in Apache Kafka. Attendees will take home a proven method to measure, evaluate, and optimize the performance of an Apache Kafka-based infrastructure - a key skill for low-throughput users, but especially for the biggest-scale deployments.
Watch now »

Bryan McCoid, Sr. Distributed Systems Engineer, Couchbase Inc.

High-Performance Networking Using eBPF, XDP, and io_uring

In the networking world there are a number of ways to increase performance over naive use of basic Berkeley sockets. These techniques range from polling blocking sockets, to non-blocking sockets controlled by epoll, all the way to completely bypassing the Linux kernel for maximum network performance, talking directly to the network interface card using something like DPDK or Netmap. This talk will dive into crucial details, such as how AF_XDP works and how it can be integrated into a larger system, and finally more advanced topics such as request sharding/load balancing. There will be a detailed look at the design of AF_XDP, the eBPF code used, and the userspace code required to drive it all. It will also include performance numbers from this setup compared to regular kernel networking - and, most importantly, how to put all this together to handle as much data as possible on a single modern multi-core system.

Watch now »

Doug Hood, Consulting Member of Technical Staff at Oracle

DB Latency Using DRAM + PMem in App Direct & Memory Modes

How does the latency of DDR4 DRAM compare to Intel Optane Persistent Memory when used in both App Direct and Memory Modes for In-Memory database access? This talk is about the latency benchmarks that I performed by adding gettimeofday() calls around critical DB kernel operations. This talk covers the technology, cache hit ratios, lots of histograms and lessons learned.
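The measurement technique itself — timestamping around individual critical operations and bucketing the deltas into histograms — ports to any codebase. A Python sketch of that wrapper, with `time.perf_counter_ns` playing the role of `gettimeofday()`; the operation name and stand-in function are illustrative, not from the talk's DB kernel:

```python
import time
from collections import defaultdict

TIMINGS = defaultdict(list)   # op name -> list of nanosecond samples

def timed(op_name):
    """Record per-call latency around a critical operation, mirroring
    the gettimeofday() pairs placed around DB kernel operations."""
    def wrap(fn):
        def inner(*args, **kwargs):
            start = time.perf_counter_ns()
            try:
                return fn(*args, **kwargs)
            finally:
                TIMINGS[op_name].append(time.perf_counter_ns() - start)
        return inner
    return wrap

@timed("index_lookup")
def index_lookup(key):
    return key in {"a", "b"}          # stand-in for the real operation

for k in ("a", "b", "x"):
    index_lookup(k)
# TIMINGS["index_lookup"] now holds three nanosecond samples,
# ready to feed into a latency histogram.
```

One caveat the talk's approach has to contend with: the timestamp calls themselves cost tens of nanoseconds, so for very short operations the measurement overhead must be calibrated and subtracted before comparing DRAM against persistent memory.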
Watch now »

Peter Portante, Senior Principal Software Engineer at Red Hat

Let’s Fix Logging Once and for All

We describe a modification to the Linux Kernel which gives an SRE control over the combined bandwidth of logging on a node of a distributed system, while providing a way for the logging source owner (container or service) to control what happens when the bandwidth limit is hit.
Watch now »

Andreas Grabner, DevOps Activist at Dynatrace

Using SLOs for Continuous Performance Optimizations of Your k8s Workloads

Moving to k8s doesn’t protect anyone from bad architectural decisions that lead to performance degradations, scalability issues, or violated SLOs in production. In fact, smaller services running in pods connected through service meshes are even more vulnerable to bad architectural or implementation choices. To catch bad deployments, the CNCF project Keptn provides automated SLO-based performance analysis as part of your CD process. Keptn automatically detects architectural and deployment changes that have a negative impact on performance and scalability. It uses SLOs (Service Level Objectives) to ensure your services always meet your objectives. The Keptn team has also published SLO best practices that identify well-known performance patterns observed over years of analyzing hundreds of distributed software architectures deployed on k8s. Join this session and learn what these patterns are and how Keptn helps you prevent them from entering production.
Watch now »

Abel Gordon, Chief Systems Architect at Lightbits Labs

Vanquishing Latency Outliers in the Lightbits LightOS Software Defined Storage System

An overview of how Lightbits LightOS improves the latency of high-performance, low-latency NVMe-based storage accessed over a standard TCP/IP network.
Watch now »

REGISTER FOR YOUR FREE TICKET AND WIN SWAG

Virtual Event

October 19-20, 2022

Share on social with #p99conf and a link to p99conf.io for a chance to win $500.