Gunnar Morling, Senior Staff Software Engineer at Decodable, will be presenting “1BRC – Nerd Sniping the Java Community” at P99 CONF 24. As Gunnar so nicely put it:
“Your mission, should you decide to accept it, is the following: aggregate temperature values from a CSV file and group them by weather station name. There’s only one caveat: the file has 1,000,000,000 rows!
This is the task of the “One Billion Row Challenge” which went viral within the Java community earlier this year. Come and join me for this talk where I’ll dive into some of the tricks employed by the fastest solutions for processing the challenge’s 13 GB input file within less than two seconds. Parallelization and efficient memory access, optimized parsing routines using SIMD and SWAR, as well as custom map implementations are just some of the topics which we are going to discuss.
I will also share some of the personal experiences and learnings which I made while running this challenge for and with the community.”
Note: P99 CONF is a technical conference on performance and low-latency engineering. It’s virtual, free, and highly interactive. This year’s agenda spans Rust, Zig, Go, C++, compute/infrastructure, Linux, Kubernetes, databases, and more.
We hope you’ll join us live October 23-24 to hear the talk and chat with Gunnar. In the meantime, let’s get to know a little about him!
How do you answer the dreaded “tell us about yourself” question?
I work as a software engineer at Decodable, where we build a real-time platform for ETL and stream processing, based on Apache Flink and Debezium. Before that, I was the project lead for Debezium, a distributed platform for change data capture for a variety of databases. I enjoy working at the intersection of core software engineering and outreach-related efforts, such as blogging and speaking at conferences. In my spare time, I like to write about all kinds of Java- and data-related topics on my blog morling.dev.
What’s the most interesting project that you’re working on right now – or hoping to start soon?
One thing I’d love to explore is running Apache Flink applications as native binaries via GraalVM, an ahead-of-time compiler for Java applications. This may yield some interesting advantages, for instance in terms of memory consumption. But it’s also a technically interesting task due to Flink’s dynamic nature.
What will you be talking about at P99 CONF?
I’ll talk about the One Billion Row Challenge (#1BRC or 1️⃣🐝🏎️ for short): a coding challenge that I ran for and with the Java community earlier this year. The task was to process one billion temperature values from a text file as quickly as possible. Folks took this way further than I ever would have expected and came up with some really amazing solutions for solving this seemingly simple task in less than two seconds (on eight CPU cores). It went kinda viral and folks implemented 1BRC solutions in all sorts of languages, data stores, and even tools like awk. I’ll explain some of the techniques used for getting there, discuss if and when you should use them — and also talk about the fantastic community which formed around this challenge.
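To make the task concrete, here is a minimal, deliberately unoptimized Java sketch of the aggregation described above. It assumes each input line has the form `station;temperature` with a decimal temperature, and that the result lists minimum, mean, and maximum per station in alphabetical order. The class and method names (`BaselineAggregator`, `Stats`, `aggregate`) are made up for illustration and are not taken from any challenge entry.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

class BaselineAggregator {

    // Running statistics for a single weather station.
    static final class Stats {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        double sum;
        long count;

        void add(double value) {
            min = Math.min(min, value);
            max = Math.max(max, value);
            sum += value;
            count++;
        }

        double mean() {
            return sum / count;
        }

        @Override
        public String toString() {
            // Formatted as min/mean/max with one fractional digit.
            return String.format(Locale.ROOT, "%.1f/%.1f/%.1f", min, mean(), max);
        }
    }

    // Assumption: every line looks like "Hamburg;12.3".
    static Map<String, Stats> aggregate(Iterable<String> lines) {
        // TreeMap keeps stations sorted by name.
        Map<String, Stats> result = new TreeMap<>();
        for (String line : lines) {
            int sep = line.indexOf(';');
            String station = line.substring(0, sep);
            double value = Double.parseDouble(line.substring(sep + 1));
            result.computeIfAbsent(station, k -> new Stats()).add(value);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = List.of("Hamburg;12.0", "Oslo;-3.5", "Hamburg;8.0");
        // prints {Hamburg=8.0/10.0/12.0, Oslo=-3.5/-3.5/-3.5}
        System.out.println(aggregate(sample));
    }
}
```

A single-threaded loop like this allocates a string per line and relies on general-purpose parsing and a general-purpose map, which is exactly the kind of overhead that parallelization, custom parsing, and custom maps aim to remove.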
What other P99 CONF talks are you most looking forward to – and why?
Oh, where to start? Some of the sessions I am really looking forward to include Matt Fleming’s talk on change point detection (something I’ve also been exploring in the context of the JfrUnit project), Tanel Poder’s session about database performance analysis with eBPF (database performance is always a hot topic), and Chris Riccomini’s talk on SlateDB, a new cloud-native LSM tree implementation (this could for instance be very interesting as a state store for Flink applications). And of course, I am also very excited about another talk on 1BRC: Shraddha Agrawal, who is talking about her implementation of the challenge in Golang.
What do you like most about P99 CONF?
It’s the best-run online event I have attended so far! I love the deeply technical level of the event and the kind of cool “nerd” vibe it has. There are no sales pitches or things like that, just folks with a shared interest coming together in order to learn and exchange ideas.
Any performance-related resource recommendations for the P99 CONF community?
Folks interested in low-level arithmetic and bit-level algorithms should check out Hacker’s Delight by Henry S. Warren, Jr. I learned about this book from participants of 1BRC and immediately got a copy for myself. I can highly recommend it to anyone looking to optimize code at this level. Perhaps it will come in handy for next year’s challenge too 😉
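As a small taste of this kind of bit-level code, the sketch below uses SWAR ("SIMD within a register") to locate a `;` separator in eight bytes at once, using a well-known zero-byte detection trick. It is a generic illustration under the assumption that bytes are loaded in little-endian order, not the code of any particular 1BRC entry. The names `SwarSemicolon`, `firstSemicolon`, and `loadLittleEndian` are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

class SwarSemicolon {

    private static final long SEMICOLONS = 0x3B3B3B3B3B3B3B3BL; // ';' in every byte
    private static final long LOW_BITS = 0x0101010101010101L;
    private static final long HIGH_BITS = 0x8080808080808080L;

    // Returns the index (0 = least significant byte) of the first ';'
    // in the 8-byte word, or -1 if the word contains none.
    static int firstSemicolon(long word) {
        // Bytes equal to ';' become 0x00 after the XOR.
        long diff = word ^ SEMICOLONS;
        // Sets the high bit of each zero byte; the lowest set bit is always exact.
        long zeroMask = (diff - LOW_BITS) & ~diff & HIGH_BITS;
        if (zeroMask == 0) {
            return -1;
        }
        return Long.numberOfTrailingZeros(zeroMask) >>> 3;
    }

    // Reads 8 bytes so that bytes[offset] ends up in the least significant byte.
    static long loadLittleEndian(byte[] bytes, int offset) {
        return ByteBuffer.wrap(bytes, offset, 8).order(ByteOrder.LITTLE_ENDIAN).getLong();
    }

    public static void main(String[] args) {
        for (String line : new String[] {"Oslo;-3.5", "Hamburg;12.0", "Helsinki;1.0"}) {
            byte[] bytes = line.getBytes(StandardCharsets.UTF_8);
            // prints 4, then 7, then -1 (no ';' within the first 8 bytes)
            System.out.println(firstSemicolon(loadLittleEndian(bytes, 0)));
        }
    }
}
```

The appeal is that one XOR, one subtraction, and two ANDs replace a byte-by-byte loop with eight comparisons. Whether that pays off in practice depends on the workload, so measure before reaching for it.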