Cost-Effective Burst Scaling For Distributed Query Execution

Building a query engine that scales efficiently is difficult. Queries over large datasets stored in object storage require a large amount of IO and compute power. With a fixed-size compute cluster, keeping the latency of expensive queries acceptable is only possible by over-provisioning the cluster, while dynamically scaling it up or down is too slow for interactive queries.

To overcome these challenges, we built a distributed execution model that allows us to dynamically execute queries across both AWS Lambda and EC2 resources. With this model, we can shed excess load to Lambda functions to preserve low latency while we scale EC2 capacity to manage costs.
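As a rough illustration of the load-shedding idea, the sketch below routes tasks to EC2 slots while any are free and sends the overflow to Lambda. It is not the engine described in the talk: the `BurstScheduler` class, its fields, and the scaling rule (target EC2 capacity equals current total demand) are all hypothetical, and no AWS API is called.

```python
from dataclasses import dataclass


@dataclass
class BurstScheduler:
    """Toy router (hypothetical): fill EC2 slots first, shed overflow to Lambda."""

    ec2_slots: int           # assumed number of task slots on the current EC2 fleet
    ec2_running: int = 0     # tasks currently placed on EC2
    lambda_running: int = 0  # tasks currently shed to Lambda

    def place(self) -> str:
        """Pick a target for one task and record the placement."""
        if self.ec2_running < self.ec2_slots:
            self.ec2_running += 1
            return "ec2"
        # EC2 is saturated: burst to Lambda instead of queueing the task.
        self.lambda_running += 1
        return "lambda"

    def finish(self, target: str) -> None:
        """Release the capacity held by a completed task."""
        if target == "ec2":
            self.ec2_running -= 1
        else:
            self.lambda_running -= 1

    def desired_ec2_slots(self) -> int:
        """Assumed scaling rule: enough EC2 slots to hold all running tasks."""
        return self.ec2_running + self.lambda_running


sched = BurstScheduler(ec2_slots=2)
placements = [sched.place() for _ in range(5)]
print(placements)
print(sched.desired_ec2_slots())
```

In this toy setup, the first two tasks land on EC2 and the remaining three burst to Lambda; the scaling target then tells a (hypothetical) autoscaler how much EC2 capacity would absorb the shed load once it comes online.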

22 minutes

Dan Harris, Principal Software Engineer at Coralogix

Dan Harris is a Principal Software Engineer at Coralogix and an Apache Arrow committer.

P99 CONF OCT. 23 + 24, 2024
