Building Low Latency ML Systems for Real-Time Model Predictions at Xandr

Xandr’s ad server handles over 400 billion ad requests daily from across the web. Operating under a stringent Service Level Agreement (SLA), it serves the majority of these requests within a round-trip latency of 100-150 milliseconds through an intricate ad auction process, with each auction involving hundreds of competing advertisers. Key stages in this process, such as audience targeting, optimization of advertiser objectives, and ad selection, are executed using an assortment of sophisticated ML algorithms. Running ML model inference in real time and serving predictions at this scale, within the strict SLAs of an ad auction, requires a resilient and fast machine learning system.

In this session, I will discuss the challenges of building a low-latency machine learning system that supports the high-volume, high-throughput demands of ad serving. I will cover how Xandr built an extensible, scalable system to supply the real-time predictions integral to the ad auction process, using ML models trained frequently on large amounts of constantly updating ad transaction data. I will also share the lessons learned from building such systems, including how to optimize performance, reduce latency, and ensure reliability.

26 minutes

Chinmay Abhay Nerurkar, Principal Engineer at Microsoft

Chinmay Abhay Nerurkar is a Principal Engineer at Microsoft. He holds a Master's degree in Electrical Engineering from New York University and has over 14 years of diverse experience in embedded hardware/software, digital video processing, FinTech, and the ad-tech industry. He is currently focused on building impactful products for Microsoft Advertising that harness the power of big data and AI/ML. He is interested in behavioral finance, economics, and contextual data analysis using NLP and artificial intelligence.

P99 CONF OCT. 23 + 24, 2024
