Explore how LinkedIn manages large-scale Kafka clusters with Cruise Control in this 57-minute video presentation. Delve into the operational challenges posed by growing cluster sizes, increasing traffic volume, and aging infrastructure components. Learn how Cruise Control's architecture and functionality address these issues. Follow along with a practical tutorial demonstrating real-world Kafka cluster management techniques. Gain insights into dynamic load balancing, anomaly detection, and self-healing actions. Understand the role of metrics reporting, cluster modeling, and proposal generation in maintaining performance and availability. Discover how LinkedIn uses Cruise Control to mitigate cascading failures and reduce management overhead. The presentation closes with a Q&A session on Kafka infrastructure management at scale.
How LinkedIn Navigates Streams Infrastructure Using Cruise Control - Adem Efe Gencer, PhD