Managing Chaos at Scale
Description:
Explore Uber's journey in maintaining reliability during explosive growth from a few to thousands of microservices in this 45-minute conference talk by Paweł Królikowski. Dive into incident prevention strategies, including integration testing, load testing, chaos testing, blackbox testing, and rollout strategies. Learn about effective incident response techniques, covering on-call procedures, monitoring systems, alerting mechanisms, and mitigation strategies. Gain insights into the benefits of using common frameworks in reliability engineering. Discover what Uber did right and the valuable lessons learned through experience in managing chaos at scale.

WeAreDevelopers