AI and ML: Let’s Talk About the Boring (yet Critical!) Operational Side - Rob Koch & Milad Vafaeifard
Description:
Explore the operational challenges and solutions for running AI and ML applications in this 28-minute conference talk from CNCF. Dive into critical aspects of managing compute resources, GPU workloads, and maintaining reliability while ensuring proper dataset separation and training process isolation. Learn how implementing a service mesh can address real-world ML application challenges, streamline operations, and enhance observability. Follow along as Principal Rob Koch demonstrates practical implementations using Linkerd with multiple Kubernetes clusters, covering essential topics like IPv6 integration, GPU utilization, multitenancy considerations, and scaling strategies for ML deployments.
AI and ML: The Critical Operational Side of Running Applications in Kubernetes