Optimize Your AI Cloud Infrastructure: A Hardware Perspective - Liang Yan, CoreWeave
Description:
Explore GPU cloud infrastructure optimization in this technical conference talk, which delves into hardware-level considerations for AI systems. Learn how to fine-tune various machine learning models on an H100 cluster, with detailed analysis of critical components such as the Pod scheduler, device plugin, GPU/NUMA topology, and the RoCE/NCCL stack. Gain insights from first-hand experimental results demonstrating the relationship between model performance and device/operator configurations within nodes, focusing in particular on the CNN, RNN, and Transformer models from MLPerf. Master the often-overlooked hardware aspects of AI infrastructure that can significantly affect distributed machine learning performance and efficiency.

Linux Foundation
Duration: 40:11