Explore a technical presentation from Meta optical engineer Andrew Alduino examining the critical role of optical interconnects in scaling AI infrastructure. Dive into the challenges of building large-scale GPU clusters, such as Meta's 24k-GPU systems used for Llama 3 training, focusing on how growing AI workload demands affect IO requirements and accelerator package design. Learn about the limitations of electrical signaling and discover how integrated optics offer a promising alternative with superior bandwidth capabilities. Understand the interplay between GPU hardware, system IO, rack design, power delivery, cooling technologies, and memory architectures in modern AI cluster development, with particular emphasis on optimizing optical interconnects for next-generation AI infrastructure.
Optical Interconnects for Large-Scale AI Clusters - A Meta Perspective