Intro

Machine Learning

Graph Neural Networks

Stages of a Graph Neural Network

GPUs Are Not a Good Fit for Graph Operations

Combining CPUs and GPUs is Cost-Ineffective

Using Many CPU Servers Can Still Be Expensive

Key Insight: Serverless Fits Our Goals

Serverless Achieves Low-Cost, Scalable Efficiency

Challenges with Using Serverless

Challenge 1: Limited Resources

Solution: Computation Separation

Dorylus Architecture

Flow of Decomposed Tasks

Challenge 2: Limited Network

Solution: Create Pipeline of Decomposed Tasks

Data Chunks Moving Through Layer of Pipeline

Synchronize after Scatter Hinders Pipeline

Two Sync Points Make Asynchrony Difficult

Minimizing Effects of Asynchrony on Convergence

Serverless Optimizations

Data Graphs

We Evaluated Several Aspects of Dorylus

High Value on Large-Sparse Graphs

Dorylus Outperforms Existing Systems

Dorylus Scales Full Graph Training

Conclusion: Dorylus Provides Value

Description:

This 15-minute conference talk from OSDI '21 presents Dorylus, a distributed system for training Graph Neural Networks (GNNs) on billion-edge graphs. Dorylus uses serverless computing to work around two obstacles to training at that scale: expensive GPU servers and limited memory. The talk explains how computation separation enables a deep, bounded-asynchronous pipeline that hides network latency, why CPU servers offer the best performance-per-dollar for large graphs, and how adding Lambda threads significantly boosts efficiency. It also covers Dorylus's architecture, the challenges of serverless computing (limited resources and limited network) and the solutions Dorylus applies to them, and how Dorylus scales GNN training and performs against existing systems.

Dorylus - Affordable, Scalable, and Accurate GNN Training with Distributed CPU Servers and Serverless Threads

USENIX

#Conference Talks #OSDI (Operating Systems Design and Implementation) #Computer Science #Deep Learning #Software Engineering #Scalability #Distributed Computing #Programming #Cloud Computing #Serverless Computing