
- Intro

- Paper Intro

- Training recipe overview

- Image and Video generation pipeline

- Temporal Auto-Encoder architecture

- Transformer backbone architecture

- Training Objective

- Outlier Penalty Loss

- Tiled inference

- Superresolution model

- Training setting

- Parallelism for training

- Pre-training data

- Multi-stage training

- Fine-tuning

- Inference

- Outro

Description:

Explore a 23-minute technical video analysis of MovieGen, Meta's latest video generation model, and its research paper. The video covers the model's architecture, training methodology, and goals for AI-powered content creation. Key components include the temporal auto-encoder architecture, the transformer backbone, the training objective with its outlier penalty loss, and the complete pipeline from pre-training through inference. It also explains tiled inference, the superresolution model, parallelism for training, and the multi-stage training approach. The presenter is a machine learning researcher with 15 years of software engineering background and expertise in computer vision and robotics.

Meta MovieGen: Understanding Video Generation Model Architecture and Training

AI Bites


#Computer Science #Machine Learning #Deep Learning #Artificial Intelligence #Computer Vision #Neural Networks #Transformers #Model Training
