1. Intro to I-JEPA
2. Semantic Image Representations
3. Latent Representation
4. Invariance Based Pre-Training
5. Generative Pre-Training
6. What is I-JEPA
7. I-JEPA vs. Previous Approaches
8. ViT Method
9. Sampling Context and Targets
10. Prediction and Loss
11. Latent Space
12. Attention Head
13. Evaluation on Image Classification
14. Conclusion and Conversation
Description:
Explore a 40-minute technical video that breaks down the I-JEPA (Image-based Joint-Embedding Predictive Architecture) paper, a collaborative research effort by Meta AI, McGill, Mila, and NYU on non-generative self-supervised learning from images. Learn about semantic image representations, latent space concepts, and how invariance-based pre-training differs from generative pre-training. Understand the core mechanics of I-JEPA, how it compares with previous approaches, and how it is implemented with the Vision Transformer (ViT) architecture. The video covers context and target sampling, prediction and loss functions, the latent space, and attention heads in detail. It then examines image classification evaluation results, with references to related work such as the Masked Autoencoder and with latent space diagrams. Links to the Oxen.ai platform provide additional resources, including the original paper, community discussions, and dataset implementations.

Understanding I-JEPA: A Non-Generative Approach to Self-Supervised Learning from Images

Oxen