Intro to I-JEPA

Semantic Image Representations

Latent Representation

Invariance Based Pre-Training

Generative Pre-Training

What is I-JEPA

I-JEPA vs. Previous Approaches

ViT Method

Sampling Context and Targets

Prediction and Loss

Latent Space

Attention Head

Evaluation on Image Classification

Conclusion and Conversation

Description:

Explore a 40-minute technical video that breaks down the I-JEPA (Image-based Joint-Embedding Predictive Architecture) paper, a collaborative research effort by Meta AI, McGill, Mila, and NYU focusing on non-generative self-supervised learning from images. Learn about semantic image representations, latent space concepts, and the fundamentals of invariance-based versus generative pre-training. Understand the core mechanics of I-JEPA, how it compares with previous methods, and its implementation with the Vision Transformer (ViT) architecture. Dive into technical aspects including context and target sampling, prediction and loss functions, latent space manipulation, and attention head mechanisms. Examine image classification evaluation results, supported by references to related work such as the Masked Autoencoder and by latent space diagrams. Access additional resources including the original paper, community discussions, and dataset implementations through the provided links to the Oxen.ai platform.

Understanding I-JEPA: A Non-Generative Approach to Self-Supervised Learning from Images

Oxen

#Computer Science #Artificial Intelligence #Computer Vision #Machine Learning #Deep Learning #Neural Networks #Image Classification #Self-supervised Learning #Attention Mechanisms #Vision Transformers #Latent Space
