1. 00:00 - Intro
2. 00:27 - Hyperstack GPU platform (sponsored)
3. 02:08 - What's new in the new Llama?
4. 06:40 - Synthetic data
5. 13:30 - Privacy: training on Facebook user data?
6. 15:35 - Scaling and distillation
7. 19:10 - MoE, new architectures?
8. 25:35 - Upper boundary for the quality of synthetic data?
9. 37:15 - Context length
10. 45:10 - What framework does Meta use for Llama?
11. 46:40 - Playing with smaller Llamas
12. 51:20 - Multilingual capabilities
Description:
Dive deep into the latest developments of Llama 3 in this informative video featuring Thomas Scialom from Meta. Explore key topics including synthetic data for pre- and post-training, privacy concerns, scaling and distillation techniques, and the decision not to use a Mixture of Experts (MoE) architecture. Learn about the potential upper bound on synthetic data quality, context-length improvements, Meta's framework choices, and the multilingual capabilities of smaller Llama models. Gain valuable insights into cutting-edge advancements in large language models and their implications for AI research and development.

Llama 3 Deep Dive - Synthetic Data, Privacy, and Model Architecture

Aleksa Gordić - The AI Epiphany