Transformers are taking over AI right now, and quite possibly their most famous use is in ChatGPT. ChatGPT uses a specific type of Transformer called a Decoder-Only Transformer, and this StatQuest sh…

Chapters:
1. Awesome song and introduction
2. Word Embedding
3. Position Encoding
4. Masked Self-Attention, an Autoregressive method
5. Residual Connections
6. Generating the next word in the prompt
7. Review of encoding and generating the prompt
8. Generating the output, Part 1
9. Masked Self-Attention while generating the output
10. Generating the output, Part 2
11. Normal Transformers vs Decoder-Only Transformers
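The chapters above walk through the pieces of a decoder-only block. As a rough sketch of the masked self-attention step (not the video's own code; the weight matrices here are random stand-ins for learned parameters), the causal mask zeroes out attention to future tokens, which is what makes generation autoregressive:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating for numerical stability.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def masked_self_attention(X, Wq, Wk, Wv):
    """Masked (causal) self-attention for one sequence.

    X: (seq_len, d_model) word embeddings plus position encodings.
    Each token may attend only to itself and to earlier tokens.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # (seq_len, seq_len) similarities
    # Upper-triangular mask: -inf above the diagonal blocks future tokens.
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d = 4, 8
X = rng.normal(size=(seq_len, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out, w = masked_self_attention(X, Wq, Wk, Wv)
```

Because of the mask, the first token's attention weights are all on itself, and no row places any weight above the diagonal.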
Description:
Dive into a comprehensive 37-minute video tutorial exploring Decoder-Only Transformers, the specific type of Transformer used in ChatGPT. Learn about word embedding, position encoding, masked self-attention as an autoregressive method, and residual connections. Understand the process of generating the next word in a prompt, encoding and generating prompts, and the two-part output generation process. Compare Normal Transformers with Decoder-Only Transformers, and gain insights into the inner workings of cutting-edge AI technology. Supplementary resources for deeper understanding of related concepts like backpropagation, SoftMax function, and word embedding are also provided.
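For the position encoding mentioned above, one common choice (the fixed sinusoidal encoding from the original Transformer design; the video may illustrate a simplified variant) adds sine and cosine waves of geometrically increasing wavelength to the word embeddings, so every position gets a distinct pattern:

```python
import numpy as np

def position_encoding(seq_len, d_model):
    # Even-numbered dimensions get sine, odd-numbered dimensions get cosine.
    # The 10000**(2i/d_model) divisor spreads the wavelengths geometrically.
    pos = np.arange(seq_len)[:, None]            # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]         # (1, d_model/2)
    angles = pos / (10000 ** (2 * i / d_model))  # (seq_len, d_model/2)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe

pe = position_encoding(6, 8)
```

Every entry stays in [-1, 1], so the encoding can simply be added to the embeddings without swamping them, and no two positions share the same row of values.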

Decoder-Only Transformers, ChatGPT's Specific Transformer, Clearly Explained

StatQuest with Josh Starmer