Transformers from Scratch - Part 2: Building and Training a Weather Prediction Model

Dive into the second part of a comprehensive video series on building transformers from scratch. Explore the differences between encoder and decoder architectures, understand the GPT-4o architecture, and revisit the transformer model for weather prediction. Learn about pre-layer norm versus post-layer norm, and compare RoPE with sinusoidal positional embeddings. Follow along as dummy data is generated, the transformer architecture is initialized, and a forward-pass test is conducted. Set up and test a training loop on dummy data before importing real weather data. Visualize training results and evaluate the model's weather prediction capabilities. Discuss the implications of loss-graph volatility and explore strategies for further model improvement. Access additional resources and a Colab notebook to enhance your learning experience.
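As a companion to the topics listed above, here is a minimal sketch of a pre-layer-norm encoder block together with a dummy-data forward-pass sanity check, assuming a PyTorch implementation. It is not the notebook's actual code; the class name, layer sizes, and the shape of the dummy "weather" batch are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PreNormEncoderBlock(nn.Module):
    """Encoder block with pre-layer norm: LayerNorm is applied before each
    sub-layer, and the residual connection wraps the whole sub-layer
    (in post-layer norm, the LayerNorm would sit after the residual add)."""
    def __init__(self, d_model: int, n_heads: int, d_ff: int, dropout: float = 0.1):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads,
                                          dropout=dropout, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.GELU(),
            nn.Linear(d_ff, d_model),
        )
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pre-norm: normalize, attend, then add the residual.
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + self.dropout(attn_out)
        # Same pattern for the feed-forward sub-layer.
        x = x + self.dropout(self.mlp(self.norm2(x)))
        return x

# Forward-pass test on dummy data: random sequences standing in for
# weather features already projected to d_model (shapes are assumptions).
if __name__ == "__main__":
    torch.manual_seed(0)
    batch, seq_len, d_model = 8, 24, 64
    block = PreNormEncoderBlock(d_model=d_model, n_heads=4, d_ff=256)
    dummy = torch.randn(batch, seq_len, d_model)
    out = block(dummy)
    assert out.shape == dummy.shape
    print("output shape:", out.shape)  # torch.Size([8, 24, 64])
```

Running a forward pass on random tensors like this is the same kind of shape check the video performs before wiring up the training loop: if the block preserves the expected (batch, sequence, feature) shape, the architecture is ready for real weather data.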