Intro: ChatGPT, Language Models and the Goals of Generalist Robotics Policies

Reading and exploring the data

Creating a Dataset

Creating the transformer encoder

Creating image patches to tokenize

Putting together the ViT

Training the ViT

Making the GRP, starting with adding text inputs

Modifying the data for training

Converting continuous actions to discrete bins

Standardizing the state inputs

Changing to use continuous actions

Standardizing the action space

Adding goal images to the transformer

Adding block masked attention to use either a text or an image goal

Scaling training

Training results across A100s

Evaluation using the SimpleEnv robotics simulator

Description:

Dive into a comprehensive video tutorial on building Generalist Robotics Policies from scratch. Learn how to implement the "Octo: An Open-Source Generalist Robot Policy" model step by step, starting with basic transformer code and progressing to training the model on data from the Open X-Embodiment dataset. Explore topics such as data exploration, dataset creation, transformer encoder implementation, image patch tokenization, and Vision Transformer (ViT) construction. Discover techniques for incorporating text inputs, handling continuous and discrete actions, standardizing state inputs and action spaces, and integrating goal images into the transformer architecture. Gain insights into scaling training, analyzing results across A100 GPUs, and evaluating the model using the SimpleEnv robotics simulator. Access accompanying code, project details, and additional resources to enhance your understanding of Generalist Robotics Policies and their applications in robotics.
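As a rough illustration of two preprocessing steps named in the chapter list (standardizing inputs and converting continuous actions to discrete bins), here is a minimal sketch in plain Python. It is not taken from the tutorial's code: the function names `standardize`, `to_bin`, and `from_bin`, the action range of [-1, 1], and the choice of 256 bins are all assumptions made for the example.

```python
from statistics import mean, pstdev


def standardize(values):
    """Shift and scale values to zero mean and unit standard deviation."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # fall back to 1.0 for constant inputs
    return [(v - mu) / sigma for v in values], mu, sigma


def to_bin(action, low=-1.0, high=1.0, n_bins=256):
    """Map a continuous action to an integer bin index (range is assumed)."""
    clipped = min(max(action, low), high)  # clamp out-of-range actions
    idx = int((clipped - low) / (high - low) * n_bins)
    return min(idx, n_bins - 1)  # action == high lands in the last bin


def from_bin(idx, low=-1.0, high=1.0, n_bins=256):
    """Map a bin index back to the centre of its interval."""
    width = (high - low) / n_bins
    return low + (idx + 0.5) * width
```

With these assumed settings, a round trip such as `from_bin(to_bin(0.3))` recovers the action to within half a bin width, which is the precision given up by predicting bin indices instead of continuous values.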

Building Generalist Robotics Policies from Scratch

Montreal Robotics

#Engineering #Robotics #Computer Science #Machine Learning #Artificial Intelligence #Computer Vision #Neural Networks #Transformers #Data Science #Data Preprocessing #Vision Transformers
