USENIX
Overcoming Challenges in Serving Large Language Models - SREcon23 Europe/Middle East/Africa
Explore hosting GPT-style models on Kubernetes, covering GPU sharding, tensor parallelism, and model optimization. Learn the trade-offs among latency, accuracy, and resource allocation, illustrated with a live performance demo.