1. Overview
2. First, deploy a prototype with Gradio or Streamlit
3. Model-in-server architecture
4. Model-in-database architecture
5. Model-as-a-service architecture
6. REST APIs for model services
7. Dependency management for model services
8. Containerization for model services with Docker
9. Performance optimization: to GPU or not to GPU?
10. Optimization for CPUs: distillation, quantization, and caching
11. Optimization for GPUs: batching and GPU sharing
12. Libraries for model serving on GPUs
13. Horizontal scaling
14. Horizontal scaling with container orchestration (Kubernetes)
15. Horizontal scaling with serverless services
16. Rollouts: shadows and canaries
17. Managed options for model serving: AWS SageMaker
18. Takeaways on model services
19. Moving to the edge
20. Frameworks for edge deployment
21. Making efficient models for the edge
22. Mindsets and takeaways for edge deployment
23. Takeaways for deploying ML models
Description:
Explore the process of transforming a promising machine learning model into a valuable ML-powered product in this comprehensive lecture. Learn about various deployment architectures, including model-in-server, model-in-database, and model-as-a-service. Discover how to create prototypes using tools like Gradio and Streamlit, implement REST APIs, manage dependencies, and containerize services with Docker. Delve into performance optimization techniques for both CPUs and GPUs, including distillation, quantization, caching, and batching. Examine horizontal scaling strategies, container orchestration with Kubernetes, and serverless options. Investigate rollout techniques such as shadow and canary deployments, and explore managed services like AWS SageMaker. Finally, gain insights into edge deployment, efficient model creation for edge devices, and key takeaways for successfully deploying ML models in various environments.
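To make the prototype-first idea concrete, here is a minimal sketch of a Gradio demo. The `predict` function is a placeholder standing in for a real model call; it is an illustrative assumption, not code from the lecture.

```python
# Minimal Gradio prototype sketch (assumption: `predict` wraps your model).
import gradio as gr

def predict(text: str) -> str:
    # Placeholder inference; swap in a real model call here.
    return text.upper()

# Gradio builds a web UI around the function's inputs and outputs.
demo = gr.Interface(fn=predict, inputs="text", outputs="text")

if __name__ == "__main__":
    demo.launch()  # serves the demo locally in the browser
```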
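For the model-as-a-service pattern with a REST API, a sketch with FastAPI might look like the following. The request/response schema and the toy inference logic are assumptions for illustration, not the lecture's own service.

```python
# Model-as-a-service sketch using FastAPI (schema and logic are illustrative).
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class PredictRequest(BaseModel):
    text: str

class PredictResponse(BaseModel):
    label: str

@app.post("/predict", response_model=PredictResponse)
def predict(req: PredictRequest) -> PredictResponse:
    # Placeholder inference; a real service would load and call the model.
    label = "positive" if "good" in req.text else "negative"
    return PredictResponse(label=label)

# Run with: uvicorn service:app --host 0.0.0.0 --port 8000
```

A service like this is what you would then containerize with Docker and scale horizontally behind a load balancer.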
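As one concrete instance of the CPU-side quantization the lecture mentions, PyTorch's post-training dynamic quantization can be sketched as below; the toy two-layer model is a stand-in for a real network.

```python
# Post-training dynamic quantization sketch in PyTorch (toy model is a stand-in).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2))
model.eval()

# Convert Linear layers' weights to int8; activations are quantized
# dynamically at inference time, typically shrinking the model and
# speeding up CPU inference at a small accuracy cost.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
print(quantized(x))
```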

Deployment - FSDL 2022

The Full Stack