1. Overview
2. First, deploy a prototype with Gradio or Streamlit
3. Model-in-server architecture
4. Model-in-database architecture
5. Model-as-a-service architecture
6. REST APIs for model services
7. Dependency management for model services
8. Containerization for model services with Docker
9. Performance optimization: to GPU or not to GPU?
10. Optimization for CPUs: distillation, quantization, and caching
11. Optimization for GPUs: batching and GPU sharing
12. Libraries for model serving on GPUs
13. Horizontal scaling
14. Horizontal scaling with container orchestration (Kubernetes)
15. Horizontal scaling with serverless services
16. Rollouts: shadows and canaries
17. Managed options for model serving: AWS SageMaker
18. Takeaways on model services
19. Moving to the edge
20. Frameworks for edge deployment
21. Making efficient models for the edge
22. Mindsets and takeaways for edge deployment
23. Takeaways for deploying ML models
Description:
Explore the process of transforming a promising machine learning model into a valuable ML-powered product in this comprehensive lecture. Learn about various deployment architectures, including model-in-server, model-in-database, and model-as-a-service. Discover how to create prototypes using tools like Gradio and Streamlit, implement REST APIs, manage dependencies, and containerize services with Docker. Delve into performance optimization techniques for both CPUs and GPUs, including distillation, quantization, caching, and batching. Examine horizontal scaling strategies, container orchestration with Kubernetes, and serverless options. Investigate rollout techniques such as shadow and canary deployments, and explore managed services like AWS SageMaker. Finally, gain insights into edge deployment, efficient model creation for edge devices, and key takeaways for successfully deploying ML models in various environments.
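To make the prototype-first idea concrete, here is a minimal sketch of a Gradio demo. The `predict` function is a placeholder standing in for a real model call; it is an illustrative assumption, not code from the lecture.

```python
# Minimal Gradio prototype sketch (assumption: `predict` wraps your model).
import gradio as gr

def predict(text: str) -> str:
    # Placeholder inference; swap in a real model call here.
    return text.upper()

# Gradio builds a web UI around the function's inputs and outputs.
demo = gr.Interface(fn=predict, inputs="text", outputs="text")

if __name__ == "__main__":
    demo.launch()  # serves the demo locally in the browser
```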
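For the model-as-a-service pattern with a REST API, a sketch with FastAPI might look like the following. The request/response schema and the toy inference logic are assumptions for illustration, not the lecture's own service.

```python
# Model-as-a-service sketch using FastAPI (schema and logic are illustrative).
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class PredictRequest(BaseModel):
    text: str

class PredictResponse(BaseModel):
    label: str

@app.post("/predict", response_model=PredictResponse)
def predict(req: PredictRequest) -> PredictResponse:
    # Placeholder inference; a real service would load and call the model.
    label = "positive" if "good" in req.text else "negative"
    return PredictResponse(label=label)

# Run with: uvicorn service:app --host 0.0.0.0 --port 8000
```

A service like this is what you would then containerize with Docker and scale horizontally behind a load balancer.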
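As one concrete instance of the CPU-side quantization the lecture mentions, PyTorch's post-training dynamic quantization can be sketched as below; the toy two-layer model is a stand-in for a real network.

```python
# Post-training dynamic quantization sketch in PyTorch (toy model is a stand-in).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2))
model.eval()

# Convert Linear layers' weights to int8; activations are quantized
# dynamically at inference time, typically shrinking the model and
# speeding up CPU inference at a small accuracy cost.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
print(quantized(x))
```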

Deployment - FSDL 2022

The Full Stack