Explore a conference talk on enabling elastic inference at the edge with Knative and EDL, presented by Ti Zhou and Daxiang Dong from Baidu. Learn how Baidu leverages these technologies to deploy recommendation-model inference on distributed CDN nodes, achieving a 25% reduction in core network load and a 40% decrease in network communication latency. Discover the experiences and lessons learned from optimizing edge deployment, including auto-scaling, ingress traffic optimization based on geographic location, cold-start optimization through model pre-loading, and interaction with legacy services via CloudEvents. Gain insights into PaddlePaddle, large-scale training, Elastic Deep Learning (EDL), fault-tolerant training, and elastic knowledge distillation. Understand the benefits of Knative, including Knative Serving's support for blue-green deployments, ingress, autoscaling, eventing, and observability.
Enabling Elastic Inference on Edge With Knative and EDL
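The cold-start optimization mentioned in the description comes down to loading the model before the service starts accepting traffic, rather than lazily on the first request after a scale-up. The Python sketch below illustrates that pattern only; it is not Baidu's implementation, and the loader, model path, and request/response shapes are hypothetical placeholders.

```python
# Minimal sketch of model pre-loading for cold-start mitigation: the model is
# loaded once at process startup, before the HTTP server begins serving, so the
# first request after a scale-up does not pay the model-loading cost.
# `load_recommendation_model` and "/models/recommender" are placeholders.

import json
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


def load_recommendation_model(path: str):
    """Hypothetical loader standing in for a real inference-model load."""
    time.sleep(2.0)  # simulate an expensive load kept off the request path
    return lambda features: {"score": sum(features) % 1.0}  # dummy scorer


# Pre-load at startup, not inside the request handler.
MODEL = load_recommendation_model("/models/recommender")


class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        features = json.loads(body).get("features", [])
        result = MODEL(features)  # model is already in memory
        payload = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


if __name__ == "__main__":
    # The serving platform routes traffic to the container port; 8080 is a common default.
    HTTPServer(("", 8080), InferenceHandler).serve_forever()
```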