Chapters:
1. Introduction
2. About Qualcomm AI Research
3. Challenges with AI workloads
4. Model efficiency pipeline
5. Challenges
6. DONNA
7. Four-step process
8. Example
9. Blocks
10. Models
11. Accuracy predictor
12. Yields
13. Linear regression
14. Evolutionary search
15. Evolutionary sampling
16. Fine-tuning
17. Results
18. Model pruning
19. Unstructured pruning
20. Structured compression
21. Main takeaway
22. Quantization research
23. Quantization
24. Recent papers
25. Adaptive rounding
26. AI Model Efficiency Toolkit
27. Key results
28. High-level view
29. Mixed precision
30. Mixed precision on a chip
31. APQ
32. Running networks conditionally
33. Classification example
34. Multi-scale dense nets
35. Semantic segmentation
36. Dynamic convolutions
37. Video processing
38. Skip convolutions
39. Video classification
40. Summary
41. Questions
42. Sponsors
Description:
Explore a comprehensive keynote presentation from the tinyML EMEA 2021 conference focusing on the model efficiency pipeline for enabling deep learning inference at the edge. Delve into the challenges of deploying AI applications on low-power edge devices and wearable platforms, and discover a systematic approach to optimize deep learning models. Learn about Hardware-Aware Neural Architecture Search, compression and pruning techniques, and state-of-the-art quantization tools. Gain insights into mixed-precision hardware-aware neural architecture search and conditional processing as future trends in efficient edge computing. Examine real-world examples, key results, and practical applications across various domains, including video processing and semantic segmentation.

The Model Efficiency Pipeline: Enabling Deep Learning Inference at the Edge

tinyML