Chapters:
1. Introduction
2. About Qualcomm AI Research
3. Challenges with AI workloads
4. Model efficiency pipeline
5. Challenges
6. DONNA
7. Four-step process
8. Example
9. Blocks
10. Models
11. Accuracy predictor
12. Yields
13. Linear regression
14. Evolutionary search
15. Evolutionary sampling
16. Fine-tuning
17. Results
18. Model pruning
19. Unstructured pruning
20. Structured compression
21. Main takeaway
22. Quantization research
23. Quantization
24. Recent papers
25. Adaptive rounding
26. AI Model Efficiency Toolkit
27. Key results
28. High-level view
29. Mixed precision
30. Mixed precision on a chip
31. APQ
32. Running networks conditionally
33. Classification example
34. Multi-scale dense nets
35. Semantic segmentation
36. Dynamic convolutions
37. Video processing
38. Skip convolutions
39. Video classification
40. Summary
41. Questions
42. Sponsors
Description:
Explore a comprehensive keynote presentation from the tinyML EMEA 2021 conference focusing on the model efficiency pipeline for enabling deep learning inference at the edge. Delve into the challenges of deploying AI applications on low-power edge devices and wearable platforms, and discover a systematic approach to optimize deep learning models. Learn about Hardware-Aware Neural Architecture Search, compression and pruning techniques, and state-of-the-art quantization tools. Gain insights into mixed-precision hardware-aware neural architecture search and conditional processing as future trends in efficient edge computing. Examine real-world examples, key results, and practical applications across various domains, including video processing and semantic segmentation.

The Model Efficiency Pipeline: Enabling Deep Learning Inference at the Edge

tinyML