
Intro

Action Recognition

Temporal Action Localization

Outline

Knowledge Transfer to Novel Categories

Comparison with Semantic Attributes (THUMOS)

Experiment Results on UCF101

Pipeline Overview

Detecting Actions

Experimental Setup

Detection Examples

Generalizing Faster R-CNN from 2D to 3D

Tube Proposal Network

Tube of Interest Max Pooling

Experiment Results on UCF-Sports

Evaluation on YouTube Videos

Limitations

Video Action Segmentation -- Overview

Video Object Segmentation -- Overview

Video Object Segmentation -- Encoder

Video Object Segmentation -- 3D Pyramid Pooling

Video Object Segmentation -- Decoder

Dilated Convolution

Summary

Future Work

Description:

Explore action recognition, temporal localization, and detection in trimmed and untrimmed videos through this 36-minute lecture by Rui Hou from the University of Central Florida. Dive into topics such as knowledge transfer to novel categories, comparison with semantic attributes, and experimental results on datasets like UCF101 and UCF-Sports. Learn about the pipeline for detecting actions, including the adaptation of Faster R-CNN from 2D to 3D, Tube Proposal Network, and Tube of Interest Max Pooling. Examine video action and object segmentation techniques, including encoder-decoder architectures, 3D Pyramid Pooling, and dilated convolution. Gain insights into the current limitations and future directions of research in this field.

Action Recognition, Temporal Localization and Detection in Videos

University of Central Florida


#Computer Science #Artificial Intelligence #Computer Vision #Machine Learning #Encoder-Decoder Architecture
