  • This event has passed.

Yuexi Zhang’s PhD Proposal Review

November 2, 2022 @ 12:00 pm - 1:00 pm

“Human Body and Activity Analysis”

Abstract:

Human-related applications such as person detection, human pose estimation, and human activity recognition have long drawn considerable attention in the computer vision community. In this proposal, we discuss several related topics of interest and demonstrate how we improve on existing methods. The first problem we consider is video-based human pose estimation. Most general approaches estimate human poses in each frame independently and then associate them across frames using matching or tracking methods. However, such a pipeline usually relies on complex computation and incurs significant running time. To overcome these shortcomings, we propose a lightweight network with an unsupervised training strategy that aims to reduce running time while maintaining performance. The next problem we explore is cross-view action recognition (CVAR). The goal of CVAR is to recognize a human action when it is observed from a previously unseen viewpoint. This is important for applications such as surveillance systems, where it is not practical or feasible to collect large amounts of training data when adding a new camera. This setting requires methods that can learn reliable view-invariant information from the given training viewpoints and recognize the action from an unseen viewpoint. Most approaches rely on 3D data, while the use of 2D data remains under-explored. Moreover, approaches that use only 2D data perform far worse than 3D approaches. Therefore, we propose a simple yet efficient CVAR framework that takes 2D data as input and closes the performance gap between 3D and 2D input.
The last problem we investigate is online action detection; at the current stage, we are interested in detecting action starts. Online action start detection aims to detect the start point of an action, together with its action category, as soon as it occurs in untrimmed, streaming videos, and it has potential applications such as early alert generation in surveillance systems. Typical approaches rely heavily on frame-level annotations and are limited to pre-defined action categories. Therefore, we propose a novel yet simple design, a 3D MLP-Mixer-based architecture, that aims to detect taxonomy-free action starts without using frame-level annotations.

Committee:

Dr. Octavia Camps (Advisor)

Dr. Mario Sznaier

Dr. Sarah Ostadabbas

Details

Date:
November 2, 2022
Time:
12:00 pm - 1:00 pm
Website:
https://northeastern.zoom.us/j/94979531914?pwd=Q05idUw1aysrSFg0RGJVUkwxcVZCZz09

Other

Department
Electrical and Computer Engineering
Topics
MS/PhD Thesis Defense
Audience
Faculty, Staff