Modeling spatio temporal cues in a deep learning framework for human action recognition

Abstract

This thesis investigates video analytics for human action recognition. Motivated by the proven ability of deep learning networks across a diverse range of application domains, it focuses on formulating deep learning based techniques for human action classification. These formulations are built around the vast, unexplored potential of deep learning frameworks in the application domain of human action recognition.

Human action recognition poses a wide range of challenges, including intra-class variability and dynamically varying, complex backgrounds. The complexity of the problem also depends on the number of action classes, the nature of the actions, and the dynamics of the background. Therefore, four formulations of varying complexity are presented in this thesis and demonstrated on standard datasets whose complexity matches that of the proposed formulations. The first formulation is a simple Spiking Neural Network (SNN) based classifier operating on handcrafted features. The second is built around a single-dimensional deep learning neural network classifier operating on the optimal feature set produced by a Particle Swarm Optimization (PSO) technique. The third is a hybrid deep learning framework in which a Convolutional Neural Network (CNN) based feature extractor works in conjunction with an SNN based classifier. Finally, a Siamese convolutional framework is proposed, specifically targeting human fall detection.

In the first method, the performance of the SNN as a classifier for human action recognition has been experimentally evaluated.
