Cascading pose features with CNN-LSTM for multiview human action recognition

Human Action Recognition (HAR) is a branch of computer vision that deals with the identification of human actions at various levels including low level, action level, and interaction level. Previously, a number of HAR algorithms have been proposed based on handcrafted methods for action recognition....

Bibliographic Details
Main Authors: Rehman Malik, Najeeb; Syed Abu Bakar, Syed Abdul Rahman; Sheikh, Usman Ullah; Channa, Asma; Popescu, Nirvana
Format: Article
Language: English
Published: MDPI 2023
Subjects: TK Electrical engineering. Electronics. Nuclear engineering
Online Access: http://eprints.utm.my/106948/1/SyedAbdulRahman2023_CascadingPoseFeatureswithCNNLSTM.pdf
http://eprints.utm.my/106948/
http://dx.doi.org/10.3390/signals4010002
id my.utm.106948
record_format eprints
spelling my.utm.106948 2024-08-23T01:32:46Z http://eprints.utm.my/106948/ Rehman Malik, Najeeb and Syed Abu Bakar, Syed Abdul Rahman and Sheikh, Usman Ullah and Channa, Asma and Popescu, Nirvana (2023) Cascading pose features with CNN-LSTM for multiview human action recognition. Signals, 4 (1). pp. 40-55. ISSN 2624-6120. MDPI, 2023-01-04. Article, PeerReviewed, application/pdf, en. http://eprints.utm.my/106948/1/SyedAbdulRahman2023_CascadingPoseFeatureswithCNNLSTM.pdf http://dx.doi.org/10.3390/signals4010002 DOI:10.3390/signals4010002
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic TK Electrical engineering. Electronics. Nuclear engineering
description Human Action Recognition (HAR) is a branch of computer vision that deals with identifying human actions at various levels, including low level, action level, and interaction level. A number of earlier HAR algorithms rely on handcrafted methods for action recognition. However, handcrafted techniques are inefficient at recognizing interaction-level actions, which involve complex scenarios. Meanwhile, traditional deep learning-based approaches take the entire image as input and then extract large volumes of features, which greatly increases system complexity and results in significantly higher computation time and resource utilization. Therefore, this research focuses on developing an efficient, deep learning-based, multi-view interaction-level action recognition system that uses 2D skeleton data to achieve higher accuracy while reducing computational complexity. The proposed system extracts 2D skeleton data from the dataset using the OpenPose technique. The extracted 2D skeleton features are then given directly as input to a Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM) architecture for action recognition. To reduce complexity, only the extracted features, rather than the whole image, are passed to the CNN-LSTM architecture, thus eliminating the need for feature extraction. The proposed method was compared with other existing methods, and the outcomes confirm the potential of the proposed technique. The proposed OpenPose-CNNLSTM achieved an accuracy of 94.4% on MCAD (Multi-camera action dataset) and 91.67% on IXMAS (INRIA Xmas Motion Acquisition Sequences). The proposed method also significantly decreases the computational complexity by reducing the number of input features to 50.
format Article
author Rehman Malik, Najeeb
Syed Abu Bakar, Syed Abdul Rahman
Sheikh, Usman Ullah
Channa, Asma
Popescu, Nirvana
author_sort Rehman Malik, Najeeb
title Cascading pose features with CNN-LSTM for multiview human action recognition
title_sort cascading pose features with cnn-lstm for multiview human action recognition
publisher MDPI
publishDate 2023
url http://eprints.utm.my/106948/1/SyedAbdulRahman2023_CascadingPoseFeatureswithCNNLSTM.pdf
http://eprints.utm.my/106948/
http://dx.doi.org/10.3390/signals4010002
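
The abstract states that only extracted 2D skeleton features, 50 per input, are passed to the CNN-LSTM instead of whole images. The sketch below illustrates one way such an input could be assembled. It is not the authors' code: it assumes the 50 features are the (x, y) coordinates of 25 body keypoints (the size of OpenPose's BODY_25 layout), and the helper names, the normalization by image size, and the zero-padding to a fixed sequence length are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

# Assumption: 25 keypoints per person (OpenPose BODY_25 layout), 2 coordinates each.
NUM_KEYPOINTS = 25
FEATURES_PER_FRAME = NUM_KEYPOINTS * 2  # 50 input features per frame

Keypoint = Tuple[float, float, float]  # (x, y, confidence)


def frame_to_features(keypoints: Sequence[Keypoint], width: int, height: int) -> List[float]:
    """Flatten one frame's keypoints into a 50-value vector of normalized x, y."""
    if len(keypoints) != NUM_KEYPOINTS:
        raise ValueError(f"expected {NUM_KEYPOINTS} keypoints, got {len(keypoints)}")
    features: List[float] = []
    for x, y, _confidence in keypoints:
        # Hypothetical normalization: scale pixel coordinates by the image size.
        features.append(x / width)
        features.append(y / height)
    return features


def clip_to_sequence(frames: Sequence[Sequence[Keypoint]], width: int, height: int,
                     seq_len: int) -> List[List[float]]:
    """Turn a clip into a fixed-length sequence of feature vectors.

    Clips longer than seq_len are truncated; shorter clips are zero-padded.
    """
    sequence = [frame_to_features(f, width, height) for f in frames[:seq_len]]
    while len(sequence) < seq_len:
        sequence.append([0.0] * FEATURES_PER_FRAME)
    return sequence


if __name__ == "__main__":
    # Synthetic keypoints, used only to show the resulting shape.
    frame = [(float(i), float(2 * i), 1.0) for i in range(NUM_KEYPOINTS)]
    seq = clip_to_sequence([frame, frame], width=640, height=480, seq_len=4)
    print(len(seq), len(seq[0]))  # 4 50
```

Under these assumptions, each frame contributes 50 numbers to the sequence model, compared with 640 × 480 × 3 = 921,600 values for a raw RGB frame of the size used in the example.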