Modeling Two-Person Segmentation and Locomotion for Stereoscopic Action Identification: A Sustainable Video Surveillance System

Khalid, Nida; Gochoo, Munkhjargal; Jalal, Ahmad; Kim, Kibum

Additional contact information
Nida Khalid: Department of Computer Science, Air University, Islamabad 44000, Pakistan
Munkhjargal Gochoo: Department of Computer Science and Software Engineering, United Arab Emirates University, Al Ain 15551, UAE
Ahmad Jalal: Department of Computer Science, Air University, Islamabad 44000, Pakistan
Kibum Kim: Department of Human-Computer Interaction, Hanyang University, Ansan 15588, Korea

Sustainability, 2021, vol. 13, issue 2, 1-30

Abstract: Growing demand for automatic tracking and recognition systems calls for more proficient, intelligent and sustainable human activity tracking. The main purpose of this study is to develop an accurate and sustainable human action tracking system capable of error-free identification of human movements, irrespective of the environment in which the actions are performed. To that end, we propose a stereoscopic Human Action Recognition (HAR) system based on the fusion of RGB (red, green, blue) and depth sensors. These sensors provide additional depth information, which enables three-dimensional (3D) tracking of every movement a person performs. Human actions are tracked according to four features: (1) geodesic distance; (2) 3D Cartesian-plane features; (3) joints Motion Capture (MOCAP) features; and (4) way-points trajectory generation. Particle Swarm Optimization (PSO) is applied to represent these features in an optimized form, after which a neuro-fuzzy classifier performs classification and recognition. Extensive experiments are carried out on three challenging datasets: the Nanyang Technological University (NTU) RGB+D dataset, the University of Lincoln (UoL) 3D social activity dataset and the Collective Activity Dataset (CAD). The evaluation showed that fusing vision sensors with the proposed features is an efficient approach to building a robust HAR system: it achieved a mean accuracy of 93.5% on the NTU RGB+D dataset, 92.2% on the UoL dataset and 89.6% on the Collective Activity Dataset. The developed system can play a significant role in many computer vision-based applications, such as surveillance systems and intelligent homes, offices and hospitals.
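
The abstract names Particle Swarm Optimization as the optimization step but does not give its objective function or parameters. The following is only a minimal, generic sketch of global-best PSO in Python, assuming a toy objective (`sphere`) and conventional settings for the inertia weight `w` and the acceleration coefficients `c1` and `c2`. The function name `pso_minimize` and every parameter value are illustrative assumptions, not the authors' implementation.

```python
import random


def pso_minimize(objective, dim, n_particles=30, n_iters=200,
                 bounds=(-5.0, 5.0), w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box with a basic global-best PSO.

    All defaults are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position so far.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    # The swarm shares the best position found by any particle.
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective (sum of squares); a stand-in, not the paper's objective."""
    return sum(v * v for v in x)


best_x, best_val = pso_minimize(sphere, dim=3)
```

In a feature-optimization setting, `objective` would be replaced by whatever cost is defined over the feature representation; the abstract does not specify that cost, so it is left as a placeholder here.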

Keywords: geodesic distance; human action recognition; human locomotion; neuro-fuzzy classifier; particle swarm optimization; RGB-D sensors; trajectory features
JEL-codes: O13 Q Q0 Q2 Q3 Q5 Q56
Date: 2021
Citations: 3 (as counted by EconPapers)

Downloads:
https://www.mdpi.com/2071-1050/13/2/970/pdf (application/pdf)
https://www.mdpi.com/2071-1050/13/2/970/ (text/html)

Persistent link: https://EconPapers.repec.org/RePEc:gam:jsusta:v:13:y:2021:i:2:p:970-:d:482841

Sustainability is currently edited by Ms. Alexandra Wu

Bibliographic data for series maintained by MDPI Indexing Manager.