Human action classification using CNN by encoding time series skeleton-based data as images
  • Muhammad Mu'az Imran ,
  • Azam Che Idris ,
  • Liyanage Chandratilak De Silva ,
  • Hayati Yassin
Muhammad Mu'az Imran
Universiti Brunei Darussalam

Corresponding Author: [email protected]

Abstract

The Microsoft Kinect camera can capture depth images of a subject during Human Activity Recognition (HAR) surveillance, from which skeletal data can subsequently be obtained. Several studies have analysed human actions using skeletal data together with other complex feature-extraction methods, and most authors have proposed extracting spatio-temporal information as one of these methods. This study therefore extracts the spatio-temporal information from the skeletal data automatically, using an imaging time series (ITS) method called Recurrence Plots (RP) to transform the skeleton joint coordinates into 2D images. The raw data are preprocessed and partitioned into three-channel matrices (R, G, B) before principal component analysis (PCA) is applied. The generated RP images are used as input to a Convolutional Neural Network (CNN) that distinguishes between different activities. The proposed method is benchmarked on the UTD-MHAD dataset and outperforms previous studies, with a maximum accuracy of 92.6%.
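As an illustration of the recurrence-plot idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the common unthresholded form of a recurrence plot (the pairwise Euclidean distance matrix between time steps), with an optional threshold `eps` for the binary form. The function names, the min-max scaling to 8-bit values, and the way three series are stacked into R, G and B channels are all hypothetical choices made for this example; the abstract does not specify these details, and the PCA step is omitted.

```python
import numpy as np


def recurrence_plot(series, eps=None):
    """Return a (T, T) recurrence matrix for a series of shape (T,) or (T, d).

    With eps=None the unthresholded distance matrix is returned;
    otherwise entry (i, j) is 1 when the states at times i and j are
    within eps of each other, and 0 when they are not.
    """
    x = np.asarray(series, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D series as T states of dimension 1
    # Pairwise Euclidean distances between every pair of time steps.
    diff = x[:, None, :] - x[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    if eps is None:
        return dist
    return (dist <= eps).astype(np.uint8)


def to_uint8_image(mat):
    """Min-max scale a matrix to the 0..255 range (assumed normalisation)."""
    lo, hi = mat.min(), mat.max()
    if hi == lo:
        return np.zeros(mat.shape, dtype=np.uint8)
    return np.round(255.0 * (mat - lo) / (hi - lo)).astype(np.uint8)


def rgb_recurrence_image(ch_r, ch_g, ch_b):
    """Stack three recurrence plots into one (T, T, 3) image.

    Which signal feeds which channel is an assumption of this sketch.
    """
    planes = [to_uint8_image(recurrence_plot(c)) for c in (ch_r, ch_g, ch_b)]
    return np.stack(planes, axis=-1)
```

For example, calling `rgb_recurrence_image` on three coordinate series of length T yields a T x T x 3 array that could be resized and passed to an image classifier such as a CNN.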