An Improved Empirical Hyper-Parameter Tuned Supervised Model for Human Activity Recognition based on Motion Flow and Deep Learning

doi:10.23940/ijpe.22.11.p6.808816

Abstract

Traditional pattern recognition methods rely on manual feature extraction, which can lead to poor model generalization. With the growing popularity and success of deep learning methods, they are now widely adopted in Human Activity Recognition (HAR), and HAR can in turn be extended to automated surveillance systems. In this paper, Incept_LSTM, a model based on deep learning and motion flow, is proposed. The proposed method combines a pre-trained Inception-v3 network with Long Short-Term Memory (LSTM). Hybridizing these models yields a spatio-temporal representation, which is validated by the results obtained. The proposed model is trained and validated on the UCF-Crime dataset, and the results are compared with work in the literature on the UCF-Crime, KTH, and UCF-Crime2Local datasets. It achieves an accuracy of 98.2% on training and 94.57% on validation. In a comparison of optimizers, RMSProp (as opposed to Adam) with a learning rate of 1e-6 gave the best fit, with a training loss of 0.2 and a validation loss of 0.38. The model takes advantage of motion flow computed using the Lucas-Kanade method, since motion flow is an important cue when working with video data. The proposed method outperforms state-of-the-art methods in terms of accuracy, number of parameters, and processing time. Various hyper-parameter settings are also evaluated to obtain the best training results.
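
The abstract names the Lucas-Kanade method but does not describe how it is applied, so the following is only a minimal sketch of the underlying idea, not the authors' implementation. It assumes a single small grayscale patch with one shared displacement, and solves the 2x2 least-squares system built from the spatial gradients and the temporal difference. The function names `lucas_kanade_window` and `make_patch_pair`, the 8x8 patch size, and the synthetic intensity pattern `x * y` are all hypothetical choices made for illustration.

```python
def lucas_kanade_window(prev, curr):
    """Estimate one (u, v) displacement for a small grayscale patch.

    prev, curr: equal-sized 2D lists of floats (rows of pixel intensities).
    Returns (u, v) in pixels per frame, with u along x (columns) and v along
    y (rows), or None when the gradient matrix is singular (aperture problem).
    """
    h, w = len(prev), len(prev[0])
    sxx = sxy = syy = sxt = syt = 0.0
    # Skip the border so central differences stay inside the patch.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0  # dI/dx
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0  # dI/dy
            it = curr[y][x] - prev[y][x]                  # dI/dt
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    # Solve [[sxx, sxy], [sxy, syy]] @ [u, v] = [-sxt, -syt] in closed form.
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-9:
        return None
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


def make_patch_pair(shift_x, shift_y, size=8):
    """Build a synthetic frame pair (hypothetical test data, not from the paper).

    The first frame has intensity x * y; the second is the same pattern
    displaced by (shift_x, shift_y) pixels.
    """
    prev = [[float(x * y) for x in range(size)] for y in range(size)]
    curr = [[(x - shift_x) * (y - shift_y) for x in range(size)]
            for y in range(size)]
    return prev, curr


if __name__ == "__main__":
    prev, curr = make_patch_pair(0.2, 0.1)
    print(lucas_kanade_window(prev, curr))
```

Because the method linearizes brightness constancy, the estimate is only approximate for larger displacements; practical pipelines usually apply it to many windows or tracked points per frame, often on an image pyramid.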

Key words: human activity recognition, feature extraction, Inception-v3, LSTM, optical flow, hyper-parameter tuning, UCF-Crime dataset

Palak Girdhar, Prashant Johri, and Deepali Virmani. An Improved Empirical Hyper-Parameter Tuned Supervised Model for Human Activity Recognition based on Motion Flow and Deep Learning [J]. Int J Performability Eng, 2022, 18(11): 808-816.


