Collaborative Filtering Recommendation Algorithm based on Spark

doi:10.23940/ijpe.19.03.p22.930938

Abstract

In the era of big data, information overload has become a serious problem. Recommender systems address it by analyzing users' basic information and behavior to provide personalized recommendations, and delivering those recommendations accurately and efficiently has become a pressing issue. Building on the Alternating Least Squares (ALS) collaborative filtering algorithm, this paper incorporates item similarity into the loss function to reduce the loss of item attribute information in the latent factors. A cold-start strategy is also introduced into the model to complete the recommendation. The algorithm is implemented both on the Spark distributed platform and on a single node, using the MovieLens dataset published by the GroupLens lab. The experimental results show that, compared with the traditional recommendation algorithm, the proposed algorithm better alleviates the data sparsity problem and improves both recommendation accuracy and computational efficiency.
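
For orientation, the standard regularized matrix-factorization objective that ALS minimizes can be written as below. The abstract does not give the paper's modified loss, so the symbols here are assumptions: r_{ui} is the rating of item i by user u, \Omega the set of observed ratings, \mathbf{u}_u and \mathbf{v}_i the latent factor vectors, and \lambda the regularization weight. The last expression is one common way to fold item similarity into such a loss (a similarity-weighted penalty with a hypothetical weight \beta and similarity s_{ij}); it is an illustrative sketch and not necessarily the authors' formulation.

```latex
% Standard ALS objective over the observed ratings \Omega
\min_{U,V}\; \sum_{(u,i)\in\Omega} \bigl(r_{ui} - \mathbf{u}_u^{\top}\mathbf{v}_i\bigr)^2
  + \lambda \Bigl( \sum_{u} \lVert \mathbf{u}_u \rVert^2 + \sum_{i} \lVert \mathbf{v}_i \rVert^2 \Bigr)

% With the item factors fixed, each user vector has a closed-form update,
% where \Omega_u is the set of items rated by user u:
\mathbf{u}_u = \Bigl( \sum_{i\in\Omega_u} \mathbf{v}_i \mathbf{v}_i^{\top} + \lambda I \Bigr)^{-1}
  \sum_{i\in\Omega_u} r_{ui}\, \mathbf{v}_i

% Hypothetical similarity-regularized variant (assumption, not taken from the paper):
% items i and j with high similarity s_{ij} are pulled toward similar factors
\min_{U,V}\; \sum_{(u,i)\in\Omega} \bigl(r_{ui} - \mathbf{u}_u^{\top}\mathbf{v}_i\bigr)^2
  + \lambda \Bigl( \sum_{u} \lVert \mathbf{u}_u \rVert^2 + \sum_{i} \lVert \mathbf{v}_i \rVert^2 \Bigr)
  + \beta \sum_{i,j} s_{ij}\, \lVert \mathbf{v}_i - \mathbf{v}_j \rVert^2
```

The item update is symmetric to the user update, and alternating between the two is what gives ALS its name. Because each user (or item) vector is solved independently, the updates parallelize naturally, which is why the method suits a distributed platform such as Spark.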

Key words: recommendation system, collaborative filtering, matrix decomposition, Spark

Jinhong Tao, Jianhou Gan, and Bin Wen. Collaborative Filtering Recommendation Algorithm based on Spark [J]. Int J Performability Eng, 2019, 15(3): 930-938.
