Int J Performability Eng, 2019, Vol. 15, Issue (3): 930-938. doi: 10.23940/ijpe.19.03.p22.930938


Collaborative Filtering Recommendation Algorithm based on Spark

Jinhong Tao (a), Jianhou Gan (b), and Bin Wen (a,*)

  1. a School of Information Science and Technology, Yunnan Normal University, Kunming, 650500, China;
    b Key Laboratory of Educational Informatization for Nationalities of Ministry of Education, Yunnan Normal University, Kunming, 650500, China
  • Contact: wenbin@ynnu.edu.cn
  • About the authors: Jinhong Tao is a master's student in the School of Information Science and Technology at Yunnan Normal University. His research interests include machine learning and data mining. Jianhou Gan received his Ph.D. in metallurgical physical chemistry from Kunming University of Science and Technology in 2016. He became a faculty member at Yunnan Normal University in 1998 and is currently a professor there. His research interests cover education informatization for nationalities, the semantic Web, databases, and intelligent information processing. Bin Wen received his Ph.D. in computer application technology from China University of Mining & Technology in 2013. He became a faculty member at Yunnan Normal University in 2005 and is currently an associate professor there. His research interests cover intelligent information processing and emergency management.

Abstract: In the era of big data, information overload has become a particularly serious problem. A recommendation system can provide personalized recommendations by analyzing users' basic information and behavior, and pushing information accurately and efficiently has become an urgent issue. Building on the Alternating Least Squares (ALS) collaborative filtering recommendation algorithm, this paper reduces the loss of item attribute information in the latent factors by incorporating item similarity into the loss function. A cold start strategy is also introduced into the model to complete the recommendation. The algorithm is implemented both on the Spark distributed platform and on a single node, using the MovieLens dataset published by the GroupLens Lab. The experimental results show that the proposed algorithm alleviates the data sparsity problem better than the traditional recommendation algorithm, and that it improves both recommendation accuracy and computational efficiency.

Key words: recommendation system, collaborative filtering, matrix decomposition, Spark
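
For readers unfamiliar with the baseline the abstract builds on, the following is a minimal single-node sketch of plain ALS matrix factorization in pure Python. It is not the authors' method: it omits the item-similarity term, the cold start strategy, and the Spark implementation described in the abstract. All names (`solve`, `update_factors`, `objective`, `als`) and the parameters `rank`, `lam`, and `iters` are hypothetical choices made for illustration, and ratings are assumed to be `(user, item, rating)` triples with zero-based indices.

```python
import random


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def update_factors(fixed, observed_by_row, rank, lam):
    """Hold one factor matrix fixed and ridge-regress each row of the other."""
    updated = []
    for observed in observed_by_row:  # observed: list of (index into fixed, rating)
        # Normal equations: (sum f f^T + lam * I) x = sum rating * f
        A = [[lam if i == j else 0.0 for j in range(rank)] for i in range(rank)]
        b = [0.0] * rank
        for idx, rating in observed:
            f = fixed[idx]
            for i in range(rank):
                b[i] += rating * f[i]
                for j in range(rank):
                    A[i][j] += f[i] * f[j]
        updated.append(solve(A, b))
    return updated


def objective(ratings, U, V, lam):
    """Squared error on observed ratings plus L2 penalty on both factor matrices."""
    sse = sum((r - sum(a * b for a, b in zip(U[u], V[i]))) ** 2 for u, i, r in ratings)
    reg = sum(x * x for row in U + V for x in row)
    return sse + lam * reg


def als(ratings, n_users, n_items, rank=2, lam=0.1, iters=10, seed=0):
    """Alternate between solving for user factors U and item factors V."""
    rng = random.Random(seed)
    V = [[rng.random() for _ in range(rank)] for _ in range(n_items)]
    by_user = [[] for _ in range(n_users)]
    by_item = [[] for _ in range(n_items)]
    for u, i, r in ratings:
        by_user[u].append((i, r))
        by_item[i].append((u, r))
    history = []
    for _ in range(iters):
        U = update_factors(V, by_user, rank, lam)   # fix V, solve for U
        V = update_factors(U, by_item, rank, lam)   # fix U, solve for V
        history.append(objective(ratings, U, V, lam))
    return U, V, history


if __name__ == "__main__":
    toy = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
    U, V, history = als(toy, n_users=3, n_items=3)
    print("objective per iteration:", [round(h, 4) for h in history])
```

Because each half-step exactly minimizes the regularized objective over one factor matrix, the recorded objective should not increase from one iteration to the next. A variant along the lines the abstract describes would add an item-similarity term to this objective; its exact form is not given here, so it is left out of the sketch.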