Int J Performability Eng, 2018, Vol. 14, Issue 7: 1580-1589. doi: 10.23940/ijpe.18.07.p22.15801589


A Mongolian Language Model based on Recurrent Neural Networks

Zhiqiang Ma, Li Zhang, Rui Yang, and Tuya Li   

College of Data Science and Application, Inner Mongolia University of Technology, Hohhot 010080, China

Abstract:

To address the data sparsity and long-range dependency problems that arise when training N-Gram Mongolian language models, a Mongolian Language Model based on Recurrent Neural Networks (MLMRNN) is proposed. A Mongolian classified word vector is designed and used as the input word vector of MLMRNN in the pre-training phase, and Skip-Gram word vectors carrying context information are fed to the input layer, so that the input encodes not only semantic information but also rich contextual information. This effectively alleviates data sparsity and long-range dependence. Finally, a training algorithm for MLMRNN is designed, and perplexity is adopted as the evaluation metric: the perplexities of N-Gram, RNNLM, and MLMRNN are measured on the training and test sets, respectively. The experimental results show that MLMRNN achieves lower perplexity than the other language models, improving language model performance.
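To make the idea concrete, the sketch below shows, in outline only, an RNN language model whose input layer is initialized with pre-trained Skip-Gram vectors and which is evaluated by perplexity (the exponential of the average per-token cross-entropy). This is not the authors' implementation: PyTorch, the vocabulary size, the dimensions, the random stand-in for the Skip-Gram matrix, and the toy batch are all illustrative assumptions, and the Mongolian classified word vector used during pre-training is not reproduced here.

```python
# Minimal sketch (assumed, not the paper's code): an RNN language model with
# input embeddings initialized from pre-trained Skip-Gram vectors, scored by perplexity.
import math
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim = 1000, 100, 128

# Stand-in for Skip-Gram vectors trained beforehand (e.g., word2vec in skip-gram mode);
# random values are used here purely so the sketch runs.
skipgram_vectors = torch.randn(vocab_size, embed_dim)

class RNNLM(nn.Module):
    def __init__(self):
        super().__init__()
        # Input layer carries the pre-trained Skip-Gram vectors and is fine-tuned during training.
        self.embed = nn.Embedding.from_pretrained(skipgram_vectors, freeze=False)
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        hidden, _ = self.rnn(self.embed(tokens))
        return self.out(hidden)  # logits over the next-word distribution at each step

model = RNNLM()
criterion = nn.CrossEntropyLoss()

# Toy batch: predict token t+1 from tokens up to t.
tokens = torch.randint(0, vocab_size, (4, 20))
inputs, targets = tokens[:, :-1], tokens[:, 1:]

logits = model(inputs)
loss = criterion(logits.reshape(-1, vocab_size), targets.reshape(-1))

# Perplexity = exp(average per-token cross-entropy); lower is better.
print("perplexity:", math.exp(loss.item()))
```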


Submitted on April 9, 2018; Revised on May 21, 2018; Accepted on June 23, 2018