Int J Performability Eng ›› 2018, Vol. 14 ›› Issue (12): 3066-3075. doi: 10.23940/ijpe.18.12.p16.30663075

Chinese Word Segmentation based on Bidirectional GRU-CRF Model

Jinli Che, Liwei Tang, Shijie Deng, and Xujun Su

  1. Department of Artillery Engineering, Army Engineering University, Shijiazhuang, 050003, China
  • Contact: Che Jinli, E-mail: 17603200861@163.com

Abstract:

As an effective model for processing time series data, the recurrent neural network has been widely used in sequence tagging tasks. To address Chinese word segmentation, a typical sequence tagging task, we propose an improved bidirectional gated recurrent unit conditional random field (BI-GRU-CRF) model built on the gated recurrent unit (GRU) neural network, which is easier to train than the long short-term memory (LSTM) neural network. The method not only uses text information in both directions through bidirectional gated recurrent units, but also obtains the globally optimal tag sequence by modeling the correlation between neighboring tags with a conditional random field. Experiments are carried out on common evaluation sets (PKU, MSRA, CTB) with a four-tag set and a six-tag set, respectively. The results show that the BI-GRU-CRF model achieves high performance in Chinese word segmentation and that the six-tag set can improve the performance of the network.
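
To make the tagging formulation concrete, the following minimal Python sketch converts a segmented sentence into per-character tags and back. The abstract does not list the tags themselves, so the sketch assumes the conventions commonly used in the segmentation literature: B, M, E, S for the four-tag set and B, B2, B3, M, E, S for the six-tag set. The function names and the example sentence are illustrative, not taken from the paper.

```python
def word_to_tags(word, scheme="four"):
    """Return one tag per character of a single word.

    Assumed tag sets (not stated in the abstract):
      four: B (begin), M (middle), E (end), S (single-character word)
      six:  B, B2 (second char), B3 (third char), M, E, S
    """
    n = len(word)
    if n == 0:
        raise ValueError("empty word")
    if n == 1:
        return ["S"]
    if scheme == "four":
        return ["B"] + ["M"] * (n - 2) + ["E"]
    if scheme == "six":
        # Interior characters: second gets B2, third gets B3, the rest get M.
        interior = (["B2", "B3"] + ["M"] * (n - 2))[: n - 2]
        return ["B"] + interior + ["E"]
    raise ValueError(f"unknown scheme: {scheme}")


def sentence_to_tags(words, scheme="four"):
    """Flatten a list of words into one tag per character."""
    tags = []
    for word in words:
        tags.extend(word_to_tags(word, scheme))
    return tags


def tags_to_words(chars, tags):
    """Recover the segmentation: a word closes at every E or S tag."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:  # tolerate a tag sequence that ends mid-word
        words.append(current)
    return words


if __name__ == "__main__":
    words = ["我", "喜欢", "自然语言处理"]
    for scheme in ("four", "six"):
        tags = sentence_to_tags(words, scheme)
        print(scheme, tags, tags_to_words("".join(words), tags))
```

Under these assumptions the two schemes differ only for words of three or more characters, where the six-tag set gives the second and third characters their own labels. In a tagger such as the one described, the model would predict one such tag per character, and the segmentation would be read off from the predicted sequence as in `tags_to_words`.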

Key words: recurrent neural network, BI-GRU-CRF, Chinese word segmentation, sequence tagging