Int J Performability Eng ›› 2019, Vol. 15 ›› Issue (2): 667-675. doi: 10.23940/ijpe.19.02.p31.667675

Short Text Classification based on Feature Extension using Information in Images

Shengjie Zhao a,b,* and Qianyun Jiang a

  a College of Electronic and Information Engineering, Tongji University, Shanghai, 200800, China
  b School of Software Engineering, Tongji University, Shanghai, 200800, China
  • Contact: Zhao Shengjie, E-mail: shengjiezhao@tongji.edu.cn

Abstract: With the rapid development and widespread use of the Internet, people increasingly share their lives and opinions on social networks, which produces a mass of short texts. Short texts are characterized by short length, sparse features, and a lack of contextual information, so it is difficult for conventional methods to classify them accurately. To achieve higher classification accuracy, this paper proposes a novel short text classification method based on feature extension that incorporates information from the images. Specifically, we first generate a sentence that describes the image using image captioning, and then combine the generated sentence with the text as the input to the classifier. We also introduce a similarity module that measures the correlation between the image and the short text to determine whether the two sentences should be combined. Simulation results show that our proposed model significantly outperforms state-of-the-art methods in terms of classification accuracy.
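
The gating step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the similarity module is computed, so a bag-of-words cosine similarity stands in for it here, the caption is assumed to have already been produced by some image captioning model, and the names `extend_short_text`, `cosine_similarity`, and `threshold` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the bag-of-words count vectors of two sentences.

    Assumption: this is only a stand-in for the paper's similarity module.
    """
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def extend_short_text(text, caption, threshold=0.2):
    """Append the generated caption to the short text only if the two are similar enough.

    The threshold value is a hypothetical choice, not taken from the paper.
    """
    if cosine_similarity(text, caption) >= threshold:
        return f"{text} {caption}"
    return text


# A related caption is appended; an unrelated one is discarded.
print(extend_short_text("my dog playing in the park", "a dog runs in a park"))
print(extend_short_text("great match tonight", "a plate of food on a table"))
```

The returned string would then be passed to the classifier in place of the original short text. With this sketch, a caption that shares no words with the text never extends it, which mirrors the idea of skipping images that are uncorrelated with the post.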

Key words: short text classification, image caption, feature extension, sentence similarity