Int J Performability Eng ›› 2018, Vol. 14 ›› Issue (9): 2105-2115.doi: 10.23940/ijpe.18.09.p19.21052115


Using Cross-Entropy Value of Code for Better Defect Prediction

Xian Zhang*, Kerong Ben, and Jie Zeng   

  1. Department of Computer and Data Engineering, Naval University of Engineering, Wuhan, 430033, China
  • Contact: * E-mail address: tomtomzx@foxmail.com

Abstract: Defect prediction is valuable because it can guide software inspection by predicting defective code locations, thereby improving software reliability. Many software features have been designed to help defect prediction models identify potential bugs, but no single feature set yet performs well in most cases. To improve defect prediction, this paper proposes a new code feature, the cross-entropy value of the sequence of a code fragment’s abstract syntax tree nodes (CE-AST), and develops a neural language model to measure it. To evaluate the effectiveness of CE-AST, we first investigate its power to discriminate defect-proneness. Experiments on 12 Java projects show that CE-AST is more discriminative than 45% of twenty widely used traditional features. We then investigate CE-AST’s contribution to defect prediction. When combined with different traditional feature suites as input to prediction models, CE-AST brings average performance improvements of 4.7% in Precision, 2.5% in Recall, and 3.5% in F1.
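To illustrate the idea behind CE-AST, the sketch below computes the cross-entropy of an AST node-type sequence under a simple language model. This is a minimal, hypothetical example: the paper trains a neural language model, whereas here a bigram model with add-one smoothing stands in, and the AST node names are invented for illustration. The intuition is the same: sequences that look like the training corpus get low cross-entropy; unusual ("unnatural") sequences get high cross-entropy, which the paper links to defect-proneness.

```python
import math
from collections import Counter

def train_bigram(sequences):
    """Count unigrams and bigrams over token sequences (AST node types)."""
    unigrams, bigrams = Counter(), Counter()
    for seq in sequences:
        toks = ["<s>"] + list(seq)          # sentence-start marker
        unigrams.update(toks)
        bigrams.update(zip(toks, toks[1:]))
    return unigrams, bigrams, len(unigrams)  # vocab size for smoothing

def cross_entropy(seq, unigrams, bigrams, vsize):
    """Average negative log2 probability per token (bits/token),
    using add-one (Laplace) smoothed bigram probabilities."""
    toks = ["<s>"] + list(seq)
    bits = 0.0
    for prev, cur in zip(toks, toks[1:]):
        p = (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vsize)
        bits += -math.log2(p)
    return bits / (len(toks) - 1)

# Hypothetical corpus of AST node-type sequences from "clean" code.
corpus = [["If", "Name", "Return"],
          ["If", "Name", "Return"],
          ["If", "Call", "Return"]]
uni, bi, v = train_bigram(corpus)

familiar   = cross_entropy(["If", "Name", "Return"], uni, bi, v)
unfamiliar = cross_entropy(["Return", "Call", "If"], uni, bi, v)
# A sequence resembling the corpus scores lower cross-entropy
# than an out-of-pattern one.
```

In the paper's setting, the cross-entropy score produced this way becomes one additional numeric feature per code unit, concatenated with traditional metrics before training the defect predictor.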

Key words: software reliability, defect prediction, natural language processing, language model, code naturalness, cross-entropy