Int J Performability Eng, 2021, Vol. 17, Issue (2): 178-190. doi: 10.23940/ijpe.21.02.p2.178190
• Original Article
Naina Nisar*, Nitin Rakesh, and Megha Chhabra
Contact:
* Corresponding author. E-mail address:
Naina Nisar, Nitin Rakesh, and Megha Chhabra. Review on Email Spam Filtering Techniques [J]. Int J Performability Eng, 2021, 17(2): 178-190.