Int J Performability Eng, 2024, Vol. 20, Issue (12): 741-752. doi: 10.23940/ijpe.24.12.p4.741752
Original article
Ammar Zakzouk, Bassim Oumran, and Hasan Hasan*
*Contact: Hasan Hasan, e-mail: h_hasan@albaath-univ.edu.sy
Ammar Zakzouk, Bassim Oumran, and Hasan Hasan. ALLI: A High-Performance Approach to Data Deduplication in Hadoop using Enhanced Hashing and Two-Level Indexing Techniques [J]. Int J Performability Eng, 2024, 20(12): 741-752.