Int J Performability Eng ›› 2018, Vol. 14 ›› Issue (4): 647-655.doi: 10.23940/ijpe.18.04.p7.647655

• Original articles • Previous Articles     Next Articles

Collision Analysis and an Efficient Double Array Construction Method

Lianyin Jiaa, Wenyan Chena, Jiaman Dinga, Xiaohui Yuanb, Binglin Shena, and Mengjuan Lic   

  1. aFaculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, 650500, China
    bCollege of Engineering, University of North Texas, Denton, 76203, USA
    cYunnan Normal University, Kunming, 650500, China

Abstract:

Trie is fundamental to many applications, such as natural language processing, string similarity search and join, but it suffers from a high space overhead. Double array (DA) provides a new way to reduce the space overhead but suffers from low construction efficiency. There is an urgent demand to promote the construction efficiency of DA while maintaining a low memory overhead. To address this problem, we reveal that the collisions generated during DA construction process mainly contribute to the low construction efficiency. Based on this analysis, a partition double array (PDA) is proposed in this paper. PDA can reduce the number of collisions as well as the cost of handling collisions in DA, so higher construction efficiency is guaranteed. Experiments on real dataset indicates that PDAs have a construction efficiency 15x higher than DA. We also obtain a bonus 2.7x higher retrieval efficiency compared with DA.


Submitted on January 2, 2018; Revised on February 13, 2018; Accepted on March 26, 2018
References: 17