Int J Performability Eng, 2018, Vol. 14, Issue (10): 2312-2320. doi: 10.23940/ijpe.18.10.p7.23122320

Original articles

Performance Improvements by Deploying L2 Prefetchers with Helper Thread for Pointer-Chasing Applications

Yan Huang (a), Huidong Zhu (b), and Yuhua Li (a)

  (a) College of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou, 450003, China
  (b) College of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, 450003, China

Abstract:

Modern processor microarchitectures offer advanced prefetch mechanisms designed to hide memory latency and improve application performance. However, pointer-chasing applications that use linked data structures expose memory latency that hardware prefetchers handle poorly, because the address of each node is known only after the previous node has been loaded. Helper-threaded prefetching on a chip multiprocessor (CMP) is a promising method for reducing the latency of accesses to linked data structures. In this paper, we first describe two L2 prefetchers on a CMP and two different helper-threaded prefetching techniques for pointer-chasing applications. We then reveal the limitations of the L2 prefetchers for pointer-intensive applications after the two helper-threaded prefetching techniques are applied. Finally, we optimize the deployment of the L2 prefetchers with the two helper-threaded prefetching techniques for pointer-chasing applications. The experimental results indicate that the effectiveness of the L2 prefetchers with helper threads depends on the memory access pattern of the target application, and that the optimized deployment of the L2 prefetchers further improves the performance of pointer-intensive applications.
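To make the terms in the abstract concrete, the following is a minimal sketch of pointer chasing with a helper prefetch thread. It is not the authors' implementation and does not reproduce either of the two techniques the paper evaluates. The names `node_t`, `build_list`, `free_list`, `helper_prefetch`, and `sum_with_helper` are hypothetical, and the sketch assumes a GCC or Clang toolchain (for `__builtin_prefetch`) and POSIX threads.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical linked-list node; `value` stands in for real payload data. */
typedef struct node {
    struct node *next;
    long value;
} node_t;

/* Free every node in the list. */
void free_list(node_t *head) {
    while (head != NULL) {
        node_t *next = head->next;
        free(head);
        head = next;
    }
}

/* Build a list holding 1..n; returns NULL if n == 0 or allocation fails. */
node_t *build_list(size_t n) {
    node_t *head = NULL;
    for (size_t i = n; i > 0; i--) {
        node_t *nd = malloc(sizeof *nd);
        if (nd == NULL) {
            free_list(head);
            return NULL;
        }
        nd->value = (long)i;
        nd->next = head;
        head = nd;
    }
    return head;
}

/* Helper thread: walks the same list and only touches the links, so it can
 * run ahead of the main thread and pull nodes into the cache hierarchy.
 * __builtin_prefetch is a GCC/Clang builtin; a prefetch hint never faults,
 * so passing a NULL next pointer at the tail is harmless. */
static void *helper_prefetch(void *arg) {
    for (const node_t *p = arg; p != NULL; p = p->next) {
        __builtin_prefetch(p->next, 0, 3);
    }
    return NULL;
}

/* Main thread: the pointer-chasing loop. Each load of p->next depends on the
 * previous one, which is why stride-based hardware prefetchers struggle. */
long sum_with_helper(const node_t *head) {
    pthread_t helper;
    int started =
        pthread_create(&helper, NULL, helper_prefetch, (void *)head) == 0;

    long sum = 0;
    for (const node_t *p = head; p != NULL; p = p->next) {
        sum += p->value;
    }

    if (started) {
        pthread_join(helper, NULL);
    }
    return sum;
}
```

Both threads only read the list, so no locking is needed here. The sketch also leaves out any synchronization that would keep the helper a bounded distance ahead of the main thread, so a real scheme would need more than this to avoid the helper running too far ahead or falling behind.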


Submitted on July 10, 2018; Revised on August 12, 2018; Accepted on September 11, 2018
References: 17