Basic article information

Title: Combining Local and Global History for High Performance Data Prefetching
Authors: Martin Dimitrov; Huiyang Zhou
Journal: The Journal of Instruction-Level Parallelism
Electronic ISSN: 1942-9525
Publication year: 2011
Volume: 13
Pages: 1-14
Publisher: International Symposium on Microarchitecture
Abstract: In this paper, we present our design of a high performance prefetcher, which exploits various localities in both local cache-miss streams (misses generated from the same instruction) and the global cache-miss address stream (the misses from different instructions). Besides the stride and context localities that have been exploited in previous work, we identify new data localities and incorporate novel prefetching algorithms into our design. In this work, we also study the (largely overlooked) importance of eliminating redundant prefetches. We use logic to remove local (by the same instruction) redundant prefetches and we use a Bloom filter or miss status handling registers (MSHRs) to remove global (by all instructions) redundant prefetches. We evaluate three different design points of the proposed architecture, trading off performance for complexity and latency efficiency. Our experimental results based on a set of SPEC 2006 benchmarks show that the proposed design significantly improves the performance (over 1.6X for our highest performance design point) at a small hardware cost for various processor, cache and memory bandwidth configurations.
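
Illustrative note: the abstract mentions using a Bloom filter to remove global redundant prefetches. The sketch below is one possible software model of that general idea, not the authors' hardware design. It assumes the filter is keyed by cache-block address and that a 64-byte block size applies; the names `PrefetchBloomFilter` and `filter_prefetches`, the filter size, and the hash choice are all hypothetical. Real hardware would likely use much cheaper hash functions and some policy for clearing the filter, neither of which the abstract specifies.

```python
import hashlib


class PrefetchBloomFilter:
    """Toy Bloom filter over cache-block addresses (all parameters are assumptions)."""

    def __init__(self, num_bits=1024, num_hashes=2, block_bytes=64):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.block_bytes = block_bytes
        self.bits = [False] * num_bits

    def _positions(self, addr):
        # Map the byte address to its cache block, then derive one bit
        # position per hash function by salting the block number.
        block = addr // self.block_bytes
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(f"{i}:{block}".encode(), digest_size=8).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def seen(self, addr):
        # True means "possibly prefetched already" (false positives can occur);
        # False means "definitely not recorded".
        return all(self.bits[p] for p in self._positions(addr))

    def record(self, addr):
        for p in self._positions(addr):
            self.bits[p] = True


def filter_prefetches(candidates, bloom):
    """Return only the candidate addresses whose block has not been recorded yet."""
    issued = []
    for addr in candidates:
        if bloom.seen(addr):
            continue  # treated as redundant, so drop it
        bloom.record(addr)
        issued.append(addr)
    return issued


if __name__ == "__main__":
    bloom = PrefetchBloomFilter()
    # 0x1000 and 0x1008 fall in the same 64-byte block, so only the first is issued.
    print([hex(a) for a in filter_prefetches([0x1000, 0x1008, 0x1000], bloom)])
```

A Bloom filter never reports a recorded block as unseen, so every true duplicate is dropped; the cost is that an occasional useful prefetch may be suppressed by a false positive.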