|
A Two-Level LoadStore Queue Based on Execution Locality.pdf
(623.02 KB, downloads: 22)
Abstract
Multicore processors have emerged as a powerful platform
on which to efficiently exploit thread-level parallelism (TLP).
However, due to Amdahl’s Law, such designs will be increasingly
limited by the remaining sequential components of applications.
To overcome this limitation it is necessary to design
processors with many lower-performance cores for TLP and
some high-performance cores designed to execute sequential
algorithms. Such cores will need to address the memory-wall
by implementing kilo-instruction windows.
Large window processors require large Load/Store Queues
that would be too slow if implemented using current CAM-based
designs. This paper proposes an Epoch-based Load
Store Queue (ELSQ), a new design based on Execution Locality.
It is integrated into a large-window processor that has a
fast, out-of-order core operating only on L1/L2 cache hits and
N slower cores that process L2 misses and their dependent
instructions. The large LSQ is coupled with the slow cores and
is partitioned into N small and local LSQs, one per core.
We evaluate ELSQ in a large-window environment, finding
that it enables high performance at low power. By exploiting
locality among loads and stores, ELSQ outperforms even an
idealized central LSQ when implemented on top of a decoupled
processor design. |
|