EETOP 创芯网 forum (formerly: 电子顶级开发网)


Continued: the latest microprocessor and memory papers from JSSC, all from the last 3 years

Posted on 2008-7-19 09:09:39

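It got too late yesterday and I didn't finish uploading, so I'm picking it up again today. (This thread continues yesterday's thread, so the paper numbering carries on from there.)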

PS: Some paper titles may contain a form like "1MB per s". The original reads "1MB/s". File names in Windows XP cannot contain "/", so I changed it that way. The meaning is unchanged.
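For anyone renaming files the same way, here is a minimal sketch of that substitution in Python. The function name is hypothetical, and it only implements the one rule mentioned in the post ("/" becomes " per "). Other characters that Windows rejects in file names are not handled.

```python
def windows_safe_title(title: str) -> str:
    """Replace '/' with ' per ' so a paper title can serve as a file name.

    Only the single rule described in the post is applied (assumption:
    no other forbidden characters need handling).
    """
    return title.replace("/", " per ")


if __name__ == "__main__":
    print(windows_safe_title("1MB/s"))
```

Applied to the example from the post, "1MB/s" becomes "1MB per s".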

1 A 65-nm Dual-Core Multithreaded Xeon® Processor with 16-MB L3 Cache

Abstract-This paper describes a dual-core 64-b Xeon MP processor implemented in a 65-nm eight-metal process. The 435-mm² die has 1.328-B transistors. Each core has two threads and a unified 1-MB L2 cache. The 16-MB shared, 16-way set-associative L3 cache implements both sleep and shut-off leakage reduction modes. Long channel transistors are used to reduce subthreshold leakage in cores and uncore (all portions of the die that are outside the cores) control logic. Multiple voltage and clock domains are employed to reduce power.


A 65-nm Dual-Core Multithreaded Xeon® Processor with 16-MB L3 Cache.pdf (4.5 MB, downloads: 90)

Posted on 2008-7-19 09:22:55
Thanks
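Original poster | Posted on 2008-7-19 09:23:33
Here are 3 more papers. The files are not very large, so I compressed all three into one archive to save everyone a bit of money.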


2 An 8T-SRAM for Variability Tolerance and Low-Voltage Operation in High-Performance Caches
Abstract-An eight-transistor (8T) cell is proposed to improve variability tolerance and low-voltage operation in high-speed SRAM caches. While the cell itself can be designed for exceptional stability and write margins, array-level implications must also be considered to achieve a viable memory solution. These constraints can be addressed by modifying traditional 6T-SRAM techniques and conceding some design complexity and area penalties. Altogether, 8T-SRAM can be designed without significant area penalty over 6T-SRAM while providing substantially improved variability tolerance and low-voltage operation with no need for secondary or dynamic power supplies. The proposed 8T solution is demonstrated in a high-performance 32 kb subarray designed in 65 nm PD-SOI CMOS that operates at 5.3 GHz at 1.2 V and 295 MHz at 0.41 V.

3 Xetal-II: A 107 GOPS, 600 mW Massively Parallel Processor for Video Scene Analysis
Abstract-Xetal-II is a single-instruction multiple-data (SIMD) processor with 320 processing elements. It delivers a peak performance of 107 GOPS on 16-bit data while dissipating 600 mW. A 10 Mbit on-chip memory is provided which can store up to four VGA frames, allowing efficient implementation of frame-iterative algorithms. A massively parallel interconnect provides an internal bandwidth of more than 1.3 Tbit/s to sustain the peak performance. The IC is realized in 90 nm CMOS and takes up 74 mm².
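The figures in the Xetal-II abstract can be cross-checked with some quick arithmetic. The sketch below is only a back-of-the-envelope check. The VGA resolution (640 x 480), the 8 bits per pixel, and the reading of 1 Mbit as 10^6 bits are assumptions, since the abstract does not state them.

```python
# Figures taken from the abstract
PEAK_OPS_PER_S = 107e9   # 107 GOPS on 16-bit data
POWER_W = 0.6            # 600 mW
NUM_PES = 320            # processing elements

# Assumptions (not stated in the abstract)
ON_CHIP_BITS = 10e6      # 10 Mbit, taking 1 Mbit = 10**6 bits
VGA_PIXELS = 640 * 480   # VGA frame size
BITS_PER_PIXEL = 8       # one 8-bit sample per pixel

ops_per_pe = PEAK_OPS_PER_S / NUM_PES             # peak ops/s per processing element
gops_per_watt = PEAK_OPS_PER_S / POWER_W / 1e9    # peak energy efficiency
four_frames_bits = 4 * VGA_PIXELS * BITS_PER_PIXEL

print(f"peak per PE: {ops_per_pe / 1e6:.1f} MOPS")
print(f"efficiency:  {gops_per_watt:.1f} GOPS/W")
print(f"four frames: {four_frames_bits} bits, fits: {four_frames_bits <= ON_CHIP_BITS}")
```

Under these assumptions four frames take about 9.8 Mbit, which is consistent with the "up to four VGA frames" claim. A deeper pixel format would not fit.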

4 Heterogeneous Multi-Core Architecture That Enables 54x AAC-LC Stereo Encoding
Abstract-This paper describes a heterogeneous multi-core processor (HMCP) architecture that integrates general-purpose processors (CPUs) and accelerators (ACCs) to achieve exceptional performance as well as low-power consumption for the SoCs of embedded systems. The memory architectures of CPUs and ACCs were unified to improve programming and compiling efficiency. Advanced audio codec-low complexity (AAC-LC) stereo audio encoding was parallelized on a heterogeneous multi-core having homogeneous processor cores and dynamically reconfigurable processor (DRP) ACC cores in a preliminary evaluation of the HMCP architecture. The performance evaluation revealed that 54x AAC encoding was achieved on the chip with two CPUs at 600 MHz and two DRPs at 300 MHz, which achieved encoding of an entire CD within 1–2 min.
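The "entire CD within 1–2 min" figure follows from the 54x number if 54x is read as 54 times real-time playback speed. A small sketch of that arithmetic is below. The CD lengths of 74 and 80 minutes are assumed typical capacities, not values from the abstract, and the function name is hypothetical.

```python
SPEEDUP = 54  # "54x AAC encoding" from the abstract, read as 54x real time (assumption)


def encode_minutes(audio_minutes: float, speedup: float = SPEEDUP) -> float:
    """Wall-clock minutes needed to encode audio_minutes of audio."""
    return audio_minutes / speedup


if __name__ == "__main__":
    # Assumed CD lengths in minutes (not given in the abstract)
    for cd_minutes in (74, 80):
        print(f"{cd_minutes} min CD -> {encode_minutes(cd_minutes):.2f} min to encode")
```

Both assumed CD lengths land between 1 and 2 minutes, which matches the abstract's claim.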


3 papers.rar (4.29 MB, downloads: 30)
Original poster | Posted on 2008-7-19 09:28:19
5 Hotspot-Limited Microprocessors: Direct Temperature and Power Distribution Measurements

Abstract-An experimental technique is presented, which allows for spatially-resolved imaging of microprocessor power (SIMP). In a first step this method utilizes infrared (IR) thermal imaging, while the processor is effectively cooled using an IR-transparent heat sink. In the second step the underlying power distribution is derived by determining the temperature fields for each individual power source on the chip. The measured chip temperature distribution is represented as a superposition of these temperature fields. The SIMP data reveals significant temporal and spatial variations of the microprocessor power/temperature distribution, which can be attributed to the circuit layout as well as to the varying utilization levels across the processor while running full workloads. In this paper we have applied the SIMP method to the dual core PowerPC™ 970MP microprocessor to measure detailed temperature and power distributions under full operating conditions. In the first part of the paper the impact of power and temperature limitations of high performance CMOS chips is discussed in detail, where we distinguish between hotspot-limited (or temperature-limited) and power-limited chips. The discussion shows the importance of temperature and power distributions for chip floor planning, layout, design and architecture. Second, we present the experimental details of the SIMP method, which is applied to the dual core PowerPC 970MP to directly measure the temperature and power fields as a function of workload and frequency. A pronounced movement of the hotspot location is observed. Finally, the hotspot of a competitive microprocessor is compared by measuring temperature efficiencies (temperature increase/performance) for the same workloads and cooling conditions.
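The superposition step described in the abstract can be illustrated with a toy example. This is only a sketch under assumptions: two power sources, made-up per-watt temperature fields sampled at four points, and an ordinary least-squares fit, which the abstract does not name as the authors' method. All function and variable names are hypothetical.

```python
def fit_two_source_powers(measured, field_a, field_b):
    """Least-squares fit of measured ~= p_a * field_a + p_b * field_b.

    field_a and field_b are the temperature rises per watt of each source at
    the sample points; the return value is the pair (p_a, p_b) in watts.
    """
    saa = sum(a * a for a in field_a)
    sbb = sum(b * b for b in field_b)
    sab = sum(a * b for a, b in zip(field_a, field_b))
    sam = sum(a * m for a, m in zip(field_a, measured))
    sbm = sum(b * m for b, m in zip(field_b, measured))
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("temperature fields are linearly dependent")
    # Solve the 2x2 normal equations with Cramer's rule
    p_a = (sam * sbb - sab * sbm) / det
    p_b = (saa * sbm - sab * sam) / det
    return p_a, p_b


if __name__ == "__main__":
    # Made-up fields (K per W) at four sample points
    field_a = [1.0, 0.5, 0.1, 0.0]
    field_b = [0.0, 0.2, 0.6, 1.0]
    # Synthetic "measurement" generated from 10 W and 4 W
    measured = [10.0 * a + 4.0 * b for a, b in zip(field_a, field_b)]
    print(fit_two_source_powers(measured, field_a, field_b))
```

With noise-free synthetic data the fit recovers the powers used to generate the measurement. A real chip has many more sources and sample points, so the same idea becomes a larger linear system.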


Hotspot-Limited Microprocessors-Direct Temperature and Power Distribution Measurements.pdf (3.92 MB, downloads: 18)
Original poster | Posted on 2008-7-19 09:31:53
6 The Design and Implementation of the Massively Parallel Processor Based on the Matrix Architecture

Abstract-This paper describes the design and implementation of the massively parallel processor based on the matrix architecture which is suitable for portable multimedia applications. The proposed architecture in this paper achieves the high performance of 40 GOPS in the case of consecutive fixed-point 16-bit additions at 200 MHz clock frequency and the small power dissipation of 250 mW. In addition, 1 Mbit SRAM for data registers and 2048 2-bit-grained processing elements connected by a flexible switching network are integrated in the small area of 3.1 mm² in 90 nm CMOS low standby technology. These design techniques and architectures described in this paper are attractive for realizing area-efficient, energy-efficient, and high-performance multimedia processors.


This paper's argument is excellent, and its treatment of the matrix architecture is thorough and vivid. Strongly recommended!
abbr_d9b5ba6ddd835a903a94dccadbb6a33a.pdf (2.84 MB, downloads: 26)