Intel TeraFlop Research Chip: A Glimpse of Things to Come

Intel Mar 14, 2007

This week Intel announced a test chip for its Terascale Computing Research program. The program's objective is eventually to bring supercomputer-class performance into mainstream computing.

The Teraflops Research Chip is an 80-core design. Keep in mind that this is not an MPU architecture ready for commercialization but a test vehicle. An individual core does not have the same capability as the current complex cores in the 80x86 architecture; each core is relatively simple in design.

Each of the 80 tiles includes a compute element and a router. The router connects the tile to each of its nearest neighbors and, in the "z" direction, to an SRAM chip, which is a memory stacked above the MPU. The processing element itself consists of data memory, instruction memory, and two floating point engines. Any tile can communicate with any other tile via the router, at 80 GB/sec. The instruction set is NOT IA (standard x86) but a VLIW. The research chip has 100 million transistors and is manufactured on Intel's standard 65nm process.
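
To illustrate what nearest-neighbor routing between tiles implies, here is a minimal Python sketch. It rests on assumptions that are not in Intel's announcement as reported here: the 80 tiles are laid out as an 8 by 10 grid, and a message travels along one axis first and then the other (dimension-order routing). The names xy_route and hop_count are hypothetical and exist only for this example.

```python
from typing import List, Tuple

# Assumption: the article gives only the tile count (80). The 8 x 10
# arrangement below is an illustrative guess at the layout.
ROWS, COLS = 8, 10

Tile = Tuple[int, int]  # (row, column)


def xy_route(src: Tile, dst: Tile) -> List[Tile]:
    """List the tiles a message visits, moving across columns first, then rows.

    This is plain dimension-order routing, assumed here for illustration;
    the article does not say which routing scheme the chip uses.
    """
    for r, c in (src, dst):
        if not (0 <= r < ROWS and 0 <= c < COLS):
            raise ValueError(f"tile ({r}, {c}) is outside the {ROWS}x{COLS} grid")

    r, c = src
    path = [src]
    # Step one neighbor at a time along the column axis.
    while c != dst[1]:
        c += 1 if dst[1] > c else -1
        path.append((r, c))
    # Then step one neighbor at a time along the row axis.
    while r != dst[0]:
        r += 1 if dst[0] > r else -1
        path.append((r, c))
    return path


def hop_count(src: Tile, dst: Tile) -> int:
    """Number of router-to-router hops between two tiles."""
    return len(xy_route(src, dst)) - 1
```

Under these assumptions the longest trip, from one corner of the grid to the opposite corner, is hop_count((0, 0), (7, 9)), which is 16 hops. The point is only that any tile can reach any other through a chain of neighbor links, with no shared bus.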

A key feature is the clocking scheme, which Intel calls mesochronous clocking. This is a modular clock as opposed to a single global clock. It allows fine-grain power management, thereby saving power. Intel claims an energy efficiency of 16 gigaflops/watt.
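
As a rough sense of scale for that efficiency figure, the short Python sketch below works through the arithmetic. It assumes the chip sustains exactly one teraflop (1,000 gigaflops) and that the 16 gigaflops/watt claim applies at that operating point; neither assumption is stated in the announcement, and the function names are hypothetical.

```python
# Figures from the article.
EFFICIENCY_GFLOPS_PER_WATT = 16.0  # Intel's claimed efficiency
CORES = 80                         # tiles on the research chip

# Assumption (not from the article): exactly one teraflop sustained.
TARGET_GFLOPS = 1000.0


def power_watts(gflops: float, gflops_per_watt: float) -> float:
    """Power needed to sustain a given throughput at a given efficiency."""
    return gflops / gflops_per_watt


def per_core(value: float, cores: int) -> float:
    """Split a chip-level figure evenly across the cores."""
    return value / cores


chip_power = power_watts(TARGET_GFLOPS, EFFICIENCY_GFLOPS_PER_WATT)
print(chip_power)                        # 62.5 watts for the whole chip
print(per_core(TARGET_GFLOPS, CORES))    # 12.5 gigaflops per core
print(per_core(chip_power, CORES))       # 0.78125 watts per core
```

In other words, if the claim holds at a full teraflop, the whole chip would draw roughly 62.5 watts, or well under one watt per tile.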

Intel is using this chip to explore several important technologies and techniques. Among these are:

networking, which Intel refers to as "network on a chip"
clocking
fine-grain power management
memory stacking
memory bandwidth

One point Semico discussed with Intel was that this kind of test vehicle offers the potential to explore new software schemes, in particular partitioning and virtualization, for flexible and dynamic MPU designs. That is, various sets of tiles can be clustered together conceptually to perform specific functions in parallel with other such clusters performing other functions. These clusters could be adjusted in size or reconfigured dynamically depending on the application requirements.

Intel confirmed that this is precisely one of the areas it is exploring. Examples of tasks that such a scheme could address are clusters of tiles performing speech recognition, vision, graphics, etc.
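
The clustering idea can be sketched in a few lines of Python. This is purely conceptual and is not based on any Intel software: the TilePool class, its resize method, and the cluster names are hypothetical, and the sketch ignores which tiles are physically adjacent.

```python
from typing import Dict, Set


class TilePool:
    """Hypothetical bookkeeping for handing out tiles to named clusters."""

    def __init__(self, total_tiles: int = 80) -> None:
        self.free: Set[int] = set(range(total_tiles))
        self.clusters: Dict[str, Set[int]] = {}

    def resize(self, name: str, size: int) -> Set[int]:
        """Grow or shrink the cluster called `name` to exactly `size` tiles."""
        if size < 0:
            raise ValueError("cluster size cannot be negative")

        current = self.clusters.get(name, set())
        if size > len(current):
            needed = size - len(current)
            if needed > len(self.free):
                raise RuntimeError("not enough free tiles")
            # Take the lowest-numbered free tiles (an arbitrary policy).
            for tile in sorted(self.free)[:needed]:
                self.free.remove(tile)
                current.add(tile)
        else:
            # Return the highest-numbered tiles to the free pool.
            surplus = len(current) - size
            for tile in sorted(current, reverse=True)[:surplus]:
                current.remove(tile)
                self.free.add(tile)

        self.clusters[name] = current
        return current


pool = TilePool()
pool.resize("speech", 20)   # 20 tiles for speech recognition
pool.resize("vision", 40)   # 40 tiles for vision
pool.resize("speech", 10)   # speech shrinks; 10 tiles go back to the pool
```

After these three calls, 30 tiles remain free for another cluster such as graphics. A real scheme would also have to consider tile placement, memory, and the cost of moving work between tiles, none of which is modeled here.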

Intel did not discuss any road map for the commercialization of this technology nor any possible end product architecture.

Semico Spin

A Glimpse of the Future: The Continuing Evolution of the 80x86

For a long time the main focus for Intel and AMD was to keep increasing frequency in order to reach new performance levels. When it was clear they were hitting the thermal wall, both companies focused on increasing levels of internal parallelism and multi-core designs.

With AMD's announcement of Fusion and this recent development by Intel, we are seeing a deconstruction of the 80x86. Does anyone recall the RISC vs CISC debate? The 80x86 still looks like a CISC architecture to software developers, but internally it is becoming more and more of a RISC machine.

Intel and AMD are each taking a different approach. Intel did not actually say that this test chip will be the basis for a future generation 80x86. However, there is the possibility that future MPU designs will not be multiple complex cores, but smaller, less complex cores in a flexible, reconfigurable topology. AMD's Fusion will be an integration of AMD's MPU technology with the graphics technology acquired from ATI. However, this is not a case of "bolting" a GPU core next to a CPU core. The MPU and GPU elements are broken down into smaller units. AMD refers to this as a flexible, modular design approach.

As consumers, we had to wean ourselves off the frequency metric for PCs and start thinking about efficiency. We have dual-core MPUs, and single-chip quad-cores are on the horizon. The ultimate goal is to increase data throughput while keeping power consumption in check. Future MPU innovations from Intel and AMD will require new metrics for the consumer.