Wednesday, February 14, 2007

Fastest On-Chip Dynamic Memory Technology from IBM

Almost as an answer to Monday's announcement of Intel's teraflops chip, IBM announced today at the International Solid-State Circuits Conference in San Francisco that it has devised a way to triple the amount of memory stored on chips and double the performance of data-hungry processors, by replacing a problematic type of memory with one that uses much less space on the slice of silicon.

IBM's new technology, also built in stress-enabled 65nm Silicon-on-Insulator (SOI) with a deep-trench process, delivers on-processor memory performance in about one-third the space and with one-fifth the standby power of conventional SRAM (static random access memory). The new memory technology will help unclog the crippling bottlenecks that build up as increasingly powerful microprocessors try to retrieve data from a separate memory chip faster than it can be delivered.

IBM said its solution entails swapping out most of the static random access memory (SRAM) used to store information directly on computer chips and integrating another kind of memory, dynamic random access memory (DRAM), onto the chip instead. SRAM is fast and easy to manufacture, but it takes up a lot of valuable space on the die. DRAM, the most common type of memory used in personal computers, has typically lived on a separate chip and was previously considered too slow to be integrated directly onto the microprocessor.

The company has sped up DRAM to the point where it is nearly as fast as SRAM. The result is a type of memory known as embedded DRAM, or eDRAM, which helps boost the performance of chips with multiple core calculating engines and is particularly suited to moving graphics data in gaming and other multimedia applications.

The prototype 500 MHz chip created by IBM was able to move data in and out of the memory in random fashion with a latency of under 1.5 nanoseconds. That is slower than SRAM, roughly half the speed of the fastest SRAM on the market, but considerably faster than ordinary DRAM, which can take almost ten times as long for the same task. And because a DRAM cell uses far fewer transistors and leaks less power, IBM says it can fit three times as much cache on the chip, which makes up for some of eDRAM's speed disadvantage relative to SRAM.
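Why can a cache that is twice as slow per access still pay off? A back-of-the-envelope average-memory-access-time (AMAT) calculation shows the idea. Only the 1.5 ns eDRAM latency and the 3x density claim come from IBM's announcement; the SRAM latency, miss rates, and off-chip penalty below are illustrative assumptions.

```python
import math

# Hedged sketch: classic AMAT model, AMAT = hit time + miss rate * miss penalty.
# All figures except the 1.5 ns eDRAM latency and the 3x density claim are
# assumptions chosen for illustration, not numbers from IBM.

def amat(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average memory access time: hit cost plus expected miss cost."""
    return hit_time_ns + miss_rate * miss_penalty_ns

# Assumed SRAM cache: ~0.75 ns hit time (half eDRAM's 1.5 ns),
# 10% miss rate, and a 50 ns penalty for fetching data off-chip.
sram_amat = amat(0.75, 0.10, 50.0)

# eDRAM cache: twice the hit latency, but 3x the capacity in the same area.
# A common rule of thumb has the miss rate falling roughly with the square
# root of capacity: 0.10 / sqrt(3) ~= 5.8%.
edram_amat = amat(1.5, 0.10 / math.sqrt(3), 50.0)

print(f"SRAM  AMAT: {sram_amat:.2f} ns")   # 0.75 + 5.00 = 5.75 ns
print(f"eDRAM AMAT: {edram_amat:.2f} ns")  # 1.50 + ~2.89 = ~4.39 ns
```

Under these assumed numbers the larger-but-slower eDRAM cache wins overall, because fewer trips off-chip more than repay the slower hits.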

The technology is expected to be a key feature of IBM's 45nm microprocessor roadmap; it will appear first in IBM's server chips in 2008 and later expand to other products.




Monday, February 12, 2007

Intel's Teraflops-Capable Chip

Intel has designed an experimental computer chip with 80 separate processing engines (or cores) that promises to perform calculations as quickly as an entire data center while consuming about as much energy as a light bulb. The world's biggest chipmaker announced today that it has developed a programmable processor that can perform about a trillion calculations per second, a performance of 1.01 teraflops. It accomplishes this feat while consuming 62 watts of power with the chip running at a frequency of 3.16 gigahertz.
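The quoted figures are internally consistent: dividing 1.01 teraflops across 80 cores at 3.16 GHz implies each core sustains about four floating-point operations per clock cycle, a quick check worth doing on any such announcement.

```python
# Sanity check on Intel's figures: 1.01 teraflops from 80 cores at 3.16 GHz
# implies roughly 4 floating-point operations per core per cycle.
cores = 80
freq_hz = 3.16e9
total_flops = 1.01e12

flops_per_core_per_cycle = total_flops / (cores * freq_hz)
print(f"{flops_per_core_per_cycle:.2f} flops per core per cycle")  # ~4.00
```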

"ASCI Red", the first computer to benchmark at Teraflops in 1996, took up more than 600 square metres at Sandia National Laboratories and used nearly 10,000 Pentium Pro processors running at 200MHz and consuming 500kW of power. It required an additional 500 kW just to keep the room cool.

Intel's latest chip is still in the research phase, but it marks an important breakthrough for an industry obsessed with obtaining the highest amount of performance for the lowest energy consumption. The chip's design is meant to exploit a new generation of manufacturing technology the company introduced last month (our past posting). Intel said that it had changed the basic design of transistors in such a way that it could continue to shrink them to smaller sizes, offering lower power and higher speeds, for at least another half-decade.

Technology experts praised Intel for devising a clever way to get 80 core calculating engines onto a single slice of silicon. The cores used on the research chip are much smaller and simpler than those used in Intel's latest line of chips, which have two or four cores. The research chip has 100 million transistors on it, about one-third the number on Intel's current line of chips.

The first uses for the chips would likely be in corporate data centers, supercomputers, communications infrastructures and for heavy-duty financial and scientific research. Intel suggested one possible consumer use: a program that intelligently monitors a televised sporting event and automatically identifies and compiles key highlights like a slam dunk or a home run by a favourite player based on the spectator's preferences. Other uses could be artificial intelligence, realistic 3-D computer modeling and real-time speech recognition.

Already, computer networking companies and the makers of PC graphics cards are moving to processor designs that have hundreds of computing engines, but only for special applications. For example, Cisco Systems now uses a chip called Metro with 192 cores in its high-end network routers. Last November Nvidia introduced its most powerful graphics processor, the GeForce 8800, which has 128 cores.

The Intel demonstration suggests that the technology may come to dominate mainstream computing in the future. While the chip is not compatible with Intel’s current chips, the company said it had already begun designing a commercial version that would essentially have dozens or even hundreds of Intel-compatible microprocessors laid out in a tiled pattern on a single chip.