
Memory Chips That Compute Will Speed Up AI

John von Neumann’s original computer architecture, where logic and memory are separate domains, has had a good run. But some companies are betting that it’s time for a change.

In recent years, the shift toward more parallel processing and a massive increase in the size of neural networks mean processors need to access more data from memory more quickly. And yet “the performance gap between DRAM and processor is wider than ever,” says Joungho Kim, an expert in 3D memory chips at Korea Advanced Institute of Science and Technology, in Daejeon, and an IEEE Fellow. The von Neumann architecture has become the von Neumann bottleneck.

What if, instead, at least some of the processing happened in the memory? Less data would have to move between chips, and you’d save energy, too. It’s not a new idea. But its moment may finally have arrived. Last year, Samsung, the world’s largest maker of dynamic random-access memory (DRAM), started rolling out processing-in-memory (PIM) tech. Its first PIM offering, unveiled in February 2021, integrated AI-focused compute cores inside its Aquabolt-XL high-bandwidth memory. HBM is the kind of specialized DRAM that surrounds some top AI accelerator chips. The new memory is designed to act as a “drop-in replacement” for ordinary HBM chips, said Nam Sung Kim, an IEEE Fellow, who was then senior vice president of Samsung’s memory business unit.

Last August, Samsung revealed results from tests in a partner’s system. When used with the Xilinx Virtex UltraScale+ (Alveo) AI accelerator, the PIM tech delivered a nearly 2.5-fold performance gain and a 62 percent cut in energy consumption for a speech-recognition neural net. Samsung has been providing samples of the technology integrated into the current generation of high-bandwidth DRAM, HBM2. It’s also developing PIM for the next generation, HBM3, and for the low-power DRAM used in mobile devices. It expects to complete the standard for the latter with JEDEC in the first half of 2022.
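To put the two reported figures on a common footing, the short Python sketch below converts them into ratios against a baseline of 1.0. It assumes, for illustration only, that the 2.5-fold gain applies to time to completion and that the 62 percent cut applies to total energy for the same workload; the function names are hypothetical.

```python
def relative_runtime(speedup):
    """Runtime relative to baseline, given a speedup factor."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup

def relative_energy(percent_cut):
    """Energy relative to baseline, given a percentage reduction."""
    if not 0 <= percent_cut <= 100:
        raise ValueError("percent_cut must be between 0 and 100")
    return 1.0 - percent_cut / 100.0

# Figures reported for the speech-recognition test (speedup is "nearly" 2.5).
runtime = relative_runtime(2.5)
energy = relative_energy(62)
print(f"runtime: {runtime:.2f}x baseline, energy: {energy:.2f}x baseline")
```

Under those assumptions, the workload would finish in roughly 40 percent of the baseline time while using roughly 38 percent of the baseline energy.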

There are plenty of ways to add computational smarts to memory chips. Samsung chose a design that’s fast and simple. HBM consists of a stack of DRAM chips linked vertically by interconnects called through-silicon vias (TSVs). The stack of memory chips sits atop a logic chip that acts as the interface to the processor.

The highest data bandwidth in the stack lies within each chip, followed by the TSVs, and finally the connections to the processor. So Samsung chose to put the processing on the DRAM chips to take advantage of the high bandwidth there. The compute units are designed to do the most common neural-network calculation, called multiply and accumulate, and little else. Other designs have put the AI logic on the interface chip or used more complex processing cores.
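For readers unfamiliar with the multiply-and-accumulate operation described above, here is a minimal Python sketch of it. The function name `multiply_accumulate` and the sample values are illustrative assumptions, not Samsung’s design; a real PIM unit performs these steps in hardware next to the DRAM banks rather than in software.

```python
def multiply_accumulate(weights, activations):
    """Return the sum of weights[i] * activations[i], one MAC step at a time."""
    if len(weights) != len(activations):
        raise ValueError("weights and activations must have the same length")
    acc = 0.0
    for w, x in zip(weights, activations):
        acc += w * x  # one multiply-and-accumulate step
    return acc

# Hypothetical inputs: three weights and three activations.
result = multiply_accumulate([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(result)  # 32.0
```

A neural-network layer repeats this pattern for every output value, which is why the weights must stream out of memory so quickly and why placing the arithmetic beside the memory banks is attractive.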

Samsung’s two largest competitors, SK hynix and Micron Technology, aren’t quite ready to take the plunge on PIM for HBM, though they’ve each made moves toward other types of processing-in-memory.

Icheon, South Korea–based SK hynix, the No. 2 DRAM supplier, is exploring PIM from several angles, says Il Park, vice president and head of memory-solution product development. For now it’s pursuing PIM in standard DRAM chips rather than HBM, which might be simpler for customers to adopt, says Park.

HBM PIM is more of a mid- to long-term possibility for SK hynix. At the moment, customers are already dealing with enough issues as they try to move HBM DRAM physically closer to processors. “Many experts in this domain don’t want to add more, and quite significant, complexity on top of the already busy situation involving HBM,” says Park.

That said, SK hynix researchers worked with Purdue University computer scientists on a comprehensive design of an HBM-PIM product called Newton in 2019. Like Samsung’s Aquabolt-XL, it places multiply-and-accumulate units in the memory banks to take advantage of the high bandwidth within the dies themselves.

Meanwhile, Rambus, based in San Jose, Calif., was motivated to explore PIM because of power-consumption issues, says Rambus fellow and distinguished inventor Steven Woo. The company designs the interfaces between processors and memory, and two-thirds of the power consumed by a system-on-chip and its HBM memory goes to transporting data horizontally between the two chips. Transporting data vertically within the HBM uses much less energy because the distances are much shorter. “You might be going 10 to 15 millimeters horizontally to get data back to an SoC,” says Woo. “But vertically you’re talking on the order of a couple hundred microns.”
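The gap between those two distances is easy to quantify. The Python sketch below compares the horizontal range Woo quotes with an assumed vertical distance of 200 microns; that figure is only a stand-in for “a couple hundred,” and the function name is hypothetical. It compares distances only and makes no claim that energy scales linearly with distance.

```python
MICRONS_PER_MM = 1000

def distance_ratio(horizontal_mm, vertical_um):
    """How many times longer the horizontal path is than the vertical one."""
    if vertical_um <= 0:
        raise ValueError("vertical_um must be positive")
    return horizontal_mm * MICRONS_PER_MM / vertical_um

ASSUMED_VERTICAL_UM = 200  # assumption: stand-in for "a couple hundred microns"

low = distance_ratio(10, ASSUMED_VERTICAL_UM)
high = distance_ratio(15, ASSUMED_VERTICAL_UM)
print(f"horizontal path is {low:.0f}x to {high:.0f}x longer")
```

With that assumption, the horizontal trip to the SoC is about 50 to 75 times as long as the vertical trip within the stack.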

Rambus’s experimental PIM design adds an extra layer of silicon at the top of the HBM stack to do AI computation. To avoid the potential bandwidth bottleneck of the HBM’s central through-silicon vias, the design adds TSVs to connect the memory banks with the AI layer. Having a dedicated AI layer in each memory chip could allow memory makers to customize memories for different applications, argues Woo.

How quickly PIM is adopted will depend on how desperate the makers of AI accelerators are for the memory-bandwidth relief it provides. “Samsung has put a stake in the ground,” says Bob O’Donnell, chief analyst at Technalysis Research. “It remains to be seen whether [PIM] becomes a commercial success.”

This article appears in the January 2022 print issue as “AI Computing Comes to Memory Chips.”


