Processing In Memory — Garrett Technologies, Inc.

Accelerating AI

Despite how complex modern computers seem, their basic architecture hasn’t changed much since they first adopted what’s known as the Von Neumann architecture. The real “brains” of a computer lie in the CPU and the RAM, the latter of which stores program code and data. While modern RAM is amply fast for a wide range of tasks, the connection between the RAM and CPU can become a bottleneck for others; in particular, many artificial intelligence implementations are constrained by how quickly the processor can access data stored in RAM. Recent developments, however, aim to mitigate this limitation, potentially leading to far more capable AI implementations and giving developers new ways to harness the potential of AI.

An Imbalance in Performance

For a CPU, the overriding goal is to improve the speed at which data is processed. For RAM, on the other hand, the leading goal has been to increase the amount of data that can be stored at once, as keeping more information in RAM is key to improving performance. While CPU performance has increased dramatically over the years, RAM speed has improved only modestly, leaving CPUs sitting idle while data is retrieved from RAM. Although efforts to improve RAM speed have ramped up in recent years, the gap in performance remains difficult to close, leading manufacturers to pursue new approaches instead of simply hoping RAM speeds will increase significantly.
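The idle time described above can be illustrated with a back-of-the-envelope stall model. This is only a sketch: the function name and every number in it are hypothetical, and it assumes the CPU cannot overlap computation with memory access, which real processors partly do through caching, prefetching, and out-of-order execution.

```python
def cpu_utilization(compute_ns, accesses, miss_rate, dram_latency_ns):
    """Fraction of time the CPU spends computing rather than waiting on RAM.

    Simplifying assumption: every cache miss stalls the CPU for the full
    DRAM latency, with no overlap between computing and waiting.
    """
    stall_ns = accesses * miss_rate * dram_latency_ns
    return compute_ns / (compute_ns + stall_ns)


# Hypothetical task: 1,000 ns of pure computation and 200 memory accesses,
# 5 percent of which miss the cache and wait 100 ns for RAM.
utilization = cpu_utilization(
    compute_ns=1000, accesses=200, miss_rate=0.05, dram_latency_ns=100
)
print(f"CPU busy {utilization:.0%} of the time")
```

With these made-up figures the CPU is busy only about half the time, even though just one access in twenty goes out to RAM. Making the CPU faster shrinks the compute term but leaves the stall term untouched, which is why faster processors alone don't close the gap.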

Blurring the Lines

Even though the CPU and RAM are discrete parts with well-defined roles, CPUs do store some of the information they use for running tasks, in on-chip registers and cache. While this type of storage is by far the fastest, it’s also highly limited, with CPUs capable of storing only a tiny amount of information compared with RAM. Expanding this type of storage is a great way to improve performance, but doing so at a large scale isn’t feasible with current designs and can lead to compromises in other areas. Another long-theorized approach, however, is now being developed: processing data within the RAM itself.

Advantages of Processing-in-Memory (PIM) Chips

Processing data within RAM is still a technology under development. However, Samsung has been testing its own implementation, and its tests show performance in some workloads improving by a factor of about 2.5. Furthermore, these tests show a 62 percent reduction in energy usage, a saving that can make AI processing easier to scale. While such tests are often tailored to niche use cases, Samsung’s benchmarks focused on speech recognition, a major field of interest.
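To put the two reported figures side by side, here is a short worked calculation. It assumes, purely for illustration, that the 2.5x speedup and the 62 percent energy reduction were measured on the same workload and that the energy figure is per task; the function and field names are hypothetical.

```python
def pim_summary(speedup, energy_reduction):
    """Derive relative time, energy, and power from two headline figures.

    Assumption: both figures describe the same workload, and the energy
    reduction is measured per completed task against a non-PIM baseline.
    """
    time_ratio = 1.0 / speedup              # time per task vs. baseline
    energy_ratio = 1.0 - energy_reduction   # energy per task vs. baseline
    return {
        "time_ratio": time_ratio,
        "energy_ratio": energy_ratio,
        "work_per_joule_gain": 1.0 / energy_ratio,
        "avg_power_ratio": energy_ratio / time_ratio,
    }


summary = pim_summary(speedup=2.5, energy_reduction=0.62)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Under those assumptions each task takes 40 percent of the baseline time and 38 percent of the baseline energy, so every joule does roughly 2.6 times as much work while average power draw stays close to the baseline. The saving comes from finishing sooner and moving less data, not from running at lower power.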

Where Does PIM Excel?

Processing within RAM is limited, and it won’t serve as a general replacement for typical RAM and CPU interaction for the foreseeable future. However, much of the CPU’s role entails performing simple calculations, and for that work the limiting factor is the RAM speed bottleneck rather than the CPU itself. In areas where advanced CPU processing provides few advantages, offloading some work to the RAM can free the CPU for more demanding tasks. Machine learning, in general, falls into this category, although the benefits of PIM depend on the specific workload. Speech recognition seems to fall in the sweet spot where in-memory processing can provide clear advantages.
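One common way to judge whether a workload is limited by memory rather than by the processor is the roofline model, which caps attainable throughput at the smaller of peak compute and memory bandwidth multiplied by arithmetic intensity (operations per byte moved). The sketch below applies it to a matrix-vector multiply, a simple calculation of the kind described above. The hardware figures are hypothetical round numbers, not measurements of any real chip, and the byte count assumes each value is read or written exactly once.

```python
def matvec_intensity(rows, cols, bytes_per_value=4):
    """Arithmetic intensity (FLOP per byte) of y = A @ x.

    Assumption: the matrix, the input vector, and the output vector each
    cross the memory bus exactly once, with 4-byte values by default.
    """
    flops = 2 * rows * cols  # one multiply and one add per matrix entry
    bytes_moved = bytes_per_value * (rows * cols + cols + rows)
    return flops / bytes_moved


def attainable_gflops(peak_gflops, bandwidth_gbs, intensity):
    """Roofline bound: limited by compute or by memory traffic."""
    return min(peak_gflops, bandwidth_gbs * intensity)


# Hypothetical processor: 1,000 GFLOP/s peak compute, 100 GB/s to RAM.
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBS = 100.0

intensity = matvec_intensity(rows=4096, cols=4096)
bound = attainable_gflops(PEAK_GFLOPS, BANDWIDTH_GBS, intensity)
print(f"intensity: {intensity:.2f} FLOP/byte")
print(f"attainable: {bound:.0f} of {PEAK_GFLOPS:.0f} GFLOP/s")
```

At about half an operation per byte, this hypothetical processor could sustain only around 5 percent of its peak on the multiply, no matter how fast its arithmetic units are. Workloads with this profile are the ones where doing the arithmetic next to the data has the most to offer.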

Who’s Leading the Way?

Samsung is the clear leader in PIM technology. Its Aquabolt-XL high-bandwidth memory (HBM), which rolled out in February 2021, is the first PIM-enabled memory to hit the market. Samsung is the world’s top producer of DRAM, but its competitors aren’t resting. SK hynix, its closest competitor, developed PIM for its HBM in 2019 but then decided to focus instead on bringing the technology to its standard DRAM. Micron Technology, the third-leading manufacturer of DRAM, acquired an AI startup named Fwdnxt in 2019. While Micron’s plans aren’t yet clear, it has stated a goal of bringing processing and memory closer together, a position that strongly suggests some form of PIM in the future. There are also smaller companies looking to develop PIM or related technologies, such as Rambus and the startup NeuroBlade.

When Will Consumer PIM Arrive?

RAM speed is more than sufficient for the vast majority of consumer use cases. However, being able to process data within memory is a broadly applicable advantage, and, if it can be done affordably, it’s possible that consumer devices will begin taking advantage of this technology. How best to implement PIM technology within operating systems or individual software programs will likely remain an open question for some time. Samsung is leading the charge to standardize PIM implementations, but operating system developers are unlikely to fully jump on board until the technology has proven itself viable in the consumer sphere. Performance increases for consumer devices are always welcome, but it’s likely that PIM’s ability to improve efficiency will be the main factor driving its adoption.

Is PIM the Future?

There are both theoretical and proven advantages to PIM technology, yet its future remains somewhat cloudy. Although some workloads can likely be improved significantly with the technology, the cost of PIM, along with the difficulties inherent to any change in technology, means it might not be the obvious choice for many companies. Speech recognition, for example, is an area where PIM excels, but is there enough demand to justify changing computing architecture at enough companies to make the technology economically viable? Furthermore, AI optimizations might erode the advantages of PIM; improvements to the more traditional approach might be enough to make PIM unnecessary for most use cases. Although predicting the future is impossible, it’s clear that PIM technology’s advantages mean it will have at least a niche role, and it may become a standard feature of memory modules in the years to come.

Von Neumann architecture has undoubtedly proven itself viable, and challengers to its supremacy have historically failed to unseat it. However, asymmetry in processor and RAM speeds has created a significant bottleneck in some workloads that won’t be easy to resolve in the near term. Although there are several potential ways to close this gap, PIM presents a straightforward approach that doesn’t completely eschew Von Neumann architecture, and it’s likely that AI will lead the charge in showing what this technology can do.