AMD CEO Lisa Su unveiled the company’s latest chip, the Instinct MI300X, at an invite-only event in San Francisco on Tuesday. The chip is a key part of AMD’s strategy for artificial intelligence computing, offering enormous memory and data throughput for generative AI tasks such as large language models. The MI300X is a follow-on to the previously announced MI300A, and features multiple GPU “chiplets,” individual chips joined together in a single package by shared memory and networking links. The MI300X is a GPU-only version of the MI300A, with that chip’s CPU chiplets swapped out for two additional CDNA 3 GPU chiplets. The chip offers 2.4 times the memory density of Nvidia’s H100 “Hopper” GPU and 1.6 times the memory bandwidth.
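As a quick sanity check of those multipliers, here is a minimal sketch assuming the comparison is against the 80 GB H100 SXM with roughly 3.35 terabytes per second of HBM3 bandwidth (the H100 figures are this article’s assumption, not part of AMD’s announcement):

```python
# Back-of-the-envelope check of AMD's claimed multipliers versus Nvidia's H100.
# The H100 figures are assumptions (80 GB of HBM3, ~3.35 TB/s on the SXM part),
# not numbers from AMD's announcement.
MI300X_MEMORY_GB = 192
MI300X_BANDWIDTH_TBS = 5.2
H100_MEMORY_GB = 80        # assumed: H100 SXM capacity
H100_BANDWIDTH_TBS = 3.35  # assumed: H100 SXM HBM3 bandwidth

print(f"memory density:   {MI300X_MEMORY_GB / H100_MEMORY_GB:.1f}x")         # 2.4x
print(f"memory bandwidth: {MI300X_BANDWIDTH_TBS / H100_BANDWIDTH_TBS:.1f}x")  # 1.6x
```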
The MI300X has a transistor count of 153 billion, 192 gigabytes of shared high-bandwidth memory (HBM3), and a memory bandwidth of 5.2 terabytes per second. It is the only chip that can hold large language models of up to 80 billion parameters in memory, according to AMD. Su demonstrated the MI300X writing a poem about San Francisco using Falcon-40B, one of the most popular open-source large language models at present. The MI300X, said Su, is the first chip powerful enough to run a neural network of that size entirely in memory, rather than shuttling data to and from external memory.
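The arithmetic behind the 80-billion-parameter claim is straightforward if one assumes 16-bit weights, as in this illustrative sketch (the two-bytes-per-parameter figure is an assumption; AMD did not specify a precision):

```python
# Illustrative sketch: do a model's raw weights fit in the MI300X's 192 GB of HBM3?
# Assumes 2 bytes per parameter (FP16/BF16); AMD did not state a precision.
def weights_fit(params_billion: float, memory_gb: float = 192,
                bytes_per_param: int = 2) -> bool:
    weight_gb = params_billion * bytes_per_param  # billions of params x bytes/param = GB
    return weight_gb <= memory_gb

print(weights_fit(40))   # Falcon-40B: 80 GB of weights      -> True
print(weights_fit(80))   # 80B parameters: 160 GB of weights -> True
print(weights_fit(175))  # a hypothetical 175B model: 350 GB -> False
```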
Su called the MI300X a “generative AI accelerator,” designed specifically for AI and high-performance computing workloads. The use of chiplets in the product is strategic, Su said, because it lets AMD mix and match different kinds of compute, swapping CPU chiplets for GPU chiplets or vice versa. The MI300X will offer more memory density and memory bandwidth than the competition, which reduces the number of GPUs needed; that significantly speeds up performance, especially for inference, and lowers the total cost of ownership.
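A minimal sketch of that fewer-GPUs argument, counting only the accelerators needed to hold a model’s weights, and assuming two bytes per parameter and an 80 GB competing device (both assumptions, not AMD’s numbers):

```python
import math

# Hedged sketch: accelerators needed just to hold a model's weights in HBM,
# ignoring activations, KV cache, and framework overhead. The 80 GB
# comparison device is an assumption, not part of AMD's announcement.
def gpus_needed(params_billion: float, memory_gb: float,
                bytes_per_param: int = 2) -> int:
    weight_gb = params_billion * bytes_per_param
    return math.ceil(weight_gb / memory_gb)

for params in (40, 80, 180):
    print(f"{params}B params: {gpus_needed(params, 192)}x MI300X vs "
          f"{gpus_needed(params, 80)}x assumed 80 GB GPU")
```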
To compete with Nvidia’s DGX systems, Su unveiled the AMD Instinct Platform, a family of AI computers. The first instance combines eight MI300X chips for a total of 1.5 terabytes of HBM3 memory, in a server that conforms to the industry-standard Open Compute Project spec. Unlike the GPU-only MI300X, the existing MI300A goes up against Nvidia’s Grace Hopper combination chip, which pairs Nvidia’s Grace CPU with its Hopper GPU. The MI300A is being built into the El Capitan supercomputer under construction at the Department of Energy’s Lawrence Livermore National Laboratory.
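The platform’s headline memory figure follows directly from the per-chip number:

```python
# The Instinct Platform's 1.5 TB figure is eight MI300X chips' HBM3 pooled:
chips = 8
hbm3_per_chip_gb = 192
print(chips * hbm3_per_chip_gb / 1000, "TB")  # 1.536 -> roughly 1.5 TB
```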
The MI300A is currently sampling to AMD customers, and the MI300X will begin sampling to customers in the third quarter of this year. Both will be in volume production in the fourth quarter, according to Su. “The generative AI, large language models have changed the landscape,” said Su. “The need for more compute is growing exponentially, whether you’re talking about training or about inference.” Because the MI300X offers more memory and more memory bandwidth, fewer GPUs are needed, which means more inference jobs per GPU than before and a lower total cost of ownership for running large language models.
In short, the Instinct MI300X is a powerful chip designed specifically for AI and high-performance computing workloads. According to AMD, it is the only chip that can hold large language models of up to 80 billion parameters in memory, and it offers more memory density and memory bandwidth than the competition. The chip is a centerpiece of AMD’s strategy for artificial intelligence computing, and the company hopes its Instinct Platform will compete with Nvidia’s DGX systems. The MI300A is sampling to customers now, the MI300X begins sampling in the third quarter, and both are slated for volume production in the fourth quarter of this year.