Nvidia Dominates MLPerf Benchmark Tests, Intel’s Habana Provides Competition
In the latest benchmark tests conducted by MLCommons, an industry consortium, Nvidia has once again emerged as the leader in chip speed for training neural networks. With competitors such as Google, Graphcore, and Advanced Micro Devices sitting out this round of testing, Nvidia’s dominance across all eight tests went uncontested. However, Intel’s Habana business did provide meaningful competition with its Gaudi2 chip, and the company has pledged to surpass Nvidia’s top-of-the-line H100 GPU by this fall.
The benchmark test, Training version 3.0, measures the time it takes to tune the neural “weights” or parameters of a computer program until it achieves the required minimum accuracy on a given task. This process is known as “training” a neural network. The main Training 3.0 test consists of eight separate tasks that record the time taken to refine the settings of a neural network through multiple experiments. The other half of neural network performance, known as inference, where the network makes predictions based on new data, is covered in separate releases from MLCommons.
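The time-to-accuracy methodology can be sketched in a few lines. This is an illustrative toy, not MLCommons’ actual test harness; the `train_step` and `evaluate` callables, the accuracy target, and the convergence behavior are all invented stand-ins:

```python
import time

def time_to_accuracy(train_step, evaluate, target_accuracy, max_steps=10_000):
    """Run training until the model reaches the target accuracy,
    returning elapsed wall-clock seconds (or None if it never converges)."""
    start = time.perf_counter()
    for _ in range(max_steps):
        train_step()                       # one optimization step over a batch
        if evaluate() >= target_accuracy:  # quality check after each step
            return time.perf_counter() - start
    return None

# Toy stand-ins: "training" just nudges an accuracy score toward 1.0.
state = {"accuracy": 0.0}
elapsed = time_to_accuracy(
    train_step=lambda: state.update(accuracy=state["accuracy"] + 0.05),
    evaluate=lambda: state["accuracy"],
    target_accuracy=0.9,
)
print(elapsed is not None)  # the toy model converges well before max_steps
```

The score that MLPerf reports is exactly this elapsed time; faster hardware and software shorten it, which is why a single wall-clock number lets very different systems be compared.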
In addition to training on server computers, MLCommons also released a companion benchmark test, MLPerf Tiny version 1.1, which measures performance when making predictions on low-powered devices.
Nvidia secured the top spot in all eight tests, achieving the fastest training times. Two new tasks were introduced in this round of testing. One involved training the GPT-3 large language model (LLM) developed by OpenAI. Nvidia, in collaboration with partner CoreWeave, assembled a system that took just under eleven minutes to train on the Colossal Clean Crawled Corpus (C4) dataset. The system combined 896 Intel Xeon processors with 3,584 Nvidia H100 GPUs, running Nvidia’s NeMo framework for generative AI. The benchmark covers a representative portion of a full GPT-3 training run, using the “large” version of the model with 175 billion parameters.
Another addition to the benchmark tests was an expanded version of recommender engines, which are commonly used for product search and social media recommendations. MLCommons upgraded the training dataset to a four-terabyte Criteo 4TB multi-hot dataset, replacing the previous one-terabyte dataset.
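For context, “multi-hot” means each categorical feature in a sample can take several active values at once, for example a user interacting with multiple product categories, rather than exactly one as in one-hot encoding. A minimal sketch (the vocabulary and function are invented for illustration):

```python
def multi_hot(values, vocab):
    """Encode a set of categorical values as a multi-hot vector:
    a 1 in every position whose vocabulary entry is present."""
    index = {v: i for i, v in enumerate(vocab)}
    vec = [0] * len(vocab)
    for v in values:
        vec[index[v]] = 1
    return vec

vocab = ["books", "games", "music", "tools"]
one_hot_vec = multi_hot(["games"], vocab)            # [0, 1, 0, 0]
multi_hot_vec = multi_hot(["books", "music"], vocab) # [1, 0, 1, 0]
```

Allowing multiple active categories per feature makes each sample denser and the embedding lookups heavier, which, together with the larger four-terabyte dataset, makes the upgraded recommender task a more demanding benchmark.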
Intel’s Habana was the only vendor to compete against Nvidia, submitting five entries with its Gaudi2 accelerator chip. Additionally, computer maker SuperMicro submitted one entry using Habana’s chip. However, in every case, the Habana systems fell significantly behind the top-performing Nvidia systems. Intel’s Jordan Plawner, head of AI products, acknowledged that the time difference between Habana and Nvidia may be negligible for many companies using comparable systems. Plawner highlighted the price advantage of Habana’s Gaudi2, which offers more training per dollar compared to similarly spec’d Nvidia A100 GPUs.
Plawner also noted that Nvidia’s MLPerf entries used a data format called “FP-8” (floating point, 8-bit), while Habana’s used an alternative called “BF-16” (brain floating point, 16-bit). Because BF-16 carries more bits of arithmetic precision, each operation is more expensive, which slightly lengthens training time. Plawner said Gaudi2 will gain FP-8 support later this year, allowing for greater performance and, he claimed, potentially surpassing Nvidia’s H100 system.
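The precision trade-off Plawner describes comes down to mantissa bits: BF-16 keeps 7 explicit mantissa bits, while FP-8 in its common E4M3 layout keeps only 3. A rough sketch, which truncates a float32 mantissa to mimic each format’s precision (it ignores the formats’ differing exponent ranges and rounding modes):

```python
import struct

def round_mantissa(x, mantissa_bits):
    """Truncate a float32 mantissa to `mantissa_bits` explicit bits,
    mimicking the precision (not the range) of a narrower format."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    drop = 23 - mantissa_bits        # float32 carries 23 explicit mantissa bits
    bits &= ~((1 << drop) - 1)       # zero the low-order mantissa bits
    (y,) = struct.unpack("<f", struct.pack("<I", bits))
    return y

x = 1.2345678
bf16 = round_mantissa(x, 7)  # BF-16 keeps 7 explicit mantissa bits -> 1.234375
fp8 = round_mantissa(x, 3)   # FP-8 (E4M3) keeps only 3 -> 1.125
print(abs(x - bf16) < abs(x - fp8))  # True: BF-16 is the closer approximation
```

Fewer mantissa bits mean coarser values but smaller, faster arithmetic units and half the memory traffic, which is why moving from BF-16 to FP-8 can speed up training when the model still converges at the lower precision.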
Plawner emphasized the need for an alternative to Nvidia in the industry. Customers, frustrated by the limited supply of Nvidia’s parts, are now considering switching to other options. Intel, being the world’s second-largest chip manufacturer, has more control over its supply chain, giving it an advantage in meeting customer demands.
In conclusion, the MLPerf results once again confirmed Nvidia’s lead in training speed: it topped all eight tests, with Intel’s Habana the only vendor mounting meaningful competition. Intel, for its part, aims to overtake Nvidia’s H100 once Gaudi2 gains FP-8 support later this year. With Nvidia’s parts in short supply, much of the industry is eager for exactly such an alternative.