The AMD Instinct MI100 is the first accelerator based on the 7nm CDNA architecture, which, unlike RDNA, is oriented toward compute rather than graphics, although it does retain some rendering components. The paths of RDNA and CDNA have now finally diverged, and the new accelerator is designed exclusively for high-performance computing and AI. The MI100, the first in the series, has 120 CUs, which contain new blocks for the matrix operations that matter in AI workloads. These do not come at the expense of "classic" compute: peak FP64 performance is 11.5 TFLOPS, and peak FP32 is exactly twice that, 23 TFLOPS. These figures are higher than those of the NVIDIA A100, and AMD insists that this is the kind of performance gain needed to reach the coveted one-exaflop mark.
At the other end of the spectrum, however, in bfloat16 computation, AMD's newcomer loses: 92.3 TFLOPS versus 312 TFLOPS on NVIDIA's Tensor Cores. At other precisions the comparison varies. In addition, the PCIe version of the A100 can be slightly slower than the SXM version in real workloads because of its lower power limit. The Instinct MI100, for now, is available only as a full-size PCIe card with a power consumption of 300 W.
The card carries 32 GB of HBM2 memory with a bandwidth of 1.23 TB/s, somewhat less than the PCIe version of the NVIDIA A100, which has 40 GB of HBM2e at 1.555 TB/s. Both cards use PCIe 4.0 x16 (64 GB/s) as the main interface and have an additional bus for direct communication between accelerators. For NVIDIA this is NVLink (600 GB/s), which on the PCIe version is limited to linking just two cards; for AMD it is Infinity Fabric (IF). The MI100 has three IF links with a bandwidth of 92 GB/s each (276 GB/s in total), which makes it possible to combine up to four accelerators that communicate directly with one another. This does not depend on whether the card is connected to the host over PCIe 3.0 or 4.0. Naturally, the best option for the system as a whole is the combination of AMD EPYC and the new MI100.
AMD's main trump card, as so often before, is the price of its new products. The company does not quote exact prices, but says that performance per dollar is 1.8 to 2.1 times better than that of the NVIDIA A100. Among the first systems validated for the new accelerators are the Dell PowerEdge R7525, Gigabyte G482-Z54, HPE Apollo 6500 Gen10 Plus and Supermicro AS-4124GS-TNR. Selected partners have already received the new accelerators and systems based on them for performance evaluation and software adaptation.
Alongside the Instinct MI100, AMD introduced a new major release of ROCm, version 4.0, its open software platform for HPC and AI.
AMD highlighted the performance gains, ease of use, and readiness of many software packages to work with the new release and the new hardware. Most important is the ease of porting code to the new platform, primarily from NVIDIA CUDA: for some developers it took anywhere from a few hours to a day, or up to several weeks in more complex cases. The new hardware and software platform based on AMD EPYC, Instinct MI100 and ROCm 4.0 will form the basis of the upcoming Frontier and Pawsey supercomputers. Whether any new MI100 machines make the latest TOP500 ranking, we will find out tomorrow. They will be competing against the new NVIDIA A100 accelerators with twice the HBM2e memory.
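The headline figures quoted in this article can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the dictionary names and keys are made up for this example, the numbers are the vendor-stated peaks as quoted above, and the PCIe 4.0 parameters (16 GT/s per lane, 128b/130b encoding) are an assumption that the article itself does not state.

```python
# Peak figures exactly as quoted in the article (vendor-stated, not measured).
# The dictionary layout and key names are hypothetical, chosen for this sketch.
MI100 = {
    "fp64_tflops": 11.5,
    "fp32_tflops": 23.0,
    "bf16_tflops": 92.3,
    "mem_bw_tbs": 1.23,
    "if_links": 3,
    "if_link_gbs": 92,
}
A100_PCIE = {
    "bf16_tflops": 312.0,
    "mem_bw_tbs": 1.555,
}

# Assumption (not from the article): PCIe 4.0 runs at 16 GT/s per lane
# and uses 128b/130b line encoding.
PCIE4_GT_PER_LANE = 16.0
PCIE4_ENCODING = 128 / 130
LANES = 16

# FP32 peak relative to FP64 peak on the MI100.
fp32_over_fp64 = MI100["fp32_tflops"] / MI100["fp64_tflops"]

# Aggregate Infinity Fabric bandwidth across all links of one card.
if_total_gbs = MI100["if_links"] * MI100["if_link_gbs"]

# How far the A100 is ahead in bfloat16, and how the memory bandwidths compare.
bf16_gap = A100_PCIE["bf16_tflops"] / MI100["bf16_tflops"]
mem_bw_ratio = MI100["mem_bw_tbs"] / A100_PCIE["mem_bw_tbs"]

# PCIe 4.0 x16: raw rate in both directions, and usable rate per direction
# after line-encoding overhead (protocol overhead is ignored here).
pcie_raw_bidir_gbs = PCIE4_GT_PER_LANE * LANES / 8 * 2
pcie_usable_per_dir_gbs = PCIE4_GT_PER_LANE * LANES * PCIE4_ENCODING / 8

print(f"FP32 / FP64 peak ratio:        {fp32_over_fp64:.2f}x")
print(f"Infinity Fabric total:         {if_total_gbs} GB/s")
print(f"A100 bf16 advantage:           {bf16_gap:.2f}x")
print(f"MI100 / A100 memory bandwidth: {mem_bw_ratio:.1%}")
print(f"PCIe 4.0 x16 raw, both ways:   {pcie_raw_bidir_gbs:.0f} GB/s")
print(f"PCIe 4.0 x16 usable, one way:  {pcie_usable_per_dir_gbs:.1f} GB/s")
```

Under these assumptions, the 64 GB/s quoted for PCIe 4.0 x16 corresponds to the raw rate summed over both directions, while the 276 GB/s Infinity Fabric total is simply three links at 92 GB/s each.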