All-AMD Powered & World’s First Exascale Supercomputer, Frontier, Has Been Running Into Issues Ever Since It Powered On

Oak Ridge National Laboratory (ORNL) is home to the Frontier supercomputer. Frontier is billed as the first exascale-class system and is built on AMD’s EPYC Trento CPUs and Instinct MI250X compute accelerators. The entire system uses HPE’s Slingshot interconnect. It is also billed as the world’s fastest supercomputer and the only operational exascale design.

HPE’s Cray EX architecture was designed for large-scale applications, and researchers are expected to gain access to the system for scientific work starting in 2023. However, the supercomputer reportedly cannot run for an entire day without several hardware failures.

Frontier boots up but can only produce a maximum of 1 FP64 ExaFLOPS, whereas the system was designed to deliver 1.685 FP64 ExaFLOPS. While no official word has been given regarding the specific issues, a few rumors have come to light.

First, the Slingshot interconnect, the network created for HPE Cray supercomputers, is rumored to conflict with the HPE clusters. The exact nature of the issue is unknown. Second, the AMD Instinct MI250X compute GPUs and the EPYC Trento CPUs are rumored to conflict with the Slingshot interconnect. Again, no official word has come from the project leads or researchers working on ORNL’s Frontier supercomputer.

Mike Bernhardt, communication lead for the Department of Energy’s (DOE) Exascale Computing Project, states that the fully integrated Frontier system will be available to researchers starting next year. He is not quoted as having any concerns about the full launch of the supercomputer.

ORNL’s partners in the exascale effort, HPE and AMD, have delivered the new Frontier system to ORNL ahead of the schedule for this fall. The installation and integration of Frontier, a massive, complex effort, is now underway, and the current progress indicates everything is on track to have Frontier available to users for open science next year, as anticipated.

Mike Bernhardt (Communication Lead for DOE’s Exascale Computing Project) via InsideHPC

Bernhardt’s description of the work as a “complex effort” may help explain why rumors abound concerning the project. It is also worth noting that AMD’s MI250X compute GPUs are only available to select customers, which is why there is a lack of benchmarks to back the rumored claims. The DOE has worked closely with Oak Ridge’s Leadership Computing Facility on Frontier. The ORNL Frontier supercomputer is slated to become fully operational by January 1, 2023, after missing an initial 2022 deadline.

News Sources: InsideHPC, Tom’s Hardware
