IMS: Breakthroughs in Molecular Science

The Institute for Molecular Science adopted a new HPC system to support massively parallel operations and high-speed computations.

Executive Summary
The Institute for Molecular Science (IMS) significantly expanded its computing capabilities with a dual-purpose system designed to serve researchers who need high-performance parallel computing as well as those who need memory-demanding serial processing. The new system is built on Intel® Xeon® Gold 6148 processors and Intel® Xeon® Gold 6154 processors, with 800 GB Intel® SSD DC 3520 Series solid-state drives, all interconnected by the Intel® Omni-Path Architecture (Intel® OPA).

Challenge
Japan’s Institute for Molecular Science (IMS) is a center for advanced research in the molecular sciences, both theoretical and experimental. IMS hosts four research departments: Theoretical and Computational Molecular Science, Photo-Molecular Science, Materials Molecular Science, and Life and Coordination-Complex Molecular Science. The organization provides a venue for joint research within the molecular science community, and it exchanges researchers through domestic and international relationships. IMS scientists also work collaboratively with a wide range of investigators across Japan and around the world, supporting breakthroughs in molecular science knowledge. IMS supercomputers have been used for important work in quantum chemistry calculations, band calculations, and molecular dynamics simulations. Recent work has appeared in scientific journals, including Nature (25 February 2016, vol. 530, pp. 465–468).

“The biggest challenge for real breakthroughs comes from the huge number of trial-and-error calculations that researchers have to run on our supercomputers to reveal novel structures and behaviors.” —Shinji Saito, director of Research Center for Computational Science (RCCS) at IMS

While molecular dynamics (MD) simulations are typically highly optimized for parallel computing, many quantum chemistry (QC) algorithms tend to run in serial fashion. In both types of computing, the large problems scientists need to study lead to long run times to gather the data they need to further their work. IMS provides enough CPU time for researchers to tackle such challenges, irrespective of the type of computing they need (serial or parallel).

“Our previous supercomputers were installed in 2011,” commented Saito. “They were running on six-year-old technologies. The numbers of cores and the speed of calculations were not enough for our users today.”

Solution
MD calculations can use thousands of cores at a time. More cores with a non-blocking interconnect allow researchers to run their jobs much faster, or to run much larger jobs, than systems with fewer cores. The serial processes of QC calculations, by contrast, require massive amounts of memory and the fastest CPU clock speeds to achieve results quickly.

“Since IMS supports research in both types of computational domains, and since CPU core speeds typically are lower with more cores, we needed a solution that offered both configurations—a system with thousands of cores and one with fewer, faster cores and large memory.” —Fumiyasu Mizutani, section chief of RCCS

IMS worked with NEC* to install two clusters with Supermicro* servers interconnected by Intel® Omni-Path Architecture (Intel® OPA). The new machine is called the High Performance Molecular Simulator. It placed 70th on the November 2017 Top500 list with 1.8 petaFLOPS LINPACK* and 3.1 petaFLOPS theoretical peak performance.1 It went into production at IMS on October 1, 2017.
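The two Top500 figures quoted above can be related through LINPACK efficiency, the ratio of measured LINPACK performance to theoretical peak. The following is a minimal sketch of that arithmetic using only the rounded values from the text; the function name is hypothetical and chosen for illustration.

```python
def linpack_efficiency(rmax_pflops: float, rpeak_pflops: float) -> float:
    """Return measured LINPACK performance as a fraction of theoretical peak."""
    if rpeak_pflops <= 0:
        raise ValueError("theoretical peak must be positive")
    return rmax_pflops / rpeak_pflops


# Rounded figures quoted in the article for the November 2017 Top500 entry.
efficiency = linpack_efficiency(rmax_pflops=1.8, rpeak_pflops=3.1)
print(f"LINPACK efficiency: {efficiency:.1%}")  # prints "LINPACK efficiency: 58.1%"
```

Because both inputs are rounded to one decimal place, the result is only an approximation of the efficiency of the actual run.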

The Molecular Simulator comprises two systems. One runs on Intel® Xeon® Gold 6148 processors with 20 cores for MD’s massively parallel computations. The other uses Intel® Xeon® Gold 6154 processors with 18 cores running at 3.0 to 3.7 GHz (Turbo) to deliver the speed necessary for QC’s more demanding serial operations. To meet the requirements of the different workloads, the 20-core nodes were configured in a full bisection bandwidth (FBB) topology, while the faster nodes were 1:3 oversubscribed, since they communicate less while running their memory-demanding jobs.
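To illustrate what the two fabric configurations mean for a node, the sketch below estimates the worst-case bandwidth a node sees across the fabric when every node communicates at once. It rests on two assumptions that are not stated in the article: a 100 Gb/s link rate per Intel® OPA port, and a reading of "1:3 oversubscribed" as three node-facing links sharing one uplink. The function name is hypothetical.

```python
def worst_case_node_bandwidth_gbps(link_gbps: float, nodes_per_uplink: int) -> float:
    """Bandwidth each node gets across the fabric if all nodes transmit at once.

    nodes_per_uplink = 1 models full bisection bandwidth (no sharing);
    nodes_per_uplink = 3 models the assumed reading of 1:3 oversubscription.
    """
    if nodes_per_uplink < 1:
        raise ValueError("nodes_per_uplink must be at least 1")
    return link_gbps / nodes_per_uplink


ASSUMED_LINK_GBPS = 100.0  # assumption: 100 Gb/s per port, not given in the article

fbb = worst_case_node_bandwidth_gbps(ASSUMED_LINK_GBPS, nodes_per_uplink=1)
oversubscribed = worst_case_node_bandwidth_gbps(ASSUMED_LINK_GBPS, nodes_per_uplink=3)
print(f"FBB nodes:            {fbb:.1f} Gb/s")
print(f"Oversubscribed nodes: {oversubscribed:.1f} Gb/s")
```

Under these assumptions the oversubscribed nodes keep full link speed to their local switch but share uplinks, which is a reasonable trade-off for jobs that mostly stay within a node's own memory.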

The Molecular Simulator also uses 800 GB Intel® SSD DC 3520 Series solid-state drives.

Results
Since the Molecular Simulator went into production, it has run many benchmarks covering quantum chemistry calculations, molecular dynamics simulations, memory transfer, and disk performance. Users have also begun running their research on the new system. A modified Test397 benchmark (a geometry optimization and frequency calculation) with Gaussian09 Rev.d01 runs approximately 2.1 times faster on the new system than on the old system.2 The new system, with 40,588 cores, provides 7.3X the computational capacity of IMS’ previous system.2
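For readers who want to relate the capacity claim to the published system figures, the sketch below computes simple new-to-old ratios from the core counts and theoretical peak values given in this article and its footnotes. The article does not say how the 7.3X figure was derived, so this is only an illustrative cross-check with rounded inputs; the constant and function names are hypothetical.

```python
# Figures as quoted in the article and its footnotes (rounded).
NEW_CORES = 40_588
OLD_CORES = 12_992
NEW_PEAK_PFLOPS = 3.1
OLD_PEAK_PFLOPS = 0.437427


def ratio(new: float, old: float) -> float:
    """Return how many times larger the new value is than the old one."""
    if old <= 0:
        raise ValueError("old value must be positive")
    return new / old


core_ratio = ratio(NEW_CORES, OLD_CORES)
peak_ratio = ratio(NEW_PEAK_PFLOPS, OLD_PEAK_PFLOPS)
print(f"Core count ratio:       {core_ratio:.2f}x")
print(f"Theoretical peak ratio: {peak_ratio:.2f}x")
```

With these rounded inputs the peak ratio comes out near 7.1, in the same range as the quoted 7.3X, while the core count alone grows by a little over 3X. The gap between the two ratios reflects the higher per-core throughput of the newer processors.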

“While these Gaussian benchmark results of this memory intensive workload were calculated prior to applying any ‘Spectre’ and ‘Meltdown’ software mitigations and firmware updates,” Mizutani noted, “further testing of the code indicated no impact to performance after the security updates were applied.”

Today, approximately 1,000 jobs from 80 active users, each job using between one and 1,000 cores, run constantly and efficiently on the new system.

Solution Summary
IMS supports a wide range of molecular science research, including computational research, using its new High Performance Molecular Simulator. The new system provides high performance computing (HPC) for both massively parallel operations and high-speed, memory-demanding serial computations. It integrates 40,588 cores across Intel® Xeon® Gold 6154 processors and Intel® Xeon® Gold 6148 processors, interconnected by the Intel® Omni-Path Architecture (Intel® OPA). The system placed 70th on the November 2017 Top500 list.

Solution Configuration

  • 40,588 cores of Intel® Xeon® Gold 6148 processors and Intel® Xeon® Gold 6154 processors
  • Intel® Omni-Path Architecture (Intel® OPA) fabric
  • Intel® SSD DC 3520 Series drives
  • 216,768 GB memory
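As a rough guide to how the memory and core totals listed above relate, the sketch below computes the average memory per core across the whole system. The article does not give the split between the two clusters, so this is a system-wide average only; the large-memory QC nodes presumably sit above it and the MD nodes below it. The names are hypothetical.

```python
TOTAL_MEMORY_GB = 216_768  # from the configuration list
TOTAL_CORES = 40_588       # from the configuration list


def average_memory_per_core(total_memory_gb: float, total_cores: int) -> float:
    """Return the system-wide average memory per core in GB."""
    if total_cores <= 0:
        raise ValueError("total_cores must be positive")
    return total_memory_gb / total_cores


avg_gb = average_memory_per_core(TOTAL_MEMORY_GB, TOTAL_CORES)
print(f"Average memory per core: {avg_gb:.2f} GB")  # prints "Average memory per core: 5.34 GB"
```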

Notices and Disclaimers

Intel® technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at www.intel.sg. // Software and workloads used in performance tests may have been optimized for performance only on Intel® microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit www.intel.sg/benchmarks. // Performance results are based on testing as of the date set forth in the configurations and may not reflect all publicly available security updates. See configuration disclosure for details. No product or component can be absolutely secure. // Cost reduction scenarios described are intended as examples of how a given Intel®-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction. // Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate. // In some test cases, results have been estimated or simulated using internal Intel analysis or architecture simulation or modeling, and provided to you for informational purposes. Any differences in your system hardware, software or configuration may affect your actual performance.

Product and Performance Information

1. NEC LX Cluster, Intel® Xeon® Gold 6148/6154 processor, Intel® Omni-Path Architecture (Intel® OPA) with 40,588 cores and a 3.1 petaFLOPS theoretical peak performance.

2. Fujitsu PRIMERGY* CX250 & RX300, Intel® Xeon® E5-2690/E5-2697v3 processor 2.9 GHz/2.6 GHz, InfiniBand FDR/QDR with 12,992 cores and a theoretical performance of 0.437427 petaFLOPS per https://www.top500.org/site/48473. Performance results are based on testing as of 08/06/2018 and may not reflect all publicly available security updates.