E3SM Computing Resources

  • October 10, 2018
  • Brief, Home Page Feature
  • Early on, the E3SM project identified a need for dedicated computing resources. This was especially important during development, when overnight turnaround was needed for model debugging and for moderate-resolution atmosphere, ocean/ice, and fully coupled simulations. Such work requires near-real-time access to a moderate number of nodes, a use case that is difficult to support at a large computing center that serves many users and must run at greater than 90 percent utilization.

    First Dedicated System - “Anvil”

    E3SM’s Anvil computer is a 240-node cluster dedicated 24/7 to the E3SM project.

    In 2016, the E3SM project purchased its first dedicated system, “Anvil,” to support development work. This purchase was critical for the development of the E3SM v1 model, and in 2017 the system was doubled in size. Anvil is a 240-node cluster; each node contains dual-socket Intel Xeon Broadwell CPUs with 36 cores and 64 GB of DDR4 DRAM. It uses an FDR InfiniBand network and has 250 TB of disk storage. Anvil is hosted by Argonne National Laboratory’s Computing Resource Center.

    Upcoming System Dedicated to BER-Funded Projects

    The Intel Skylake Xeon platform includes the AVX-512 Advanced Vector Extensions, which double floating-point performance compared to the previous generation.
    [https://www.top500.org/news/intel-skylake-xeon-processors-debut-on-google-cloud/]

    In 2018, the Department of Energy (DOE) Office of Biological and Environmental Research (BER) is expanding its pool of computational resources with a new machine that will support the E3SM project, as well as related BER-funded projects working with the E3SM model. This new machine is expected to be operational in late 2018. It will contain close to 500 nodes, each with dual-socket Intel Xeon Skylake CPUs, 40 cores, and 192 GB of DDR4 DRAM. It will use a high-speed interconnect and have 1 PB of disk storage. It will be hosted at Pacific Northwest National Laboratory.

    The system will provide 160M core-hours per year, a significant increase in the computing power available to the BER-E3SM community. Because of its state-of-the-art CPUs, the new machine is expected to be one of the fastest available for running the moderate-resolution E3SM v1 model. The machine will also be capable of running the E3SM v1 high-resolution configuration, although only the National Energy Research Scientific Computing Center (NERSC) and the DOE Leadership Computing Facilities can provide the horsepower necessary to run long simulation campaigns at high resolution.
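
As a rough check on the figures quoted in this article, a cluster's annual core-hour capacity can be estimated as nodes × cores per node × hours per year. The sketch below assumes 8,760 hours in a year and full availability, neither of which the article states, and its function names are illustrative only.

```python
# Back-of-the-envelope core-hour arithmetic for the systems described above.
# Assumption (not from the article): a 365-day year with no downtime.
HOURS_PER_YEAR = 24 * 365  # 8,760


def annual_core_hours(nodes, cores_per_node, hours=HOURS_PER_YEAR):
    """Upper bound on core-hours a cluster can deliver in one year."""
    return nodes * cores_per_node * hours


def nodes_for_core_hours(core_hours, cores_per_node, hours=HOURS_PER_YEAR):
    """Number of fully available nodes needed to deliver a core-hour total."""
    return core_hours / (cores_per_node * hours)


# Anvil: 240 nodes, 36 cores per node.
anvil = annual_core_hours(240, 36)

# New BER machine: upper bound if it had exactly 500 nodes of 40 cores.
new_upper = annual_core_hours(500, 40)

# Node count implied by the quoted 160M core-hours per year.
implied_nodes = nodes_for_core_hours(160e6, 40)

print(f"Anvil upper bound:       {anvil / 1e6:.1f}M core-hours/year")
print(f"500-node upper bound:    {new_upper / 1e6:.1f}M core-hours/year")
print(f"Nodes implied by 160M:   {implied_nodes:.0f}")
```

Under these assumptions, 160M core-hours per year corresponds to about 457 fully available 40-core nodes, in line with the "close to 500 nodes" figure. The article does not say whether the difference reflects downtime, reserved capacity, or a node count below 500.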
