Compy – PNNL

E3SM Dedicated Machine - Compy

The system, funded by the Department of Energy (DOE) Office of Biological and Environmental Research (BER), is housed in PNNL’s Computational Science Facility.

Compy contains 460 nodes of dual-socket Intel Xeon Skylake CPUs, each node with 40 cores and 192 GB of DDR4 DRAM, connected by a high-speed OmniPath interconnect and backed by 1 PB of disk storage. The system will provide 160 M core-hours per year, representing a significant increase in the computing power available to the BER-E3SM community. Due to the state-of-the-art CPUs, researchers expect this new machine will be one of the fastest available for running the moderate-resolution E3SM v1 model.
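As a rough check, the 160 M core-hours figure follows from the node and core counts quoted above. The sketch below assumes a non-leap year and full availability; those assumptions are not stated in the text.

```python
# Back-of-the-envelope check of the annual core-hour figure.
NODES = 460                 # from the text
CORES_PER_NODE = 40         # from the text
HOURS_PER_YEAR = 24 * 365   # assumption: non-leap year, machine always available

total_cores = NODES * CORES_PER_NODE
core_hours_per_year = total_cores * HOURS_PER_YEAR

print(f"{total_cores} cores, {core_hours_per_year / 1e6:.1f} M core-hours/year")
# prints: 18400 cores, 161.2 M core-hours/year
```

The result, about 161 M, is consistent with the rounded 160 M quoted above.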

Allocations

The machine is designed to support the E3SM project, as well as related BER-funded projects working with the E3SM model.

Machine Allocation

The machine is allocated using a fair-share strategy, with the allocation based on how the purchase funds were allocated among projects at BER:

  • E3SM:   50%
  • RGMA:  35%
  • ESMD other projects, including SciDAC:  15%

The E3SM share is for work funded directly by the E3SM SFA project and tracked in E3SM roadmaps/Jira epics. The RGMA share is for work funded by the DOE-BER RGMA program office. The ESMD share is for work funded by the DOE-BER ESMD program office that is not part of the E3SM SFA.

Fair-share Queueing

Compy uses SLURM fair-share to allocate resources for these accounts based on the percentages mentioned above. The SLURM fair-share setup works on a sliding 7-day window.

With fair-share queueing, there are no fixed allocations; instead, the priority of each job is based on the project's current percentage utilization, with under-utilized projects getting higher priority than over-utilized projects. There will also be a premium queue for occasional high-priority jobs, subject to approval from the CRC.
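The idea can be illustrated with a toy calculation. This is a simplified sketch, not SLURM's actual priority formula: it compares each account's fraction of recent usage against its target share and ranks the most under-utilized account first. The account names and usage numbers are hypothetical.

```python
# Target shares from the allocation percentages listed on this page.
# The account names are assumptions; the real SLURM account names may differ.
SHARES = {"E3SM": 0.50, "RGMA": 0.35, "ESMD": 0.15}


def priority_order(usage_core_hours):
    """Return accounts ordered from highest to lowest priority.

    usage_core_hours maps each account to the core-hours it used in the
    sliding window. An account whose usage fraction is small relative to
    its target share is under-utilized and is ranked first.
    """
    total = sum(usage_core_hours.values())
    if total == 0:
        # Nothing has run yet, so no account is over-utilized.
        return list(SHARES)
    ratio = {
        account: (usage_core_hours.get(account, 0) / total) / share
        for account, share in SHARES.items()
    }
    return sorted(SHARES, key=lambda account: ratio[account])


# Hypothetical usage over the last 7 days.
usage = {"E3SM": 700_000, "RGMA": 200_000, "ESMD": 100_000}
print(priority_order(usage))
# prints: ['RGMA', 'ESMD', 'E3SM']
```

Here E3SM has used 70% of the machine against a 50% share, so its jobs would rank below those of RGMA (20% used against 35%) and ESMD (10% used against 15%).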

Runtime Limits

The following runtime limits will be enforced by the queuing system, starting at the beginning of November 2019:

  • 1-100 nodes: 36h queue limit
  • 101-180 nodes: 24h
  • 181-440 nodes: 6h
  • limit of 2 running jobs for all users in the main “slurm” partition (the short queue partition allows up to 20 jobs)
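The node-count tiers above can be written as a small lookup, which may be handy when scripting job submissions. This is an illustrative helper, not part of any Compy tooling; the function name is hypothetical.

```python
def max_walltime_hours(nodes):
    """Return the runtime limit in hours for a job using `nodes` nodes,
    following the tiers listed above for the main partition."""
    if not 1 <= nodes <= 440:
        raise ValueError("node count must be between 1 and 440")
    if nodes <= 100:
        return 36
    if nodes <= 180:
        return 24
    return 6


for n in (50, 150, 300):
    print(n, "nodes ->", max_walltime_hours(n), "h")
```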

Short Queue

  • A short queue (40 nodes) is available on Compy for debugging and for short runs. Use the ‘-q short’ option to submit jobs to this queue.
  • Maximum number of jobs per user: 20 jobs
  • Maximum number of nodes per user: 40 nodes
  • Default time: 30 minutes
  • Max time: 2 hours
  • If you need to run a job that uses all nodes for the maximum time of the queue (2 hours), please run it outside normal working hours.
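A minimal sketch of building a short-queue submission that stays within the limits above. It assumes jobs are submitted with the standard SLURM `sbatch` command and its `-N` (nodes) and `-t` (time limit) options; the helper name and the script name `run_case.sh` are hypothetical.

```python
import shlex

SHORT_MAX_NODES = 40        # maximum nodes per user in the short queue
SHORT_MAX_MINUTES = 120     # max time: 2 hours
SHORT_DEFAULT_MINUTES = 30  # default time: 30 minutes


def short_queue_command(script, nodes=1, minutes=SHORT_DEFAULT_MINUTES):
    """Build an sbatch argument list for the short queue, rejecting
    requests that exceed the limits listed above."""
    if not 1 <= nodes <= SHORT_MAX_NODES:
        raise ValueError(f"short queue allows 1-{SHORT_MAX_NODES} nodes")
    if not 1 <= minutes <= SHORT_MAX_MINUTES:
        raise ValueError(f"short queue allows 1-{SHORT_MAX_MINUTES} minutes")
    hours, mins = divmod(minutes, 60)
    walltime = f"{hours:02d}:{mins:02d}:00"
    return ["sbatch", "-q", "short", "-N", str(nodes), "-t", walltime, script]


# Hypothetical job script name, for illustration only.
print(shlex.join(short_queue_command("run_case.sh", nodes=4, minutes=90)))
```

The command is only printed here, not executed; pass the list to `subprocess.run` on a login node to actually submit.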

Usage Reports

Any user can use sreport to generate reports on machine usage by user and project.
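As one possible starting point, the sketch below builds an `sreport` command for the cluster `AccountUtilizationByUser` report over the last 7 days, matching the fair-share window. Which reports and fields are most useful on Compy is an assumption here; the helper names are hypothetical, and `run_report` only works on a system where `sreport` is installed.

```python
import datetime
import subprocess


def sreport_command(end_date, days=7):
    """Build an sreport argument list covering `days` days up to end_date.

    Uses the standard 'cluster AccountUtilizationByUser' report and asks
    for usage in hours (-t Hours).
    """
    start_date = end_date - datetime.timedelta(days=days)
    return [
        "sreport", "cluster", "AccountUtilizationByUser",
        f"start={start_date.isoformat()}",
        f"end={end_date.isoformat()}",
        "-t", "Hours",
    ]


def run_report(end_date=None):
    """Run the report and return its text output (requires sreport)."""
    cmd = sreport_command(end_date or datetime.date.today())
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


# Show the command for a fixed example date without running it.
print(" ".join(sreport_command(datetime.date(2019, 11, 8))))
```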

 

Getting an Account

  • E3SM SFA
    • E3SM members, directly funded by the E3SM SFA project, can request an account on Compy and use E3SM project resources there.
    • The procedure to set up an account:
      • Add your name to the Compy users table (see Compy Users – internal page, for instructions).
    • The details about the machine and quick start for the E3SM users can be found on:

 

  • RGMA & ESMD
    • The process to set up an account for people funded by the RGMA & ESMD programs:
      • Contact your project manager and request an RGMA or ESMD account on Compy.
      • They will verify your eligibility and forward your request to Compy admin support.
      • Compy admin will contact you with a request for additional information before the account can be created.

Fun Fact

The computer’s full name is “CompyMcNodeFace”, affectionately shortened to “Compy”. The name was conceived in the E3SM all-hands name voting competition and chosen by popular demand.

The specific name was a riff on a similar competition for a British research vessel, during which the proposed name Boaty McBoatface went viral and won the competition. The name was vetoed for the ship itself (now the RRS Sir David Attenborough), but was assigned to the lead Autosub Autonomous Underwater Vehicle (AUV).

See https://en.wikipedia.org/wiki/Boaty_McBoatface for the full story and links to several related variants that have successfully won a number of other popular naming competitions. 

 
