COMPY In Action

  • November 13, 2019
  • Brief, Home Page Feature

    Compy, E3SM’s dedicated machine at PNNL, consists of 460 nodes of Intel Xeon Skylake CPUs.

    Compy is Fully Operational

     

    CompyMcNodeFace, affectionately called Compy (see the story behind the name at Compy – PNNL), is now fully operational. In fact, it is being used so heavily that runtime limits had to be added to the queuing system.

    Compy contains 460 nodes, each with dual-socket Intel Xeon Skylake CPUs, 40 cores, and 192 GB of DDR4 DRAM. The nodes are connected by a high-speed OmniPath interconnect, and the machine has 1 PB of disk storage.

    Machine Allocation

    The machine is allocated using a fair-share strategy. Each share is based on how the purchase funds were provided by the Office of Biological and Environmental Research (BER) across three groups: the E3SM project, projects funded by the Regional & Global Model Analysis (RGMA) program, and other projects funded by the Earth System Model Development (ESMD) program.

    • E3SM: 50%
    • RGMA projects: 35%
    • ESMD other projects, including SciDAC: 15%

    Fair-share Queueing

    Compy uses SLURM fair-share to allocate resources to these accounts based on the percentages listed above. The SLURM fair-share setup works on a sliding 7-day window.

    With a fair-share queueing strategy, there are no fixed allocations. Instead, the priority of each job is based on the project's current percentage utilization, with under-utilized projects getting higher priority than over-utilized projects. For the E3SM allocation, there will also be a premium queue for occasional high-priority jobs, as approved by E3SM group leads.
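    The idea that under-utilized projects move ahead of over-utilized ones can be illustrated with a small sketch. It uses the target shares listed earlier, but the usage figures, the function names, and the simple ratio-based ranking are all assumptions made for illustration. This is not Compy's actual SLURM configuration or SLURM's own priority formula.

```python
# Illustrative only: target shares come from the article; the usage figures
# and the ratio-based ranking are assumptions, not Compy's SLURM settings.
TARGET_SHARE = {"E3SM": 0.50, "RGMA": 0.35, "ESMD": 0.15}


def usage_ratio(node_hours_used, target_share=TARGET_SHARE):
    """Return each project's share of recent usage divided by its target share.

    A ratio below 1.0 means the project is under-utilized in the window.
    """
    total = sum(node_hours_used.values())
    if total == 0:
        return {project: 0.0 for project in target_share}
    return {
        project: (node_hours_used.get(project, 0.0) / total) / share
        for project, share in target_share.items()
    }


def rank_projects(node_hours_used):
    """Order projects from highest to lowest priority (lowest ratio first)."""
    ratios = usage_ratio(node_hours_used)
    return sorted(ratios, key=ratios.get)


# Hypothetical node-hours consumed during the last 7 days.
recent_usage = {"E3SM": 30000.0, "RGMA": 40000.0, "ESMD": 10000.0}
ranking = rank_projects(recent_usage)
```

    In this made-up window, RGMA has used more than its 35% share, so its jobs would rank behind those of the two projects that are below their targets.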

    Runtime Limits

    The following new runtime limits will be enforced by the queuing system, starting at the beginning of November:

    • 1-100 nodes: 36h queue limit
    • 101-180 nodes: 24h
    • 181-440 nodes: 6h
    • Limit of 2 running jobs for all users in the main “slurm” partition (the short queue partition allows up to 20 jobs)
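    The node tiers above amount to a simple lookup from job size to maximum runtime. The sketch below shows one way to check a request before submitting it. The function and constant names are hypothetical, and only the tier boundaries and hour limits come from the list above.

```python
# Limits copied from the list above: (largest node count in the tier, hours).
RUNTIME_LIMITS = [(100, 36), (180, 24), (440, 6)]


def max_walltime_hours(nodes):
    """Return the wall-clock limit in hours for a job of the given size."""
    if nodes < 1:
        raise ValueError("a job needs at least one node")
    for max_nodes, hours in RUNTIME_LIMITS:
        if nodes <= max_nodes:
            return hours
    # The article lists no tier above 440 nodes, so larger requests are rejected here.
    raise ValueError(f"{nodes} nodes exceeds the largest listed tier (440)")
```

    For example, `max_walltime_hours(120)` falls in the 101-180 tier and returns 24.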

    Short Queue

    A short queue (40 nodes) is also available on Compy for debugging and for short runs:

    • Maximum number of jobs per user: 20 jobs
    • Maximum number of nodes per user: 40 nodes
    • Default time: 30 minutes
    • Max time: 2 hours
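    As a rough illustration, the short queue limits can be expressed as a quick check on a single job request. The names below are hypothetical, and the check looks at one job only. It does not track the per-user totals (20 jobs, 40 nodes) that the scheduler enforces.

```python
# Values taken from the short queue list above.
SHORT_QUEUE_MAX_NODES = 40
SHORT_QUEUE_DEFAULT_MINUTES = 30
SHORT_QUEUE_MAX_MINUTES = 120  # 2 hours


def fits_short_queue(nodes, minutes=None):
    """Return True if a single job request fits within the short queue limits.

    When no time is requested, the default of 30 minutes is assumed.
    """
    if minutes is None:
        minutes = SHORT_QUEUE_DEFAULT_MINUTES
    return (
        1 <= nodes <= SHORT_QUEUE_MAX_NODES
        and 0 < minutes <= SHORT_QUEUE_MAX_MINUTES
    )
```

    A 10-node job with no requested time fits, while a 10-node job asking for 3 hours would need the main partition instead.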

    Getting an Account

    For more details on Compy and for directions on getting an account on the machine, go to Compy – PNNL.
