v1 Infrastructure

Under Phase I, the Software Engineering and Coupler Group was responsible for the source code repository, the case management or model configuration system, the build system, the testing process and coverage, the maintenance of the code on development and production platforms, the project collaboration tools, the coupling and parallel IO libraries, and the software engineering design of the code. Unlike other groups in Phase I, this effort had no counterpart in a previous DOE climate project and was created from scratch.

The group focused on the following areas:

git/GitHub

We decided to use the git version control tool and to host the E3SM source code repositories on GitHub. We focused the project on the fully coupled model, keeping almost the entire code base in a single repository while still allowing standalone component model simulations. The gitworkflow development workflow (with topic branches, an integration branch for testing, and a stable master branch) has given us a robust code development process.

CIME Infrastructure

We partnered with the CESM Software Engineering Group (CSEG) to develop a Python-based toolset for both models called the Case Control System (CCS), which is brought into both models as part of the Common Infrastructure for Modeling the Earth (CIME) component. CIME also includes the coupler, the data models, and the utility libraries used by CESM and E3SM. We completed a full refactoring of the CCS in CIME v5, which largely retains the command-line interfaces of the old CESM scripts but is far more robust. The new version of CIME is significantly improved in testability, readability, maintainability, and extensibility.

Testing

Figure: A portion of our web-based dashboard showing the latest results from several E3SM test suites on several platforms.

The only scalable way to perform code development with a large, dispersed team is with automated testing. We now have a test suite with 60 system tests that run nightly (or bi-weekly) on 11 separate machines, including all large DOE machines (LCFs and NERSC) where we do production runs. We automatically test both our “next” branch of candidate code changes and our trusted “master” branch, and post the results to a CDash dashboard (see figure for sample output). Our goal was to have full testing occur overnight on all key platforms, so that we could integrate one set of code changes with confidence each day. To achieve this with acceptable code coverage, we introduced ultra-low-resolution models that are scientifically useless but can still detect numerical changes in the code or loss of restart functionality. Under the associated CMDV-SM project, we are making progress on targeted unit testing and on a Solution Reproducibility test that can distinguish round-off-level changes from climate-changing changes in the code or environment.
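The two checks mentioned above, detecting loss of restart functionality and separating round-off-level differences from larger ones, can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the E3SM test suite: the "model" is a one-variable logistic map standing in for a time step, and the names `step`, `restart_test`, `classify_change`, and the tolerance `rel_tol` are all hypothetical. A fixed tolerance is also only a stand-in; in a chaotic model, round-off differences grow over time, so a real reproducibility test would need a more careful comparison.

```python
import math


def step(state, n_steps):
    """Advance a toy model (a logistic map, assumed here as a stand-in) by n_steps."""
    x = state
    for _ in range(n_steps):
        x = 3.7 * x * (1.0 - x)
    return x


def restart_test(initial, total_steps, restart_at):
    """Return True if a stop-and-restart run matches an uninterrupted run bit for bit."""
    continuous = step(initial, total_steps)
    checkpoint = step(initial, restart_at)  # state saved at the restart point
    restarted = step(checkpoint, total_steps - restart_at)
    return continuous == restarted


def classify_change(baseline, candidate, rel_tol=1e-12):
    """Label a candidate result as 'bit-for-bit', 'round-off', or 'different'.

    rel_tol is an illustrative threshold, not a value taken from any E3SM test.
    """
    if len(baseline) != len(candidate):
        raise ValueError("baseline and candidate must have the same length")
    pairs = list(zip(baseline, candidate))
    if all(b == c for b, c in pairs):
        return "bit-for-bit"
    if all(math.isclose(b, c, rel_tol=rel_tol) for b, c in pairs):
        return "round-off"
    return "different"
```

In this sketch, a restart test fails as soon as the restarted run diverges from the continuous run in any bit, which is why even a scientifically useless low-resolution configuration is enough to catch the regression.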

Collaboration tools

The SE/CPL group selected the collaboration and project management tools used to manage our more than 100 team members across numerous locations. Developing collaboration pages in Confluence has been extremely successful. We have pages for discussion, meeting notes, decision logs, and documentation. The Slack messaging application is heavily used by the SE Group and by developers from other E3SM Ecosystem projects for discussions and coordination that do not need to be archived. For discussions related to the code base, such as bug fixes, enhancements, and refactoring, we use the GitHub Issues system, with 1800 issues created across our E3SM and CIME repositories. GitHub Pull Requests, in which a developer requests a review of code before it is pulled into the main repository, have been used 2600 times across these two repositories. JIRA software is used for project management.

Coupler/Driver

The source code that couples the different components together and drives the outer time step forward is the responsibility of the E3SM SE/CPL Group. It was co-developed with CSEG as part of CIME. Our contributions were maintenance and minor extensions to the driver and coupler code. We updated the underlying Model Coupling Toolkit (MCT) library to a newer version incorporating improvements from the performance group. Resources to write a new coupler based on the Mesh Oriented datABase (MOAB) library were allocated under the CMDV-SM project, and that effort is expected to deliver an improved coupler in late 2018 for use in v2 simulations.
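To make the driver's role concrete, the following toy sketch shows an outer time loop that exchanges fields between two components and then advances each one. It is an illustration only and does not reflect the CIME driver or the MCT API: the `Component` class, its `imported` field, the relaxation update, and the `drive` function are all hypothetical.

```python
class Component:
    """Toy component holding one scalar field (hypothetical, not a CIME class)."""

    def __init__(self, name, value):
        self.name = name
        self.value = value
        self.imported = 0.0  # field most recently received from the coupler

    def run(self, dt):
        # Relax this component's state toward the value it received.
        self.value += dt * (self.imported - self.value)


def drive(atm, ocn, n_steps, dt):
    """Outer time loop: exchange fields, then run each component for one step."""
    for _ in range(n_steps):
        # Coupling step: each component receives the other's current state.
        atm.imported, ocn.imported = ocn.value, atm.value
        atm.run(dt)
        ocn.run(dt)
    return atm.value, ocn.value
```

The exchange happens before either component runs, so both see a consistent snapshot of the other's state for that step. A real coupler also has to regrid and merge fields between different meshes, which this sketch omits.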

PIO

Work to mature version 2 of the Parallel I/O (PIO) library as a replacement for PIO1 is underway, and we are close to having all tests pass with this rewritten version.
