E3SM “Swipes Right” on Machine Learning

  • May 15, 2025
  • Feature Story
  • There are many promising avenues for AI/ML to impact the E3SM mission. Shown here are examples from downscaling data (top) and using emulators to generate large ensembles (bottom).

    The Earth system model development enterprise in DOE's Biological and Environmental Research (BER) program is enthusiastically exploring a potential long-term relationship with Machine Learning (ML) and Artificial Intelligence (AI).

    The motivation is clear. We are all seeing the potential of AI/ML in our personal lives, whether generative AI like ChatGPT, facial recognition within our photo libraries, or addictive curation of our information feeds. Scientific advancements enabled by ML also catch the eye, from the protein structure prediction of AlphaFold to the DOE highlight of using ML to help guide National Ignition Facility (NIF) experiments towards ignition. Closer to home, the weather forecast community has seen very impressive results, with emulators trained on historical data now competing with physics-based models.

     

    How is the E3SM community actively exploring ML?

    In preparation for the recent E3SM All-hands project meeting, the project solicited the community for ongoing ML efforts that are targeting the E3SM code base or workflow. This was not a rigorous survey – only E3SM project members were solicited, and delineations of what constitutes “ML” and “targeting E3SM” were not adjudicated – but the responses were still very enlightening.

    Altogether, 41 distinct efforts of ML targeting E3SM were reported, with involvement from all eight E3SM labs. Table 1 shows the results of the survey.


    Table 1. The table shows the 41 reported AI/ML efforts targeting E3SM, with brief titles of the scientific work, the technical point-of-contact, the targeted E3SM component, and the project or program funding the effort. The final column classifies the AI/ML Usage, with shorthand names that refer to the five pillars of the E3SM AI/ML Strategy, which will be presented below. Note: this table includes all of the efforts E3SM staff reported in the survey; those leading efforts not listed here, but which could be part of an E3SM release, are invited to contact Andy Salinger (agsalin@sandia.gov) to be added to the list.

    Examination of the table leads to some interesting findings:

    • Funding source: LDRD (Lab Directed Research and Development) at 6 labs (13); E3SM Project (11); The Office of Science (SC) SciDAC program (6); other BER EESM (Earth and Environmental Science Modeling) projects (5); Other DOE SC such as Early Career awards, ASCR (Advanced Scientific Computing Research), and FAIR (6)
    • Targeted Component: Atmosphere (21); Land (9); Land Ice (3); Ocean (3); Data Downscaling (2); Coupled system (1); Sea Level (1); River (1)
    • ML Usage: Process emulation (21); Calibration (7); Model emulation (5); Analysis/Workflow including Downscaling (6); Initialization (2)

    It is notable that six DOE labs are choosing to make Lab Directed Research and Development (LDRD) investments in this area. As one Lab leader describes it: with LDRD the Labs have a mandate to “skate to where the puck will be.” So, clearly ML is where many are seeing the strategic opportunities.

    What is the present state of ML impacting E3SM capabilities?

    The relationship between E3SM and ML is in its early stages. There are not yet any machine-learned data-driven processes in production versions of E3SM. AI-based parameter tuning is more mature; so-called autotuning was used to create realistic configurations of E3SMv3 with high and low sensitivity to CO2 increases as part of the phase 3 Water Cycle campaign. Autotuning has also been explored for the Simple Cloud-Resolving E3SM Atmosphere Model (SCREAM) and the E3SM Land Model, for choosing parameters to better capture the QBO (Quasi-Biennial Oscillation) in the E3SM Atmosphere Model, and for picking efficient solver settings in the MALI Land Ice model. We have also made advances in using ML to emulate individual components or the fully-coupled model, with a timestep-by-timestep emulator of the E3SMv3 atmosphere model to be released soon and a coupled E3SMv3 emulator under development.

    Open research and implementation questions remain about where ML is the appropriate tool to advance E3SM’s missions. Most of these are not unique to Earth system modeling and are being faced broadly by the Scientific Machine Learning community.

    • Will ML emulators trained on current and historical data capture future conditions seen in the Earth system? (ML models generally extrapolate poorly beyond the range of their training data.)
    • Will ML process emulators trained offline – even when they fit the training data very well – lead to stable and accurate predictions when incorporated in the fully-coupled system model?
    • Even if an ML emulator provides excellent predictions, what scientific understanding can we extract about the physical processes it represents?
    • How can AI/ML accelerate and improve the robustness of the larger scientific workflow, where E3SM is a part of a larger ecosystem of models? This may include tools to ensure consistency and compatibility of data sets.
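The extrapolation concern in the first question above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the "true process" is a sine curve, and a degree-5 polynomial fit stands in for a trained ML emulator. Neither is taken from E3SM or from any of the efforts in Table 1.

```python
import numpy as np

def truth(x):
    # Hypothetical "true" process, chosen only for illustration.
    return np.sin(x)

def fit_emulator(x_train, y_train, degree=5):
    # A polynomial least-squares fit stands in for an ML emulator here.
    return np.polynomial.Polynomial.fit(x_train, y_train, degree)

# Train only on conditions seen so far: x in [0, pi].
x_train = np.linspace(0.0, np.pi, 50)
emulator = fit_emulator(x_train, truth(x_train))

# Evaluate inside the training range and well outside it.
x_in = np.linspace(0.0, np.pi, 200)
x_out = np.linspace(2.0 * np.pi, 3.0 * np.pi, 200)

err_in = float(np.max(np.abs(emulator(x_in) - truth(x_in))))
err_out = float(np.max(np.abs(emulator(x_out) - truth(x_out))))

print(f"max error inside training range:  {err_in:.2e}")
print(f"max error outside training range: {err_out:.2e}")
```

The emulator tracks the toy process closely where it has seen data and drifts far from it elsewhere, which is the behavior the first question asks about for future climate conditions.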

    How is the E3SM project preparing for a future with ML?

    The E3SM leadership team is undertaking strategic planning to guide the E3SM project’s goals and priorities over the next 10 years. While the plan is not yet finalized, we can give a preview here of the project’s AI/ML strategy, which was informed by the survey results in Table 1.

    E3SM’s AI/ML strategy consists of five pillars. Each pillar is distinct in its goals and implementation details. Taken together, all five pillars are necessary to excel at being a DOE model for DOE missions on DOE computers.

    1. Full Model Emulation: We will develop emulators for our physics-based E3SM code releases which will capture the model behavior and statistics for many downstream scientific investigations, several orders of magnitude faster than running the model itself.
    2. Initial Conditions: As we prepare for missions on the subseasonal to decadal time frame, the utility of the model for these shorter time horizons will depend increasingly on accurate initial conditions. ML can accelerate both data assimilation to match observations and the spin-up (equilibration) of our components.
    3. Data-driven Process Emulation: The bias in E3SM predictions is largely due to incomplete representation of sub-grid physical processes. ML emulators of these processes – trained against either observations or offline high-fidelity process models – have the promise to improve the fidelity of E3SM predictions.
    4. Calibration: Uncertain parameters in our process representations are adjusted to decrease biases in the model predictions against observational data sets. ML surrogates can accelerate optimization-based approaches to select parameters.
    5. Workflow Automation: AI has the potential to accelerate the scientific workflow of E3SM and its partner programs, just as we see AI improving productivity across the economy. Opportunities include creating an integrated assessment workflow, automating downscaling, generating documentation, and analyzing output.
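As a rough illustration of the Calibration pillar, the sketch below fits a cheap surrogate to a handful of "expensive" model evaluations and then searches the surrogate for the parameter value that minimizes bias. The function `expensive_model_bias`, the single parameter `p`, its range, and the quadratic surrogate are all hypothetical stand-ins; the autotuning methods actually used for E3SM are not described in this article and may differ substantially.

```python
import numpy as np

def expensive_model_bias(p):
    # Hypothetical stand-in for running the full model with parameter p
    # and measuring its bias against observations. Minimum is at p = 0.7.
    u = p - 0.7
    return u**2 + 0.1 * u**4

# Step 1: run the "expensive" model at only a few parameter values.
p_samples = np.linspace(0.0, 1.5, 8)
bias_samples = expensive_model_bias(p_samples)
n_expensive_runs = len(p_samples)

# Step 2: fit a cheap surrogate (here a quadratic) to those samples.
surrogate_coeffs = np.polyfit(p_samples, bias_samples, 2)

# Step 3: search the surrogate densely; each evaluation is nearly free.
p_grid = np.linspace(0.0, 1.5, 1501)
surrogate_bias = np.polyval(surrogate_coeffs, p_grid)
p_best = float(p_grid[np.argmin(surrogate_bias)])

print(f"expensive runs used: {n_expensive_runs}")
print(f"calibrated parameter: {p_best:.3f}")
```

The design point is that the dense search touches only the surrogate, so the number of full-model runs stays fixed at the small training sample. In practice a calibrated value found this way would still be checked with a run of the full model.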

    In Table 1, the final column maps each existing ML effort to one of these five pillars.

    While we can’t predict exactly where we will be in 10 years, it will certainly be interesting to look back on these exciting and uncertain early stages of the relationship between E3SM and AI/ML.
