Machine Learning Helps Improve Groundwater Representation in Earth System Models

Riparian vegetation sustained by shallow groundwater. Image courtesy of Camerafiend, own work, CC BY-SA 3.0 (File:BosqueNM.jpg, Wikimedia Commons).
Calibration of groundwater table depth revealed tradeoffs in modeling runoff, soil moisture, and land–atmosphere feedbacks.
The Science
Groundwater table depth (GWTD) influences how water moves through soils, rivers, and ecosystems, and it affects exchanges of water and energy between land and atmosphere. However, Earth system models (ESMs) often struggle to represent groundwater accurately at global scales because groundwater interacts with soil moisture, runoff, evapotranspiration, and atmospheric processes across multiple spatial and temporal scales.
The Impact
This study provided a systematic test of how machine learning (ML)-based calibration can improve groundwater simulations in an ESM, and it revealed the limits of single-variable calibration strategies. The work demonstrated that ML can efficiently calibrate model parameters that would otherwise require computationally expensive global simulations. However, improving GWTD alone did not necessarily improve related hydrologic variables such as runoff, soil moisture, or evapotranspiration, which underscored the importance of interactions among hydrologic processes in ESMs. Model responses also differed between offline (that is, uncoupled or stand-alone) land simulations and coupled land–atmosphere simulations, where feedbacks amplified the influence of groundwater calibration. Together, these findings indicated that calibration strategies targeting a single hydrologic variable may not improve overall model behavior, and they pointed to the need for integrated calibration approaches that account for multiple hydrologic processes simultaneously in coupled land–atmosphere models.
Summary

Figure 1. Model groundwater table depth (GWTD) performance comparison across five continents: (top) uncalibrated versus Fan2013 GWTD, (middle) calibrated versus Fan2013 GWTD, and (bottom) histogram comparison of uncalibrated (blue), calibrated (cyan), and Fan2013 (pink) GWTD data distributions. Calibration substantially improved agreement with observed groundwater depth distributions, particularly in regions where groundwater occurs within several meters of the land surface (notice the close fit of points to the lower left of the identity line in the middle row plots). NSE and KGE refer to the Nash-Sutcliffe model efficiency, a measure of predictive skill, and the Kling-Gupta efficiency, a measure of goodness-of-fit, respectively.
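For readers unfamiliar with the two skill scores named in the Figure 1 caption, the following is a minimal Python sketch of their standard textbook definitions. It assumes the original Kling-Gupta formulation (correlation, variability ratio, and bias ratio); the study may use a different variant, and the function names here are illustrative rather than taken from the paper or from ELM.

```python
import numpy as np

def nse(sim, obs):
    """Nash-Sutcliffe efficiency: 1 is a perfect match, 0 is no better
    than always predicting the mean of the reference data."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return 1.0 - np.sum((sim - obs) ** 2) / np.sum((obs - obs.mean()) ** 2)

def kge(sim, obs):
    """Kling-Gupta efficiency (original three-component form, assumed here):
    combines correlation, variability ratio, and bias ratio; 1 is perfect."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]   # linear correlation
    alpha = sim.std() / obs.std()     # ratio of standard deviations
    beta = sim.mean() / obs.mean()    # ratio of means (bias)
    return 1.0 - np.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

In this sketch, `sim` would hold simulated GWTD values and `obs` the corresponding Fan2013 values at the same grid cells. Both scores equal 1 when the two arrays are identical.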
In this study, researchers developed an ML framework to iteratively calibrate GWTD in the Energy Exascale Earth System Model (E3SM) Land Model (ELM). The approach used surrogate modeling to emulate the relationship between model parameters and simulated GWTD, allowing efficient exploration of the parameter space. The calibration targeted five parameters related to surface runoff and subsurface drainage processes and used the global Fan2013 groundwater model dataset, which is constrained by extensive observations, as a benchmark. Model performance in 30-year global ELM simulations at 0.5° resolution was evaluated against multiple observation-based datasets for runoff, soil moisture, evapotranspiration, and baseflow.
The ML calibration substantially improved simulated GWTD, particularly within the 1–5 meter depth range, where groundwater influences land surface energy fluxes (Fig. 1). However, the improved GWTD representation did not consistently improve other hydrologic variables such as soil moisture, runoff, or evapotranspiration in offline model simulations. In some regions, runoff and soil moisture simulations became less accurate, and the model tended to overestimate the baseflow index. The results showed that improvements in GWTD were strongest in humid regions where groundwater had limited influence on surface processes. In other words, although GWTD improved most within the depth range where groundwater affects land surface energy fluxes, those improvements occurred mainly in geographic regions that were not strongly water-limited, which limited their effect on other variables. Comparisons between offline land simulations and coupled land–atmosphere simulations further showed that groundwater calibration had stronger impacts when atmospheric feedbacks were included.
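The general idea of iterative surrogate-based calibration can be illustrated with a toy Python sketch. Everything in it is an assumption made for illustration: `toy_model` is a hypothetical stand-in for an expensive ELM run, the quadratic least-squares surrogate is a deliberately simple substitute for whatever ML emulator the study used, and the parameter ranges, benchmark values, and sample sizes are invented. Only the count of five calibrated parameters comes from the text above.

```python
import numpy as np

rng = np.random.default_rng(0)
N_PARAMS, N_CELLS = 5, 20  # five parameters as in the study; 20 pseudo grid cells (hypothetical)
W = rng.uniform(0.5, 2.0, size=(N_CELLS, N_PARAMS))

def toy_model(theta):
    """Hypothetical stand-in for an expensive land model run:
    maps parameters in [0, 1]^5 to GWTD (m) at each pseudo grid cell."""
    theta = np.asarray(theta, dtype=float)
    return 1.0 + W @ theta**2 + 0.2 * np.sin(4.0 * theta).sum()

# Invented benchmark GWTD, playing the role of the reference dataset.
benchmark = toy_model([0.2, 0.7, 0.4, 0.9, 0.1])

def misfit(gwtd):
    """Root-mean-square error against the benchmark, per parameter set."""
    return np.sqrt(np.mean((gwtd - benchmark) ** 2, axis=-1))

def features(thetas):
    """Quadratic features (no cross terms) for the least-squares surrogate."""
    thetas = np.atleast_2d(thetas)
    return np.hstack([np.ones((len(thetas), 1)), thetas, thetas**2])

def calibrate(n_init=20, n_iter=5, n_candidates=5000):
    # Initial design: a small number of "expensive" model runs.
    thetas = rng.uniform(0.0, 1.0, size=(n_init, N_PARAMS))
    outputs = np.array([toy_model(t) for t in thetas])
    history = [misfit(outputs).min()]
    for _ in range(n_iter):
        # Fit the surrogate: parameters -> simulated GWTD at every cell.
        coef, *_ = np.linalg.lstsq(features(thetas), outputs, rcond=None)
        # Cheaply screen many candidates with the surrogate.
        candidates = rng.uniform(0.0, 1.0, size=(n_candidates, N_PARAMS))
        predicted = features(candidates) @ coef
        best = candidates[np.argmin(misfit(predicted))]
        # Verify the most promising candidate with one real model run,
        # then add it to the training set for the next iteration.
        thetas = np.vstack([thetas, best])
        outputs = np.vstack([outputs, toy_model(best)])
        history.append(misfit(outputs).min())
    return thetas[np.argmin(misfit(outputs))], history

best_theta, history = calibrate()
```

Here `history` records the best misfit found so far after each iteration, so it can only stay level or decrease. The sketch targets GWTD alone, which mirrors the single-variable setup whose limits the study examined; it makes no attempt to evaluate runoff, soil moisture, or evapotranspiration.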
Publication
- Fang, Y. and Leung, L. R. Machine learning calibration of groundwater table depth in ELM: Impact on land surface hydrology and land–atmosphere fluxes. Journal of Advances in Modeling Earth Systems, 18, e2025MS005184. [DOI: 10.1029/2025MS005184]
Funding
- This research was supported as part of the Energy Exascale Earth System Model (E3SM) project, funded by the U.S. Department of Energy (DOE), Office of Science, Biological and Environmental Research, as part of the Earth System Model Development (ESMD) program area. The Pacific Northwest National Laboratory is operated for DOE by Battelle Memorial Institute under contract DE‐AC06‐76RL01830.
Contact
- L. Ruby Leung, Pacific Northwest National Laboratory
- Yilin Fang, Pacific Northwest National Laboratory
This article is part of the E3SM “Floating Points” newsletter. To read the full newsletter, see:
- E3SM Floating Points, May ’26: Looking Forward to the Summer All-Hands