Improving Convection Trigger Functions Using Machine Learning
Background
General circulation models (GCMs) often produce rain too frequently and at lower intensity than observed. These deficiencies are especially evident in the simulated diurnal cycle of precipitation. Both problems are known to be closely related to the convection trigger function, the set of conditions a GCM's convective parameterization uses to decide whether convection is activated at a given time. Traditional triggers are ad hoc and carry large uncertainties because the mechanisms that initiate deep convection are not fully understood.
Science
In this study, scientists implemented a novel deep convection trigger function using the XGBoost method (Fig. 1), a state-of-the-art machine learning (ML) classification model. The ML trigger functions are trained on the long-term variationally constrained Atmospheric Radiation Measurement (ARM) forcing dataset (VARANAL) at two ARM sites: the Southern Great Plains (SGP) site in the central US and the Manaus (MAO) site in the Amazon basin. For SGP, eleven boreal summer seasons (June, July, August) from 1999 to 2009 are used. For MAO, two years of data from 2014 to 2015 are used, covering the Green Ocean Amazon (GoAmazon2014/15) field campaign. The ML models are evaluated both with separate training for each site and with joint (or unified) training that combines the data from both sites. The training dataset contains a number of large-scale predictors, including surface heat fluxes, surface temperature, relative humidity, convective available potential energy (CAPE), lifting condensation level, and convective inhibition, as well as the vertical profiles of temperature, specific humidity, wind shear, and advective tendencies.
The performance of the ML trigger is compared with four convective trigger functions commonly used in GCMs: dilute CAPE, undilute CAPE, dilute dynamic CAPE (dCAPE), and undilute dCAPE (Fig. 2). The ML trigger substantially outperforms the four CAPE-based triggers in terms of the F1 score, a metric widely used to evaluate ML classifiers. The site-specific ML trigger functions achieve F1 scores of 91% at SGP and 93% at MAO. The unified trigger (trained on the combined data from the SGP and MAO sites) also has a 91% F1 score, with virtually no degradation from the site-specific training, suggesting the potential of a global ML trigger function.
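For readers unfamiliar with the F1 score, the minimal sketch below shows how it could be computed for a threshold-style CAPE trigger. It is an illustration only, not the study's code: the function names, the 70 J/kg threshold, and the eight sample values are all hypothetical and are not taken from the ARM data or the paper.

```python
from typing import Sequence


def f1_score(observed: Sequence[bool], predicted: Sequence[bool]) -> float:
    """F1 = harmonic mean of precision and recall for the 'convection on' class."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)        # convection observed and triggered
    fp = sum(1 for o, p in pairs if not o and p)    # triggered, but no convection observed
    fn = sum(1 for o, p in pairs if o and not p)    # convection observed, but not triggered
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def cape_trigger(cape_values: Sequence[float], threshold: float = 70.0) -> list[bool]:
    """Hypothetical threshold trigger: convection fires when CAPE exceeds `threshold` (J/kg)."""
    return [cape > threshold for cape in cape_values]


# Made-up sample: CAPE in J/kg and whether deep convection was observed.
sample_cape = [10.0, 150.0, 300.0, 80.0, 900.0, 40.0, 500.0, 120.0]
sample_observed = [False, False, True, False, True, False, True, False]

sample_predicted = cape_trigger(sample_cape)
sample_f1 = f1_score(sample_observed, sample_predicted)
print(f"F1 score of the threshold trigger: {sample_f1:.2f}")
```

In this made-up sample the threshold trigger catches every observed event but also fires on three non-convective cases, so its recall is perfect while its precision is only one half. The F1 score penalizes that kind of over-triggering, which is why it is a natural metric for comparing triggers that activate convection too often.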
Impact
The ML trigger alleviates the GCM tendency to overpredict the occurrence of convection, offering a promising improvement to the simulation of the diurnal cycle of precipitation. To extract explicit knowledge from the black-box ML trigger functions, the scientists derived a series of augmented rules from them, which could be used to improve existing traditional CAPE-based triggers.
Publication
- Zhang, T, W Lin, A Vogelmann, M Zhang, S Xie, Y Qin, and J Golaz. 2021. “Improving Convection Trigger Functions in Deep Convective Parameterization Schemes Using Machine Learning.” Journal of Advances in Modeling Earth Systems 13(5). https://doi.org/10.1029/2020ms002365.
Funding
- This work was primarily supported by the Climate Model Development and Validation (CMDV) project and partially supported by the Energy Exascale Earth System Model (E3SM) project, funded by the US Department of Energy, Office of Science, Office of Biological and Environmental Research.
Contact
- Tao Zhang, Brookhaven National Laboratory