Heroic Bug Fixes – The Mysterious Case of Missing Clouds
Bugs are an inevitable part of any complex software project, and E3SM is no exception. A lot of time goes into finding and fixing bugs, and the resulting impacts can rival those of major parameterization changes, yet these efforts and their impacts frequently go unreported. Heroic Bug Fixes is a recurring column that celebrates the critical yet often overlooked work of debugging. We hope that by shining a well-deserved spotlight on this work we can inspire further debugging efforts and provide the broader E3SM community with timely information about changes that could aid their own development and investigations.
Early results from major SCREAM/EAMxx campaigns, including the Cess experiments and the decadal run, uncovered a number of puzzling features. Among them were 2-meter temperatures over the subtropics that bore little resemblance to the prescribed sea surface temperatures and an unusually strong land–sea temperature contrast. At first, the EAMxx evaluation team suspected these quirks were rooted deep within the model’s parameterizations, a notoriously tricky place to look for problems. As it turned out, the real culprit was lurking far outside the usual suspects, elusive and tucked away where no one was looking.
Months after these oddities came to light, another mystery surfaced, one that initially seemed unrelated. As part of a separate study, the EAMxx team set out to investigate the model’s sensitivity to resolution in low cloud regimes. To isolate the issue more effectively, they turned to the efficient doubly periodic configuration of EAMxx (DP-EAMxx). They set up a series of idealized simulations from the CGILS experiment, covering three classic low-cloud regimes: stratus, stratocumulus, and cumulus. But right out of the gate, something was off. The stratus case, expected to show a thick blanket of persistent clouds, came up completely empty. Not a single cloud in sight. What followed was a months-long hunt for the missing clouds.
Suspect number one: the DP-EAMxx forcing routines, recently rewritten in C++/Kokkos. A fresh code path is always a prime candidate for bugs, and this seemed like the logical place to start. But after weeks of combing through the logic, inputs, and outputs, this trail went cold. Next on the list was SHOC, the model’s boundary layer parameterization, which significantly influences low cloud regimes. Although SHOC had shown consistent performance in past studies for this regime, its seemingly sudden failure to produce clouds in such a well-understood case was unsettling. The team tuned SHOC, twisted knobs, adjusted routines—but the sky remained clear. No matter what they threw at the problem, the clouds refused to materialize.
On the brink of declaring defeat and concluding that DP-EAMxx simply couldn’t handle the case, the researchers got the break they’d been hoping for. A test simulation, this time prescribing surface sensible and latent heat fluxes derived from Large Eddy Simulation (LES)-mean values, finally yielded a realistic, cloud-filled solution. And just like that, attention snapped to surface coupling. Zooming in on the first few hours of the simulation and outputting fields every timestep, the team spotted strange 2-meter temperatures, values that could not be reconciled with either the Sea Surface Temperature (SST) or the near-surface air temperature.
That was the crack in the case. It turned out that EAMxx had been feeding the surface coupler an inconsistent bottom-layer potential temperature, a key variable in computing surface fluxes over ocean columns. And the source? Ironically, this inconsistency stemmed from a well-intentioned EAMxx effort to unify the Exner function across all internal atmospheric processes. However, this logic needed a special case when it came to the surface coupling interface. Without it, the surface fluxes, and therefore the cloud evolution, were off, and sometimes in a big way.
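To make the idea concrete, here is a minimal sketch of how one bottom-layer temperature can yield two different potential temperatures depending on how the Exner function is defined. The column does not spell out the exact form of the inconsistency, so the sketch assumes, purely for illustration, that the mismatch lies in the reference pressure: a fixed 1000 hPa versus the local surface pressure. The function names, constants, and sample values are all hypothetical and are not taken from the EAMxx source.

```python
# Standard textbook constants for dry air (assumed values, not read from EAMxx).
RD = 287.04     # gas constant for dry air, J/(kg K)
CP = 1004.64    # specific heat at constant pressure, J/(kg K)
P0 = 100000.0   # conventional reference pressure, Pa


def exner(p, p_ref):
    """Exner function (p / p_ref)^(Rd / cp) for pressure p and reference p_ref."""
    return (p / p_ref) ** (RD / CP)


def potential_temperature(temp, p, p_ref):
    """Potential temperature of air at (temp, p) relative to p_ref."""
    return temp / exner(p, p_ref)


# Hypothetical bottom-layer state over an ocean column.
t_bot = 288.0      # bottom-layer air temperature, K
p_bot = 100800.0   # bottom-layer mid-point pressure, Pa
p_sfc = 101500.0   # surface pressure, Pa
sst = 288.5        # sea surface temperature, K

# Same air parcel, two Exner conventions (the second is the assumed "special case").
theta_p0 = potential_temperature(t_bot, p_bot, P0)
theta_sfc = potential_temperature(t_bot, p_bot, p_sfc)

# A bulk flux formula is driven by the air-sea difference, so the convention matters.
delta_p0 = theta_p0 - sst
delta_sfc = theta_sfc - sst

print(f"theta (ref = 1000 hPa):         {theta_p0:.2f} K, theta - SST = {delta_p0:+.2f} K")
print(f"theta (ref = surface pressure): {theta_sfc:.2f} K, theta - SST = {delta_sfc:+.2f} K")
```

With these made-up numbers the two conventions differ by roughly a kelvin, enough to flip the sign of the air-sea temperature difference that a bulk sensible heat flux calculation would see. This is why a mismatch of this kind at the coupling interface can plausibly swing surface fluxes, and with them low-cloud evolution.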
The fix? A one-line correction. But its impact was big. The clouds returned in the CGILS case. The 2-meter subtropical temperature anomaly seen in the Cess and decadal simulations vanished (Fig. 1). The exaggerated land–sea contrast faded. Even coastal stratocumulus, long a thorn in the side of many Earth system models, looked better. This was more than just a bug fix; it was a reminder that sometimes the hardest problems aren’t where you expect them to be, and that process-level testing, even in idealized setups, can uncover hidden issues lurking deep within complex systems.
And so, the case of the missing clouds was finally closed… until the next bug crawls in.

Figure 1. (top row) Two-meter temperature bias of EAMxx run at ne256 (~12 km) resolution for the (left) control model and (right) model configuration that contains the bug fix. (bottom row) Difference between the bug-fix and control simulations. Note that the bias reduction is nearly global and is strongest over the subtropical low-cloud regions, high latitudes, and land. However, with the bug fix the bias becomes worse over the tropical Pacific warm pool.
