Best Paper Award at SC25: AI-driven Weather Prediction with ORBIT-2
Congratulations to E3SM members Dali Wang and Peter Thornton, E3SM ecosystem collaborators Xiao Wang and Dan Lu, and the entire ORBIT team on winning the 2025 Best Paper Award at SC25!
Using ORNL’s Frontier supercomputer, the team pushed the boundaries of what’s possible in AI-driven weather prediction with ORBIT-2, a next-generation Oak Ridge Base Foundation Model for Earth System Predictability.
The research delivers ultra-detailed forecasts, with inference that runs in milliseconds to under a second and R² of roughly 0.98–0.99 against observations on 7 km benchmarks. The key findings are summarized below.
Key takeaways
HPC impact
- Exascale ViT training: ORBIT-2 scales to 10B parameters on 65,536 GPUs on Frontier, sustaining up to 4.1 ExaFLOPS with 74–98% strong-scaling efficiency.
- Breaks the long-sequence barrier for vision: With TILES, ORBIT-2 processes up to 4.2B tokens, orders of magnitude beyond prior ViT sequence-length limits.
- Efficiency by design: Reslim (Residual Slim ViT) combines lightweight residual learning with Bayesian regularization, delivering up to 660× faster training (and large energy-efficiency gains) versus conventional ViTs on representative downscaling tasks.
E3SM impact
- Global hyper-resolution downscaling: Enables downscaling toward 0.9 km global resolution.
- High fidelity vs observations: Achieves R² ≈ 0.98–0.99 on 7 km benchmarks (variable-dependent).
- Fast inference: Enables near-real-time global downscaling (milliseconds to sub-second) on a small number of GPUs, supporting rapid evaluation and potential operational pathways.
Core technical innovations
- Reslim architecture: avoids expensive upsampling in the main ViT path while controlling uncertainty via residual learning + Bayesian regularization—maintaining accuracy at far lower cost.
- TILES algorithm: converts ViT attention scaling from quadratic → (effectively) linear via tilewise attention with halo padding, enabling massive parallelism and ultra-long sequences.
Deployment path
- E3SM ecosystem integration: ORBIT-2 supports E3SM’s goal of decision-relevant, high-resolution Earth system intelligence, and the team is actively integrating ORBIT-2 into the NVIDIA Earth-2 ecosystem to broaden downstream use and deployment.
Background: why downscaling matters for E3SM
The Energy Exascale Earth System Model (E3SM) and the broader Department of Energy (DOE) Earth system modeling community are increasingly asked to provide decision-relevant weather intelligence: localized extremes, risk metrics, and actionable information for energy systems, infrastructure resilience, water resources, and hazard preparedness. However, many operational and planning needs require fine spatial detail that is not directly available from global models or sparse observational networks.
Downscaling bridges this gap by translating coarser global information into high-resolution regional detail. Traditional dynamical downscaling can be computationally expensive, and many AI downscaling approaches struggle to generalize across regions/variables and often hit hard scalability limits—especially when based on Vision Transformers (ViTs), where standard self-attention scales quadratically with sequence length.
What is ORBIT-2?
ORBIT-2 (Oak Ridge Base AI foundation model for Earth System Predictability) is a scalable AI foundation model designed for global downscaling of Earth system variables to hyper-resolution, with an emphasis on both scientific fidelity and high-performance computing (HPC) scalability. In collaboration with AMD, ORBIT-2 was developed and demonstrated at extreme scale on the Frontier exascale system.
ORBIT-2’s overarching goal is to move from “one-off” downscaling models to a generalizable foundation model that can transfer across variables, geographies, and resolutions—while remaining computationally feasible as resolutions approach the kilometer scale.
Innovation 1: Reslim (Residual Slim ViT) for efficient, uncertainty-aware downscaling
A central challenge in downscaling is that it is an ill-posed inverse problem: many plausible fine-scale fields can correspond to the same coarse input. Prior ViT downscaling pipelines often upsample inputs to reduce uncertainty. Upsampling takes a low-resolution map and expands it into a higher-resolution one by adding many new data points in between existing ones. While this can reduce uncertainty, it increases token counts and makes ViTs prohibitively expensive.
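The cost of upsampling before tokenization can be made concrete with a little arithmetic. The sketch below takes the 28 km to 7 km refinement from the ERA5 example in this article and assumes a fixed patch size and dense self-attention. The function name and return keys are illustrative and not taken from the ORBIT-2 code.

```python
def token_growth(coarse_km: float, fine_km: float) -> dict:
    """Growth factors when a 2-D field is upsampled before tokenization.

    Assumes a fixed patch size, so the token count is proportional
    to the number of grid cells.
    """
    per_axis = coarse_km / fine_km      # refinement factor along each axis
    tokens = per_axis ** 2              # a 2-D grid refines along both axes
    attention_pairs = tokens ** 2       # dense attention compares every token pair
    return {
        "per_axis": per_axis,
        "tokens": tokens,
        "attention_pairs": attention_pairs,
    }


growth = token_growth(28.0, 7.0)
print(growth)
```

Under these assumptions, a 4× refinement per axis means 16× as many tokens and 256× as many attention comparisons, which is why upsampled inputs quickly become impractical for a standard ViT.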
Reslim takes a different route: it avoids costly upsampling in the main ViT path, operating on compressed representations to drastically reduce sequence length and compute, while still controlling uncertainty via:
- a lightweight residual learning path, and
- a Bayesian formulation / regularization to improve robustness and spatial consistency.
This architectural choice is a key reason ORBIT-2 can push toward hyper-resolution without the typical ViT cost explosion.
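The general idea of residual learning in downscaling can be sketched as follows. This is not the Reslim implementation. It assumes a cheap nearest-neighbor interpolation as the lightweight path and uses a placeholder `correction` callable where a trained network would go. All names are hypothetical, and the Bayesian regularization is not modeled here.

```python
from typing import Callable, List

Grid = List[List[float]]


def upsample_nearest(coarse: Grid, factor: int) -> Grid:
    """Repeat each coarse cell factor x factor times (cheap, parameter-free)."""
    fine: Grid = []
    for row in coarse:
        wide = [value for value in row for _ in range(factor)]
        fine.extend(list(wide) for _ in range(factor))
    return fine


def residual_downscale(
    coarse: Grid, factor: int, correction: Callable[[Grid, int], Grid]
) -> Grid:
    """Fine field = cheap baseline + learned correction (the residual)."""
    base = upsample_nearest(coarse, factor)
    delta = correction(coarse, factor)  # stand-in for the model's output
    return [[b + d for b, d in zip(b_row, d_row)] for b_row, d_row in zip(base, delta)]


def zero_correction(coarse: Grid, factor: int) -> Grid:
    """Placeholder 'model' that predicts no fine-scale detail."""
    rows, cols = len(coarse) * factor, len(coarse[0]) * factor
    return [[0.0] * cols for _ in range(rows)]


coarse = [[1.0, 2.0], [3.0, 4.0]]
fine = residual_downscale(coarse, 2, zero_correction)
print(fine)
```

With a zero correction the output falls back to the interpolated baseline, so the expensive model only has to learn the fine-scale detail the baseline misses, and it can do so from the coarse input rather than from an upsampled one.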
Innovation 2: TILES for tilewise, linear-scaling attention
Even with efficient designs, Vision Transformer models become prohibitively expensive on very long sequences, because standard self-attention compares every token with every other token. ORBIT-2 overcomes this by introducing TILES (Tile-wise Efficient Sequence Scaling), which breaks large spatial inputs into smaller tiles that can be processed independently. Attention is computed within each tile and then combined across tiles, while halo padding adds a small overlap of neighboring information so boundary effects are handled accurately. As a result, attention cost grows nearly linearly with input size rather than quadratically, and tiles can be processed in parallel across many GPUs.
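A simple cost model shows why restricting attention to tiles changes the scaling. The sketch below is not the TILES algorithm itself. It only counts token pairs, treats one grid cell as one token, and uses hypothetical tile and halo sizes. How TILES combines results across tiles is not modeled.

```python
def full_attention_pairs(height: int, width: int) -> int:
    """Token pairs compared by dense self-attention over the whole grid."""
    tokens = height * width
    return tokens * tokens


def tiled_attention_pairs(height: int, width: int, tile: int, halo: int) -> int:
    """Upper bound on pairs when attention is limited to each tile plus its halo."""
    if height % tile or width % tile:
        raise ValueError("this sketch requires the grid to divide evenly into tiles")
    num_tiles = (height // tile) * (width // tile)
    padded_tokens = (tile + 2 * halo) ** 2  # halo on every side, no edge clipping
    return num_tiles * padded_tokens * padded_tokens


def tile_bounds(size: int, tile: int, halo: int) -> list:
    """1-D (start, stop) ranges of each tile, widened by the halo and clipped to the grid."""
    return [
        (max(0, start - halo), min(size, start + tile + halo))
        for start in range(0, size, tile)
    ]


for side in (64, 128):
    print(side, full_attention_pairs(side, side), tiled_attention_pairs(side, side, 16, 2))
print(tile_bounds(8, 4, 1))
```

Doubling the grid side quadruples the token count. In this model the dense pair count grows 16× while the tiled count grows 4×, because the per-tile cost is fixed and only the number of tiles increases. Each tile is also independent, so tiles can be assigned to different GPUs.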
Exascale implementation and performance on Frontier
ORBIT-2 couples TILES (sequence scaling) with orthogonal parallelism strategies (e.g., data parallelism, sharding, tensor parallelism) to scale both sequence length and model size. In reported SC25 results, ORBIT-2 scaled to 10B parameters on 65,536 GPUs, sustaining up to 4.1 ExaFLOPS in BF16 with 74–98% strong-scaling efficiency on Frontier.
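For readers less familiar with these metrics, the following sketch shows how such figures are typically derived. The per-GPU number assumes the 4.1 ExaFLOPS peak was reached at the full 65,536-GPU count, which the article does not state explicitly. The timings in the strong-scaling example are made up purely for illustration.

```python
def per_gpu_tflops(total_exaflops: float, num_gpus: int) -> float:
    """Average sustained throughput per GPU, in TFLOPS."""
    return total_exaflops * 1e18 / num_gpus / 1e12


def strong_scaling_efficiency(
    base_gpus: int, base_time: float, gpus: int, time: float
) -> float:
    """Efficiency for a fixed problem size.

    Ideal scaling shrinks the runtime in proportion to the GPU count;
    efficiency is the ratio of that ideal runtime to the measured one.
    """
    ideal_time = base_time * base_gpus / gpus
    return ideal_time / time


average = per_gpu_tflops(4.1, 65_536)
print(f"{average:.2f} TFLOPS per GPU on average")

# Hypothetical timings, not measurements from the paper.
efficiency = strong_scaling_efficiency(base_gpus=1024, base_time=100.0, gpus=8192, time=14.0)
print(f"{efficiency:.1%} strong-scaling efficiency")
```

An efficiency of 100% would mean that using 8× the GPUs cut the runtime to exactly one eighth. Values below that reflect communication and other overheads.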
This matters to E3SM because it demonstrates a practical path to train and serve AI models on DOE leadership systems at the scale demanded by global Earth system applications, not just small research testbeds.
Science results: fidelity against observations
ORBIT-2 reports strong accuracy on 7 km downscaling benchmarks, with R² in the ~0.98–0.99 range depending on variable and evaluation setting, when compared against observational references used in the study.
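As a reminder of what this metric measures, here is a minimal sketch of the coefficient of determination. It assumes the common definition, one minus the ratio of residual to total sum of squares; the study may define or aggregate R² differently (for example, per variable or as a squared correlation). The sample values are invented.

```python
from typing import Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Invented values for illustration only.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.9]
score = r_squared(obs, pred)
print(round(score, 3))
```

A value of 1 means the prediction matches the reference exactly, while a value of 0 means it does no better than predicting the observed mean everywhere.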

Figure 3. Example downscaling from ERA5 precipitation at 28 km resolution to 7 km resolution. Watch the full 3.5-minute video: ERA5 input at 28 km and ORBIT-2 downscaled output at 7 km.
Community recognition
ORBIT-2 received the SC25 Best Paper Award, was selected as an ACM Gordon Bell Prize finalist in both 2024 and 2025, and won the HPCwire Supercomputing Achievement Award in 2024.
Path to impact: integration with NVIDIA Earth-2 and the E3SM ecosystem
Beyond research results, the team is actively working to integrate ORBIT-2 into the NVIDIA Earth-2 ecosystem to accelerate downstream accessibility, workflow integration, and evaluation in applied settings (e.g., near-real-time inference and scalable deployment patterns). This effort complements E3SM’s mission by connecting exascale-trained AI capability to broader user communities and decision workflows.
What’s next
Near-term directions include:
- Expanding the work into E3SM land modeling.
- Improving uncertainty characterization and bias handling for deployment-relevant settings.
Acknowledgments
This work was funded by the ORNL AI Initiative and supported by Frontier computing resources at the Oak Ridge Leadership Computing Facility and by collaborations spanning ORNL, AMD, and NVIDIA.
References
Xiao Wang, Jong-Youl Choi, Takuya Kurihaya, et al. 2025. https://doi.org/10.1145/3712285.3771989
Wang, Xiao, Aji, Ashwin, Choi, Jong, et al. 18 Oct. 2025. Web. doi:10.11578/dc.20251018.1.
Contact
- Xiao Wang: wangx2@ornl.gov
- Dali Wang: wangd@ornl.gov
- Peter Thornton: thorntonpe@ornl.gov
- Dan Lu: lud1@ornl.gov

