2025-12-04 –, Progress
Artificial intelligence has been transformative for earth and environmental sciences: it is now a common instrument in the scientist’s toolbox. In meteorology, machine learning often achieves higher accuracy than traditional computational methods. Even in weather prediction, where complex numerical PDE-solving codes have seen decades of development, graph neural networks and transformer architectures have proven to produce more skillful forecasts at a fraction of the computational cost. Inspired by recent developments in generative modeling of textual data through large language models, several research groups have set out to design a foundation model for weather and climate, one that allows fine-tuning for specific objectives and benefits from a rich pre-trained latent space. The WeatherGenerator EU project aims to develop the leading European AI foundation model of the atmosphere. This model will be pre-trained with petabytes of multi-modal data (reanalyses, station observations, satellite products, …) on Europe’s first exascale-class supercomputers, ultimately keeping Europe’s global forecast capabilities at the forefront as we enter an era of democratized data-driven weather prediction.
In recent years, artificial intelligence has grown to be a ubiquitous tool in earth and environmental sciences. In meteorology and climate sciences, neural networks have proven to be the superior strategy for a multitude of data-driven tasks such as bias correction, downscaling and even nowcasting. More recently, weather prediction and data assimilation, traditionally the domain of large state-of-the-art numerical HPC codes, have also improved substantially through the use of graph neural networks or transformer architectures. As a result, the current best weather forecasts are obtained with models such as Google’s GraphCast and the ECMWF’s AIFS, both trained on the global reanalysis dataset ERA5. As a bonus, the inference rollout requires just a fraction of the computational cost of a traditional forecast.
Although machine learning outperforms traditional methods in these specific tasks, the question remains whether a unified core model, equipped with a rich latent space, opens the pathway towards improved predictive skill and increased flexibility. Several initiatives to build such a foundation model of the atmosphere have emerged and shown promising results. Within the EU project WeatherGenerator, we aim to construct a large, high-resolution foundation model for weather prediction and atmospheric climate modeling. We aim to combine a very large volume of reanalysis products, observational data and climate model output into a multi-channel transformer architecture that can easily be fine-tuned to execute common weather modeling and prediction tasks. The pre-training will be a technical feat that has to be executed on Europe’s exascale compute infrastructure. To substantiate the claim of being a foundation model, the project hosts many stakeholders that will re-implement existing ML applications with the WeatherGenerator model.
In this talk I will motivate this ambitious endeavour and outline the innovative ideas and techniques behind WeatherGenerator. I will briefly discuss some of the future applications and explain how the Netherlands eScience Center plans to bring this technology to potential stakeholders such as the European research community, public institutions and industry.
Gijs studied Theoretical Physics and Mathematics at Utrecht University. Thereafter, he did a PhD in Particle Physics at the Radboud University Nijmegen and Nikhef. Subsequently, he worked as a consultant in scientific software development on environmental models and hydrodynamical solvers at Deltares. Gijs joined the Netherlands eScience Center in 2016 and has primarily been involved in projects in weather, climate and hydrology. Gijs became head of the natural sciences & engineering section in 2024.