by Daan Crommelin (CWI; Korteweg-de Vries Institute for Mathematics, University of Amsterdam), Wouter Edeling (CWI) and Fredrik Jansson (CWI)
The atmosphere and oceans are key components of the climate system, each involving a wide range of space and time scales. Resolving all relevant scales in numerical simulations with atmosphere/ocean models is computationally not feasible. At CWI, we are tackling this longstanding multiscale challenge by developing new algorithms, including data-based and stochastic methods, to represent small, unresolved scales.
Simulating the climate system on a computer presents a formidable challenge. A major difficulty is the multiscale nature of the key components of the climate system - the atmosphere and oceans. They possess physical and dynamic processes that occur across a range of spatial and temporal scales. Some aspects of the global atmospheric circulation operate at the planetary scale, on the order of 10^4 km, yet the circulation is also significantly affected by atmospheric convection and cloud formation, processes taking place at scales of order 10-100 m. Similarly disparate scales play a role in oceanic circulation.
Resolving all these scales at once in numerical simulation is computationally unfeasible. Therefore, global models employ simplified representations, or “parameterizations” of the effect that the unresolved processes have on the resolved-scale processes. Formulating such parameterizations is difficult, and the limitations of common, existing methods of doing so are well-known. The uncertainties and errors of parameterizations are a major source of uncertainty in climate change simulations (e.g., through uncertainties in the cloud-climate feedback); see also  for more background information and references.
New methods and approaches for parameterization are vital. In the Scientific Computing group at CWI, we are working on this topic along two related research lines. One is focused on superparameterization, a computational approach to multiscale modelling and simulation of atmosphere and ocean, in which high-resolution local models (i.e., models that each cover only a small area) are nested in the model columns (vertically stacked numerical discretization boxes) of a coarse-resolution global model that covers the entire earth. Importantly, the nesting is two-way: the global model state drives the local models, while the local models also feed back onto the global model. Superparameterization effectively replaces traditional parameterizations based on physical insights and intuition by a computational model based on first principles.
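The two-way nesting described above can be illustrated with a toy coupling step. This is only a minimal sketch under stated assumptions, not the coupling used in any particular model: the function name `superparam_step`, the callbacks `global_tendency` and `local_step`, and the use of a single scalar per global column are all hypothetical. The global state forces each local model towards the column value, and the change in the local horizontal mean is fed back to the global model.

```python
import numpy as np

def superparam_step(global_state, local_states, dt, global_tendency, local_step):
    """One coupled time step of a toy two-way nested model (illustrative only).

    global_state : 1-D array, one value per global model column
    local_states : list of 1-D arrays, one high-resolution field per column
    global_tendency, local_step : hypothetical user-supplied callbacks
    """
    # 1. Advance the coarse global model using only its resolved dynamics.
    provisional = global_state + dt * global_tendency(global_state)

    feedback = np.empty_like(provisional)
    new_locals = []
    # The local models never interact with each other, so this loop could
    # run in parallel over columns.
    for i, local in enumerate(local_states):
        # 2. Global -> local: forcing that pulls the local horizontal mean
        #    towards the provisional global value in this column.
        forcing = (provisional[i] - local.mean()) / dt
        # 3. Advance the local model with that forcing plus its own physics.
        local_next = local_step(local, forcing, dt)
        # 4. Local -> global: tendency due to the locally resolved processes.
        feedback[i] = (local_next.mean() - provisional[i]) / dt
        new_locals.append(local_next)

    return provisional + dt * feedback, new_locals
```

In this sketch, a local model without any physics of its own (one that only applies the forcing) produces zero feedback, so the global model is left unchanged; any departure from that is the contribution of the explicitly resolved small scales.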
Superparameterization is computationally very expensive, as in principle it would involve high-resolution local models that collectively cover the entire earth (one local model nested within each global model column). The set-up is very well suited for massive parallelization, because the local models do not interact directly with each other, only with the global model. Nevertheless, it is still much too expensive to run roughly 10^5 local models in parallel, each with a horizontal domain of, say, 25 km x 25 km and a grid resolution of 50 m so that they can resolve atmospheric convection and cloud formation explicitly. To reduce computational costs, previous superparameterization studies have reduced either the grid resolution or the domain size of the local models. In a joint project between CWI, the Netherlands eScience Center and Delft University of Technology, we have taken a different approach and developed a method in which the local models are only nested in a selected geographical region [1]. The selection is flexible and made by the user, based on factors such as research interest and available computational resources. Outside the selected region, traditional parameterizations are used.
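To make the cost argument concrete, the figures quoted above can be turned into a back-of-envelope count of horizontal grid columns. This is a rough sketch only: it uses the illustrative numbers from the text (10^5 local models, 25 km domains, 50 m resolution) and ignores the number of vertical levels and the time step, neither of which is specified here.

```python
# Back-of-envelope count of horizontal grid columns for global
# superparameterization, using the illustrative figures from the text.
n_local_models = 10**5   # roughly one local model per global column
domain_km = 25.0         # horizontal extent of each local domain
dx_m = 50.0              # horizontal grid spacing of the local models

points_per_side = int(domain_km * 1000 / dx_m)      # grid points along one side
columns_per_model = points_per_side ** 2            # columns in one local model
total_columns = n_local_models * columns_per_model  # columns over the whole globe

print(points_per_side, columns_per_model, total_columns)
```

Each local model thus has 500 x 500 = 250,000 columns, giving 2.5 x 10^10 columns globally before vertical levels are counted, which is why restricting the local models to a selected region is attractive.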
Comparing results from simulations with and without superparameterization, clear differences were observed in the height and vertical extent of the cloud layers. Using superparameterization resulted in higher clouds, in good agreement with ground-based LIDAR observations. Figure 1 (reproduced from [1]) shows a snapshot of the modelled cloud fields over the Netherlands next to a satellite image.
Figure 1: Superparameterized weather simulation over the Netherlands, compared to a satellite image from Terra/MODIS. Each blue tile represents one local, cloud-resolving model, connected to a global model (shown as the purple background). From ref. [1].
In another research line at CWI, we are developing methods to train a data-based parameterization scheme using data from high-resolution models, such as the local models used in superparameterization, or from observations. By inferring or training parameterizations from such data, one can circumvent ad-hoc physical assumptions in formulating parameterizations while also avoiding the need to run high-resolution models for the entire duration of climate simulations (although a certain amount of computational effort is clearly needed to generate the training data, unless these can be obtained from observations).
Our focus is on data-based methods for stochastic parameterization. The feedback from unresolved scales is intrinsically uncertain (e.g., because of chaotic dynamics) and this uncertainty can be represented with stochastic methods for parameterization [2,3]. In [2], we explored several methods to parameterize unresolved scales with stochastic models trained from data. The methods were tested on a multiscale test model (the Kac-Zwanzig heat bath model) that has its origins outside the climate domain yet forms a suitable test bed. One approach, making use of data resampling (or bootstrapping) for parameterization, was shown to be particularly effective.
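The idea of resampling for parameterization can be sketched in a few lines. The following is a simplified illustration under stated assumptions, not the scheme of the cited papers: the class name `ResamplingParameterization`, the conditioning on a single resolved variable, and the equal-width binning are all hypothetical choices made for brevity. Training pairs of a resolved variable `x` and the observed subgrid term `r` are grouped by bins of `x`; at run time, the subgrid term is drawn at random from the stored values in the bin matching the current resolved state.

```python
import numpy as np

class ResamplingParameterization:
    """Toy stochastic parameterization by conditional resampling (illustrative)."""

    def __init__(self, x_train, r_train, n_bins=10, seed=0):
        x_train = np.asarray(x_train, dtype=float)
        r_train = np.asarray(r_train, dtype=float)
        # Equal-width bins spanning the range of the resolved training data.
        self.edges = np.linspace(x_train.min(), x_train.max(), n_bins + 1)
        idx = self._bin(x_train)
        pools = [r_train[idx == b] for b in range(n_bins)]
        # Empty bins borrow the pool of the nearest non-empty bin.
        filled = [b for b in range(n_bins) if pools[b].size > 0]
        self.pools = [pools[min(filled, key=lambda k: abs(k - b))]
                      for b in range(n_bins)]
        self.rng = np.random.default_rng(seed)

    def _bin(self, x):
        # Using only the interior edges maps every x to 0..n_bins-1;
        # values outside the training range fall into the end bins.
        return np.digitize(x, self.edges[1:-1])

    def sample(self, x):
        """Draw one subgrid term conditional on the resolved state x."""
        return self.rng.choice(self.pools[int(self._bin(x))])

# Usage with synthetic training data (purely for illustration).
x_train = np.linspace(-1.0, 1.0, 1000)
r_train = np.sign(x_train)
param = ResamplingParameterization(x_train, r_train, n_bins=10, seed=1)
r_now = param.sample(0.9)  # drawn from the stored values with x in the top bin
```

Because the returned values are drawn from the data rather than from a fitted distribution, the sampled subgrid terms always stay within the range seen in training, and repeated calls at the same resolved state give a spread that reflects the uncertainty in the data.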
Building on the results from [2], we used the resampling approach to parameterize subgrid scales in a simple ocean model [3], with positive results. The data-driven parameterization approach can be viewed as a methodology for surrogate modelling, as the parameterization is meant to replace (i.e., serve as a surrogate of) the expensive high-resolution model that generated the data. Stochastic (as opposed to deterministic) methods for surrogate modelling have not been explored much to date; our resampling approach is such a stochastic surrogate modelling method. We are currently making this approach part of the software toolkit VECMAtk, under development in the EU H2020 project VECMA (Verified Exascale Computing for Multiscale Applications). Furthermore, we are strengthening our methods by using machine learning methods to build stochastic surrogates with wider capacity.
This research was supported by the Netherlands eScience Center, by the Dutch Research Council (NWO) through the Vidi project “Stochastic models for unresolved scales in geophysical flows” and by the European Union Horizon 2020 Research and Innovation Programme under grant agreement #800925 (VECMA project). Furthermore, we acknowledge the use of ECMWF’s computing and archive facilities.
[1] F. Jansson et al.: “Regional superparameterization in a global circulation model using Large Eddy Simulations”, JAMES, Vol. 11 (2019), 2958-2979.
[2] N. Verheul, D. Crommelin: “Data-driven stochastic representations of unresolved features in multiscale models”, Comm. Math. Sci., Vol. 14 (2016), 1213-1236.
[3] W. Edeling, D. Crommelin: “Towards data-driven dynamics surrogate models for ocean flow”, in: Proc. of PASC 2019, ACM.
CWI, The Netherlands