by Sándor Baran and Annette Möller (Faculty of Informatics, University of Debrecen and Institute of Mathematics, Technical University of Clausthal)

Statistical calibration of ensemble weather forecasts is a rapidly developing research area of statistics as well as atmospheric and water sciences. We are developing and implementing multivariate approaches explicitly accounting for dependencies between weather observation locations and/or between weather variables, including temperature, precipitation and pressure.

Capturing and modelling uncertainty is essential in any forecasting problem, and, in weather or hydrological predictions, can have enormous economic benefits. The early 1990s saw an important shift in the practice of weather forecasting, away from deterministic forecasts obtained using numerical weather prediction (NWP) models and towards probabilistic forecasting. The crucial step was the introduction of ensemble prediction systems (EPSs), which came into operational use in 1992, both at the European Centre for Medium-Range Weather Forecasts (ECMWF) and the U.S. National Meteorological Center.

An NWP model essentially consists of a large and complex set of partial differential equations describing the processes that take place in the planet’s atmosphere. An EPS provides not just a single forecast but a range of forecasts, which are usually generated by running the NWP model multiple times, each time from a different set of assumed initial values. These initial values are obtained by randomly perturbing an initial guess derived from the currently available information about the state of the atmosphere.
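The perturbation idea can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `toy_model` is a hypothetical stand-in (a chaotic logistic map) for a real NWP model, and the Gaussian perturbation with standard deviation `spread` is only one simple way to perturb the initial guess.

```python
import random


def toy_model(x0, steps=20, r=3.9):
    """Hypothetical stand-in for an NWP model: iterate a chaotic logistic map."""
    x = x0
    for _ in range(steps):
        x = r * x * (1.0 - x)
    return x


def run_ensemble(initial_guess, n_members=10, spread=1e-3, seed=0):
    """Run the toy model once per member, each from a perturbed initial value."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        # Randomly perturb the initial guess (assumed Gaussian perturbation).
        perturbed = initial_guess + rng.gauss(0.0, spread)
        # Keep the state inside the valid range [0, 1] of the toy model.
        perturbed = min(max(perturbed, 0.0), 1.0)
        members.append(toy_model(perturbed))
    return members


if __name__ == "__main__":
    forecasts = run_ensemble(0.4)
    print("ensemble forecasts:", [round(f, 3) for f in forecasts])
```

Because the toy model is chaotic, small differences in the initial values grow into clearly different forecasts, and the spread of the members gives a rough picture of the forecast uncertainty.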

In recent decades, the ensemble method has become widely used globally, its appeal lying in its ability to easily provide statistical summary measures that explicitly reflect the forecast uncertainty. However, the raw outputs of the EPS often exhibit systematic forecast errors (bias) or fail to capture the forecast uncertainty properly (lack of calibration), thus calling for some form of post-processing. Simple approaches to bias correction or calibration have a long history, and in the first years of the twenty-first century several more sophisticated methods appeared, including statistical models providing full predictive probability distributions of the weather variables at hand. This means that one is able not only to provide a point forecast of tomorrow’s temperature in Berlin, but also to state, for example, that tomorrow’s temperature in Berlin will lie between 20 and 25 °C with a probability of, say, 80%. Starting with the fundamental works of Tilmann Gneiting and Adrian Raftery that introduced Bayesian model averaging and ensemble model output statistics for ensemble calibration [1], statistical post-processing of ensemble forecasts became a hot topic both in statistics and atmospheric sciences, resulting in a multitude of probabilistic models for different weather quantities, new methods and algorithms for training these models on real weather data and novel approaches to forecast verification [2].
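As a minimal sketch of what a full predictive distribution makes possible, the code below assumes a Gaussian predictive distribution in the spirit of ensemble model output statistics. The mean is an affine function of the ensemble mean and the variance an affine function of the ensemble variance. The coefficients `a`, `b`, `c`, `d` and the ensemble values are hypothetical; in practice the coefficients would be estimated from training data.

```python
import math
from statistics import mean, pvariance


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def gaussian_predictive(ensemble, a, b, c, d):
    """Return (mu, sigma) of a Gaussian predictive distribution.

    Assumed simplified form: mu = a + b * ensemble mean,
    sigma^2 = c + d * ensemble variance.
    """
    mu = a + b * mean(ensemble)
    variance = c + d * pvariance(ensemble)
    return mu, math.sqrt(variance)


def interval_probability(lower, upper, mu, sigma):
    """Probability that the weather quantity falls in [lower, upper]."""
    return normal_cdf(upper, mu, sigma) - normal_cdf(lower, mu, sigma)


if __name__ == "__main__":
    # Hypothetical temperature ensemble (degrees Celsius) and coefficients.
    ensemble = [21.0, 22.0, 23.0, 24.0]
    mu, sigma = gaussian_predictive(ensemble, a=0.0, b=1.0, c=1.0, d=1.0)
    p = interval_probability(20.0, 25.0, mu, sigma)
    print(f"P(20 <= T <= 25) = {p:.3f}")
```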

The Hungarian-German research project “Statistical post-processing of ensemble forecasts for various weather quantities”, jointly financed by the Hungarian National Research, Development and Innovation Office and the Deutsche Forschungsgemeinschaft, aims to develop and test multivariate post-processing methods that model correlations between different weather variables, and/or incorporate correlations in space, e.g., between observation stations. Further goals are the development of user-friendly software packages for statistical calibration of ensemble weather forecasts, the investigation of approaches that take advantage of local features in the neighbourhood of an observation station, and, finally, the creation of an efficient scientific network.

Most members of the small research group have connections with Tilmann Gneiting’s research group at the Heidelberg Institute for Theoretical Studies. The members have a long history of collaboration and include mathematicians and statisticians from Heidelberg University, Karlsruhe Institute of Technology, Technical University of Clausthal, University of Debrecen, University of Hildesheim, and a meteorologist from MeteoSwiss Agency.

Figure 1: Mean continuous ranked probability score (CRPS) values of the various post-processing approaches and the raw forecasts for different lead times. The CRPS measures the goodness of fit of the predictive distribution to the corresponding observation, the smaller the better.
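For readers unfamiliar with the CRPS, the following sketch shows one common sample-based way of computing it when the predictive distribution is represented by a finite ensemble. This estimator is an assumption for illustration; the article does not state how the scores in Figure 1 were computed, and the function name `crps_ensemble` is hypothetical.

```python
def crps_ensemble(ensemble, observation):
    """Sample-based CRPS of an ensemble forecast for one observation.

    Uses CRPS = E|X - y| - 0.5 * E|X - X'|, with both expectations
    replaced by averages over the ensemble members.
    """
    m = len(ensemble)
    # Average absolute error of the members against the observation.
    error_term = sum(abs(x - observation) for x in ensemble) / m
    # Average absolute difference between all pairs of members.
    spread_term = sum(abs(xi - xj) for xi in ensemble for xj in ensemble) / (m * m)
    return error_term - 0.5 * spread_term


if __name__ == "__main__":
    print(crps_ensemble([1.0, 2.0, 3.0], 2.0))
```

For a single-member ensemble the spread term vanishes and the score reduces to the absolute error, which is why the CRPS allows probabilistic and deterministic forecasts to be compared on the same scale.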

Since its inception in April 2018, the project has pursued several lines of research, including:

  • An approach to calibrate hydrological ensemble forecasts, such as water levels of rivers. The challenge when dealing with this type of data is that water level measurements exhibit natural bounds from below and above and are non-Gaussian, so appropriate data transformation schemes are required. The proposed model was applied to predict water levels for the river Rhine at Kaub gauge with great success.
  • An approach that accounts for the interdependency between weather variables over time. The temperature today is not independent of the temperature yesterday or other recently observed temperatures. Thus, the forecast errors on different days are not independent of one another. This time dependence was utilised to improve the ensemble forecasts, which worked particularly well for longer lead times when considering temperature forecasts.
  • An investigation of the applicability of machine learning approaches to statistical post-processing of ensemble forecasts of total cloud cover (TCC), a variable describing what fraction of the sky is covered by clouds, measured on a nine-point scale: 0, 1/8, 2/8, 3/8, 4/8, 5/8, 6/8, 7/8, 1. Using the ECMWF global TCC ensemble forecasts for the period 2002–2014 and considering different lead times, the predictive performance of multilayer perceptron (MLP) neural networks, gradient boosting machines (GBM) and random forest (RF) algorithms was compared with the forecast skill of the state-of-the-art multiclass and proportional odds logistic regression (MLR and POLR) approaches and the raw ensemble (see Figure 1).
  • A detailed simulation study, conducted to obtain more specific insight, comparing the properties and performance of different multivariate post-processing methods (ensemble copula coupling, dual ensemble copula coupling, Schaake shuffle and the Gaussian copula approach; see [2]) in different situations (e.g., for different weather variables). The main message gained from the simulation study was that a misspecification in the multivariate dependence structure in the post-processing model leads to a notable deterioration in forecast performance.
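To give a flavour of the multivariate methods mentioned above, here is a minimal sketch of the reordering step at the heart of ensemble copula coupling. It assumes that each margin (a station or weather variable) has already been calibrated separately and that one calibrated sample per ensemble member is available. The function names and the data layout (one list per margin) are hypothetical choices for this illustration, and ties in the raw ensemble are broken by member order.

```python
def ranks(values):
    """Return the rank (0 = smallest) of each entry; ties broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order):
        result[index] = rank
    return result


def ecc_reorder(raw_by_margin, calibrated_by_margin):
    """Reorder calibrated samples to follow the rank structure of the raw ensemble.

    raw_by_margin[k][i]        : raw forecast of member i for margin k
    calibrated_by_margin[k][i] : calibrated samples for margin k (any order)
    """
    reordered = []
    for raw, calibrated in zip(raw_by_margin, calibrated_by_margin):
        sorted_samples = sorted(calibrated)
        # Member i receives the calibrated sample with the same rank
        # that member i had in the raw ensemble for this margin.
        reordered.append([sorted_samples[r] for r in ranks(raw)])
    return reordered


if __name__ == "__main__":
    raw = [[3.0, 1.0, 2.0], [10.0, 30.0, 20.0]]
    calibrated = [[0.5, 0.1, 0.9], [7.0, 5.0, 6.0]]
    print(ecc_reorder(raw, calibrated))
```

The calibrated values in each margin are unchanged; only their assignment to members changes, so the dependence pattern of the raw ensemble across margins is carried over to the post-processed forecast.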

Finally, members of our group have also extended existing post-processing software packages [L1] with the algorithms needed to deal with different types of weather variables.


[1] T. Gneiting, A. E. Raftery: “Weather forecasting with ensemble methods”, Science 310 (2005), 248–249.
[2] S. Vannitsem, D. S. Wilks, J. W. Messner (eds.): “Statistical Postprocessing of Ensemble Forecasts”, Elsevier, Amsterdam, 2018.

Please contact:
Sándor Baran
Faculty of Informatics, University of Debrecen, Hungary
