Weather forecasting is a typical problem of coupling big data with physical-process models, according to Prof. ZHANG Pingwen, the corresponding author of a joint study by Peking University and the Institute of Atmospheric Physics of the Chinese Academy of Sciences.
Generally speaking, weather forecasting is a largely successful practice in the geosciences, and nowadays it is inseparable from numerical weather prediction (NWP). However, because NWP outputs and observations contain different systematic errors, a "weather consultation" remains an indispensable step for further improving forecast accuracy.
The weather forecasting process (Image by LI Haochen)
"In fact, the theory-driven physical model and data-driven machine learning are complementary tools. Combining these two approaches, an intelligent weather consultation system can be built to assist the current manual process of weather consultation," says Prof. ZHANG. "One of the challenges linked with this is to build appropriate feature engineering for both types of information to make full use of the data."
To solve these problems, the researchers proposed the "model output machine learning" (MOML) method for simulating weather consultation, and their study was published in Advances in Atmospheric Sciences.
MOML is a machine-learning-based post-processing method that matches NWP forecasts against observations through a regression function. The new approach was tested on grid forecasts of 2-m surface air temperature in the Beijing area.
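The idea of post-processing by regression can be illustrated with a minimal sketch. It is not the study's implementation: ordinary linear least squares stands in for whichever machine learning regressor the authors used, the data are synthetic, and the function names (fit_postprocessor, apply_postprocessor) are hypothetical. The sketch fits a map from raw forecast values to observed values on a training period, then applies it to later forecasts.

```python
import numpy as np


def fit_postprocessor(features, observations):
    """Fit a linear map (with intercept) from forecast features to observations."""
    X = np.column_stack([np.ones(len(features)), features])
    coeffs, *_ = np.linalg.lstsq(X, observations, rcond=None)
    return coeffs


def apply_postprocessor(coeffs, features):
    """Apply the fitted map to new forecast features."""
    X = np.column_stack([np.ones(len(features)), features])
    return X @ coeffs


def rmse(a, b):
    """Root-mean-square error between two series."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


# Synthetic data (an assumption for illustration only, not real measurements).
rng = np.random.default_rng(0)
n = 400
truth = 5.0 + 10.0 * np.sin(np.linspace(0.0, 8.0 * np.pi, n))
obs = truth + rng.normal(0.0, 0.3, n)              # "observed" 2-m temperature
raw = 1.2 * truth + 2.0 + rng.normal(0.0, 1.0, n)  # forecast with systematic error

# Fit on the first 300 samples, evaluate on the remaining 100.
train, test = slice(0, 300), slice(300, n)
coeffs = fit_postprocessor(raw[train].reshape(-1, 1), obs[train])
corrected = apply_postprocessor(coeffs, raw[test].reshape(-1, 1))

raw_rmse = rmse(raw[test], obs[test])
corrected_rmse = rmse(corrected, obs[test])
print(f"raw RMSE: {raw_rmse:.2f}, corrected RMSE: {corrected_rmse:.2f}")
```

In this toy setting the regression removes the forecast's built-in bias and scale error, so the corrected series tracks the observations more closely than the raw one. A real system would use many more features per grid point and a more flexible regressor.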
The MOML method, with different feature engineering schemes, was compared against the ECMWF model forecast and a modified model output statistics (MOS) method. MOML showed better numerical performance than both, especially in winter; accuracy increased by 27.91% over the ECMWF model and by 15.52% over MOS.
Weather consultation data are unique in that they mainly combine information from NWP model data and observational data. These two sources have different data structures and features, which makes feature engineering a complicated task. The quality of feature engineering directly affects the final result.
ZHANG’s group has proposed several feature engineering schemes following extensive numerical experiments. These schemes maintained computational efficiency and were employed in meteorological studies for the first time.
According to Prof. ZHANG, the MOML method allows observational data to enter the calculation directly, and uses both the high- and low-frequency information in the data to make the forecast results more accurate.
The MOML method proposed in this study could be applied to forecasting the weather during the upcoming 2022 Winter Olympics, hopefully providing more accurate, intelligent and efficient weather forecasting services for this international event.
Machine learning and deep learning offer diverse tools for weather forecasts in the era of big data, but there are still many challenges in practical applications.
"It is an important future research direction to incorporate weather forecast data and coupled models into a hybrid computing framework to explore and study the structure and features of observational and NWP data, and propose data-driven machine learning algorithms suitable for weather forecasting," said Prof. ZHANG.
This work is supported by the "Technology Winter Olympics" National Key Research and Development Program of China and the National Natural Science Foundation of China.