CAO Kai-xin, TANG Meng-meng, GE Jian-hong, LI Ze-kang, WANG Xiao-yun, LI Guoxing, WEI Xue-tao. Comparison of methods to interpolate missing PM2.5 values: Based on air surveillance data of Beijing[J]. Journal of Environmental and Occupational Medicine, 2020, 37(4): 299-305. DOI: 10.13213/j.cnki.jeom.2020.19740

Comparison of methods to interpolate missing PM2.5 values: Based on air surveillance data of Beijing

  • Background Air pollutant data from ground monitoring sites are increasingly used for individual exposure assessment in environmental epidemiology. In research based on historical monitoring data, missing values cannot be remeasured, so the prediction errors introduced by different interpolation methods affect the final interpretation.
    Objective This study compares the accuracy and precision of six interpolation methods and provides insights into the measurement bias arising from exposure assessment in PM2.5-associated studies.
    Methods Based on the PM2.5 data observed at 35 monitoring sites in Beijing, the results of six interpolation methods (time-average, the nearest monitoring site, multiple linear regression, multivariate imputation, inverse distance weighted, and Kriging interpolation) were compared at three typical monitoring sites (Dongsi, Miyun, and Fangshan) using four statistics: median absolute error (MAE), median relative error, mean squared error, and root mean squared error (RMSE).
    Results Among the six interpolation methods, the optimal method at the "Dongsi" monitoring site was multiple linear regression, followed by inverse distance weighted, and the worst was time-average; the RMSEs of these three methods were 6.67, 8.19, and 52.19, respectively; the MAEs were smaller than 4, except for time-average (19.00). At the "Miyun" monitoring site, the optimal method was multiple interpolation, followed by Kriging, and the worst was time-average; the RMSEs of these three methods were 8.34, 11.76, and 42.53, respectively; the MAEs were smaller than 5, except for time-average (16.00). At the "Fangshan" monitoring site, the optimal method was Kriging, followed by multiple interpolation, and the worst was time-average; the RMSEs of these three methods were 18.74, 22.73, and 50.93, respectively; the MAEs were smaller than 10, except for time-average (27.50). Taking the three monitoring sites together, the optimal method was Kriging, followed by multiple interpolation, and the worst was time-average; the RMSEs were 13.65, 14.77, and 48.74, respectively; the MAEs were smaller than 5, except for time-average (19.00).
    Conclusion Among the six interpolation methods, Kriging and multiple interpolation perform best, while time-average performs worst. Kriging interpolation shows more stable performance than inverse distance weighted interpolation. Except for time-average, the average prediction error of each method is within 5. Factors such as surveillance density and topography may influence interpolation efficiency.
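
To make the comparison procedure concrete, the following is a minimal sketch of one of the six methods (inverse distance weighted interpolation) together with the four error statistics named in the Methods. It is not the authors' implementation: the function names, the planar site coordinates in kilometres, the distance exponent of 2, and the definition of relative error as absolute error divided by the observed value are all assumptions made for illustration.

```python
import math
from statistics import median


def idw_estimate(target_xy, neighbours, power=2.0):
    """Estimate a missing value at target_xy from neighbouring sites.

    target_xy  -- (x, y) position of the site with the missing value,
                  in hypothetical planar kilometres
    neighbours -- list of (x, y, value) tuples for sites with observations
    power      -- distance exponent; 2 is a common default (assumption)
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in neighbours:
        distance = math.hypot(x - target_xy[0], y - target_xy[1])
        if distance == 0.0:
            # A co-located site: use its observation directly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def error_statistics(observed, predicted):
    """Return the four comparison statistics for paired series.

    Relative error is taken here as |predicted - observed| / observed
    (an assumed definition); pairs with observed == 0 are skipped for it.
    """
    abs_errors = [abs(p - o) for o, p in zip(observed, predicted)]
    rel_errors = [abs(p - o) / o for o, p in zip(observed, predicted) if o != 0]
    mse = sum(e * e for e in abs_errors) / len(abs_errors)
    return {
        "median_absolute_error": median(abs_errors),
        "median_relative_error": median(rel_errors),
        "mean_squared_error": mse,
        "root_mean_squared_error": math.sqrt(mse),
    }


if __name__ == "__main__":
    # Hypothetical neighbours of a target site at the origin.
    sites = [(1.0, 0.0, 10.0), (-1.0, 0.0, 20.0), (0.0, 4.0, 50.0)]
    print(idw_estimate((0.0, 0.0), sites))

    # Hypothetical observed values that were masked, and their estimates.
    print(error_statistics([10.0, 20.0, 30.0], [12.0, 18.0, 33.0]))
```

In a validation of this kind, observed values at a test site would be masked, each method would fill them in, and the statistics above would be computed on the masked positions; the median-based statistics are less sensitive to a few extreme pollution episodes than the RMSE.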