Dealing with clustered samples for assessing map accuracy by cross-validation

Bruin, Sytze De; Brus, Dick J.; Heuvelink, Gerard B.M.; Ebbenhorst Tengbergen, Tom van; Wadoux, Alexandre M.J.C.


Mapping of environmental variables often relies on map accuracy assessment through cross-validation with the data used for calibrating the underlying mapping model. When the data points are spatially clustered, conventional cross-validation leads to optimistically biased estimates of map accuracy. Several papers have promoted spatial cross-validation as a means to tackle this over-optimism. Many of these papers blame spatial autocorrelation as the cause of the bias and propagate the widespread misconception that spatial proximity of calibration points to validation points invalidates classical statistical validation of maps. We present and evaluate alternative cross-validation approaches for assessing map accuracy from clustered sample data. The first method uses inverse sampling-intensity weighting to correct for selection bias. Sampling-intensity is estimated by a two-dimensional kernel approach. The two other approaches are model-based methods rooted in geostatistics, where the first assumes homogeneity of residual variance over the study area whilst the second accounts for heteroscedasticity as a function of the sampling intensity. The methods were tested and compared against conventional k-fold cross-validation and blocked spatial cross-validation to estimate map accuracy metrics of above-ground biomass and soil organic carbon stock maps covering western Europe. Results acquired over 100 realizations of five sampling designs ranging from non-clustered to strongly clustered confirmed that inverse sampling-intensity weighting and the heteroscedastic model-based method had smaller bias than conventional and spatial cross-validation for all but the most strongly clustered design. For the strongly clustered design where large portions of the maps were predicted by extrapolation, blocked spatial cross-validation was closest to the reference map accuracy metrics, but still biased. For such cases, extrapolation is best avoided by additional sampling or limitation of the prediction area. Weighted cross-validation is recommended for moderately clustered samples, while conventional random cross-validation suits fairly regularly spread samples.