Accurate, high-resolution spatial soil information is essential for efficient and sustainable land use, management and conservation. Since the establishment of digital soil mapping (DSM) and the goals set by the GlobalSoilMap (GSM) working group, great advances have been made in providing spatial soil information worldwide. Densely populated areas such as the Netherlands demand multi-functional land use, for which information on key soil properties such as pH is essential for decision-making. We a) provide soil pH prediction maps at six standard depth layers between 0 m and 2 m for the Netherlands at 25 m resolution, whereby the calibrated Quantile Regression Forest (QRF) model allows for prediction at any desired depth, and b) determine map accuracy using several statistical validation strategies and an evaluation of prediction uncertainty. This study is unique among GSM products in that it includes design-based inference from a probability sample as an external accuracy assessment and provides Tier 4 maps with spatially explicit accuracy thresholds for end users based on GSM specifications.
QRF models were tuned and calibrated using 15 338 soil observations between 0 m and 2 m depth from 4230 locations and 195 covariates representing the soil-forming factors. Four statistical validation strategies were used for the external accuracy assessment of map quality: out-of-bag, location-grouped 10-fold cross-validation, an independent validation set (5677 observations, 1367 locations) and a stratified random sample of the independent validation set separated by depth layer. Mean error (ME), root mean squared error (RMSE), model efficiency coefficient (MEC) and prediction interval coverage probability (PICP) were calculated for all four strategies. In addition, the 90 % prediction intervals (PI90) were used to categorize each 25 m pixel into “none”, A, AA or AAA quality as an internal accuracy assessment.
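As an illustration of the four accuracy metrics named above, the following is a minimal sketch using their common definitions rather than the code used in this study. The function name `accuracy_metrics`, the argument names and the sign convention for ME (prediction minus observation) are assumptions made for this example. The arguments `lower` and `upper` stand for the bounds of a prediction interval, for instance the 0.05 and 0.95 quantiles of a PI90.

```python
import numpy as np


def accuracy_metrics(obs, pred, lower, upper):
    """Compute ME, RMSE, MEC and PICP for paired observations and predictions.

    obs, pred    : observed and predicted values (e.g. soil pH)
    lower, upper : bounds of the prediction interval for each prediction
    """
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)

    # Assumed sign convention: positive error means overprediction.
    err = pred - obs

    me = err.mean()                    # mean error (bias)
    rmse = np.sqrt((err ** 2).mean())  # root mean squared error

    # Model efficiency coefficient: 1 minus the ratio of the residual sum of
    # squares to the sum of squares of observations around their mean.
    mec = 1.0 - (err ** 2).sum() / ((obs - obs.mean()) ** 2).sum()

    # Prediction interval coverage probability: the fraction of observations
    # that fall inside their prediction interval.
    picp = ((obs >= lower) & (obs <= upper)).mean()

    return {"ME": float(me), "RMSE": float(rmse),
            "MEC": float(mec), "PICP": float(picp)}
```

For a well-calibrated PI90, the PICP is expected to be close to 0.90. A larger value suggests that the intervals are too wide and a smaller value that they are too narrow.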
Accuracy metrics differed substantially among the four external accuracy assessment strategies and among depth layers (ME = −0.08–0.20, RMSE = 0.41–0.83, MEC = 0.64–0.90, PICP of PI90 = 0.80–0.94). Based on sampling theory, design-based inference (LSK-SRS) was most indicative of map accuracy (ME = 0.09–0.17, RMSE = 0.70–0.79, MEC = 0.73–0.82). QRF prediction uncertainty was slightly overestimated. Less than 10 % of pixels were assigned AA or AAA quality; we therefore recommend that future studies also test the achievability of high quality thresholds for Tier 4 GSM maps. We believe these 3D soil pH maps at 25 m resolution are useful for a variety of end users and that our workflow can be applied elsewhere and to other soil properties to further close the gap in spatial soil information.