A global land cover validation dataset, I: Fundamental design principles

Olofsson, P.; Stehman, S.; Woodcock, C.; Sulla-Menashe, D.; Sibley, A.; Newell, J.; Friedl, M.A.; Herold, M.


A number of land-cover products, both global and regional, have been produced and more are forthcoming. Assessing their accuracy would be greatly facilitated by a global validation database of reference sites that allows comparative assessments of uncertainty across multiple land-cover data sets. We propose a stratified random sampling design for collecting reference data. Because the global validation database is intended to be applicable to a variety of land-cover products, the stratification should be implemented independently of any specific map so that the data remain generally useful. The stratification implemented is based on the Köppen climate/vegetation classification and population density. A map of the Köppen classification was manually edited and intersected with two layers of population density and a land-water mask. A total of 21 strata were defined and an initial global sample of 500 reference sites was selected, with each site being a 5 × 5 km block. The allocation of the sample to strata was informed by examining, for two global land-cover products, how the sample area of each land-cover class was distributed under different sample size allocations to the 21 strata. The initial global sample of 500 sites selected from the Köppen-based stratification indicates that these strata can be used effectively to distribute sample sites among the rarer land-cover classes of the two global maps examined, although the strata were not constructed using these maps. This is the first of two articles; the second presents details of how the sampling design can be readily augmented to increase the sample size in targeted strata, and thereby the sample sizes for rare classes of a particular map being evaluated.
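
As an illustration of the general idea of stratified random sampling with a chosen allocation, the following minimal Python sketch splits a total sample size among strata and then draws a simple random sample of blocks within each stratum. It is not the authors' implementation. The stratum names, the block counts, the two allocation rules shown ("equal" and "proportional"), and the function names `allocate` and `draw_sample` are all assumptions made for this example; the abstract does not state which allocation was finally adopted.

```python
import random


def allocate(stratum_sizes, n_total, method="proportional"):
    """Split n_total sample sites among strata.

    stratum_sizes maps a stratum name to its number of candidate blocks.
    method is "equal" (same weight per stratum) or "proportional"
    (weight proportional to the number of blocks). Both rules are
    illustrative assumptions, not the allocation used in the paper.
    """
    names = list(stratum_sizes)
    if method == "equal":
        weights = {s: 1.0 for s in names}
    elif method == "proportional":
        weights = {s: float(stratum_sizes[s]) for s in names}
    else:
        raise ValueError(f"unknown allocation method: {method}")

    total_weight = sum(weights.values())
    quotas = {s: n_total * weights[s] / total_weight for s in names}

    # Round down, then hand out the remaining sites to the strata with
    # the largest fractional parts (largest-remainder rounding).
    alloc = {s: int(quotas[s]) for s in names}
    leftover = n_total - sum(alloc.values())
    by_remainder = sorted(names, key=lambda s: quotas[s] - alloc[s], reverse=True)
    for s in by_remainder[:leftover]:
        alloc[s] += 1
    return alloc


def draw_sample(stratum_sizes, alloc, seed=0):
    """Draw a simple random sample of block indices within each stratum."""
    rng = random.Random(seed)
    return {
        s: sorted(rng.sample(range(stratum_sizes[s]), alloc[s]))
        for s in stratum_sizes
    }


if __name__ == "__main__":
    # Hypothetical strata and block counts, for illustration only.
    sizes = {
        "tropical_low_pop": 6000,
        "temperate_high_pop": 3000,
        "arid_low_pop": 1000,
    }
    for method in ("equal", "proportional"):
        print(method, allocate(sizes, 50, method))
```

Comparing the two allocations shows the trade-off the abstract alludes to: proportional allocation concentrates sites in the largest strata, whereas equal allocation places relatively more sites in small strata, where rarer land-cover classes may be found.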