Improving the efficiency of Monte Carlo uncertainty analysis in R

Organised by Laboratory of Geo-information Science and Remote Sensing

Fri 15 April 2016, 10:00 to 10:30

Location Gaia, building number 101
Droevendaalsesteeg 3
6708 PB Wageningen
+31 317 48 16 00
Room 1

By Stefan van Dam (the Netherlands)


Making decisions is part of everyday life, and models can support them. Uncertainty in model inputs, however, propagates through the model and causes uncertainty in the output, so it is important to quantify this uncertainty. One way to assess how uncertainty propagates is the Monte Carlo method: a model is run repeatedly, each time with a different set of inputs and parameters randomly sampled from their probability distributions. This method is very time-consuming, however, and increasingly so for more complex models. The objective of this research was to investigate how the Monte Carlo method can be used as efficiently as possible in conjunction with the open-source R programming language. Two ways of improving efficiency were explored: 1) using efficient sampling methods, namely Stratified Random Sampling and Latin Hypercube Sampling, and 2) using parallel computing. An R package that runs Monte Carlo analyses of spatial uncertainty propagation is currently in development. As a contribution to this package, the two efficient sampling methods and parallel computing were implemented and made available as callable functions. Each method was assessed on a test case. The efficiency gain of the sampling methods was calculated both empirically and analytically. The efficiency gain of parallel computing was assessed as the decrease in CPU time when parallel computing is applied. For the test case used in this research, Stratified Random Sampling and Latin Hypercube Sampling needed as little as one third of the samples required by Simple Random Sampling to achieve the same accuracy in estimating the output uncertainty. With parallel computing applied to the same test case, the model runs were executed on average 5.4 times as fast as without it.
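To make the sampling comparison concrete, here is a minimal sketch of how Simple Random Sampling and Latin Hypercube Sampling can be compared empirically. It is an illustration only and not the implementation in the R package described above: it is written in Python rather than R, the function names are hypothetical, the inputs are assumed to be independent and uniform on [0, 1), and `toy_model` is an invented stand-in for a real model. The spread of the Monte Carlo estimate over repeated experiments serves as the accuracy measure.

```python
import random
import statistics


def simple_random_sample(n, dims, rng):
    """Draw n points with every coordinate uniform on [0, 1)."""
    return [[rng.random() for _ in range(dims)] for _ in range(n)]


def latin_hypercube_sample(n, dims, rng):
    """Draw n points so that, per dimension, each of the n
    equal-probability strata contains exactly one point."""
    columns = []
    for _ in range(dims):
        # One uniform draw inside each stratum [i/n, (i+1)/n).
        column = [(i + rng.random()) / n for i in range(n)]
        # Shuffle so strata are paired at random across dimensions.
        rng.shuffle(column)
        columns.append(column)
    return [list(point) for point in zip(*columns)]


def toy_model(x):
    """Hypothetical model with two uncertain inputs (assumption)."""
    return x[0] + 2.0 * x[1] ** 2


def mc_mean(sampler, n, dims, rng):
    """Monte Carlo estimate of the mean model output from n runs."""
    outputs = [toy_model(x) for x in sampler(n, dims, rng)]
    return statistics.fmean(outputs)


def estimator_spread(sampler, n=50, dims=2, repeats=200, seed=1):
    """Standard deviation of the estimate over repeated experiments;
    a smaller value means fewer model runs are needed for the same accuracy."""
    rng = random.Random(seed)
    estimates = [mc_mean(sampler, n, dims, rng) for _ in range(repeats)]
    return statistics.stdev(estimates)


srs_spread = estimator_spread(simple_random_sample)
lhs_spread = estimator_spread(latin_hypercube_sample)
print("Simple Random Sampling spread:", srs_spread)
print("Latin Hypercube Sampling spread:", lhs_spread)
```

The toy model is additive in its inputs, which is the situation where Latin Hypercube Sampling helps most; the gain for a real model, such as the factor of three reported for the test case, depends on the model and has to be measured.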

Keywords: Monte Carlo Analysis, R Package, Stratified Random Sampling, Latin Hypercube Sampling, Parallel Computing