Colloquium

Monitoring soil perturbation near gas pipelines using machine learning

Organiser Laboratory of Geo-information Science and Remote Sensing
Date

Thu 14 December 2017, 14:00 to 14:30

Location Lumen, building number 100
Droevendaalsesteeg 3a
6708 PB Wageningen
+31 317 481 700
Room 1

By Gert Sterenborg (the Netherlands)

Summary
The Dutch company GASUNIE NV is responsible for the natural gas infrastructure in the Netherlands and Northern Germany. The company wants to know where soil perturbation takes place, so that it can localise the risk of damage to underground gas pipelines. Most high-pressure pipelines lie in rural areas, where the land is used for agriculture. Soil perturbation activities are activities that disturb the soil, often with machinery, and that may damage pipelines. This research covers only agricultural fields, because good validation data was available only for specific agricultural fields. The validation data records when and where soil perturbation activities such as ploughing took place. Sentinel-2 multispectral satellite imagery, which has a high temporal resolution, is used to identify soil perturbation. This source was chosen because the processing of satellite data can be automated, so that an automated soil perturbation monitoring system can be built on this research. The research considers only a single point per agricultural parcel, its centroid. To find a suitable centroid and to minimise the mixed-pixel effect, a script was developed that finds the point furthest away from any field boundary. Two classification trees are generated using the machine learning method Random Forest. The first classifies pixels as usable or not. Usable pixels are pixels in which soil perturbation can be visually identified, meaning that no clouds, snow or shadow cover the pixel. After training on 2116 pixels, this classification tree was correct for 98.91% of 386 validation data points. The second classification tree classifies soil perturbation; it was built using 973 training data points and was correct for 98.15% of the 324 validation data points.
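
The summary does not describe how the script finds the point furthest from any field boundary, so the following is only a minimal sketch of one possible approach, not the author's implementation. It assumes the parcel is a simple polygon given as a list of (x, y) vertices in a projected, metric coordinate system, and it uses a plain grid search over the bounding box. All function names and the `steps` parameter are hypothetical.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Distance from point P to the line segment AB."""
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project P onto the line through A and B, then clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def inside(px, py, polygon):
    """Ray-casting point-in-polygon test for a simple polygon."""
    result = False
    n = len(polygon)
    for i in range(n):
        ax, ay = polygon[i]
        bx, by = polygon[(i + 1) % n]
        # Count edges that a horizontal ray from P towards +x crosses.
        if (ay > py) != (by > py):
            x_cross = ax + (py - ay) * (bx - ax) / (by - ay)
            if px < x_cross:
                result = not result
    return result


def boundary_distance(px, py, polygon):
    """Smallest distance from P to any edge of the polygon."""
    n = len(polygon)
    return min(
        point_segment_distance(px, py, *polygon[i], *polygon[(i + 1) % n])
        for i in range(n)
    )


def furthest_interior_point(polygon, steps=50):
    """Grid-search the bounding box for the interior point that lies
    furthest from the boundary. Accuracy is limited by the grid spacing."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    min_x, max_x, min_y, max_y = min(xs), max(xs), min(ys), max(ys)
    best_point, best_distance = None, -1.0
    for i in range(steps + 1):
        x = min_x + (max_x - min_x) * i / steps
        for j in range(steps + 1):
            y = min_y + (max_y - min_y) * j / steps
            if inside(x, y, polygon):
                d = boundary_distance(x, y, polygon)
                if d > best_distance:
                    best_point, best_distance = (x, y), d
    return best_point, best_distance


# Hypothetical L-shaped parcel, coordinates in metres.
l_shaped_field = [(0, 0), (10, 0), (10, 4), (4, 4), (4, 10), (0, 10)]
point, distance = furthest_interior_point(l_shaped_field)
print(point, distance)
```

For irregular parcels such a point can differ noticeably from the geometric centroid, which may fall close to an edge or even outside the parcel. A finer grid or an iterative refinement around the best cell would tighten the result at the cost of more distance evaluations.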