Publications

Mapping of urban landuse and landcover with multiple sensors : Joining close and remote sensing with deep learning

Srivastava, Shivangi

Summary

According to the Food and Agriculture Organization of the United Nations, “landuse is characterized by the arrangements, activities, and inputs by people to produce, change or maintain a certain land cover type”1. Knowledge about landuse is important to effectively plan and monitor resources, infrastructure, and services in a city. This thesis is about visualizing such information in the shape of a landuse map, which can serve local governments and decision makers to plan better cities. Traditionally a field based on visual survey, landuse mapping has nowadays embraced digital technology, and in particular remote sensing imagery. However, it is difficult to provide a fine-grained map, at the level of individual buildings, using remote sensing only.

In this thesis, I study the feasibility of using ground-based pictures to provide high-resolution landuse maps. With large-scale repositories of terrestrial pictures of urban settings becoming available, landuse maps at finer granularity are increasingly feasible. These pictures capture the frontal and side views of urban objects and can therefore provide richer visual cues about each object. Moreover, many platforms with user-uploaded content exist nowadays, such as Pixabay, Flickr, Geograph, Google Street View, and Mapillary.

But to make sense of all these images, powerful methodologies are needed. In this thesis, I explore the use of new deep learning methodologies for the task of landuse mapping from multiple points of view (ground and aerial). Annotations required to train these models have been sourced from online public GIS vector databases at global scale, such as OpenStreetMap (OSM2), or at country scale, such as the Dutch Kadaster. To cope with situations where such data are missing, feature extraction and semantic segmentation strategies are explored.
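To illustrate how training labels might be derived from such vector databases, the sketch below maps OSM-style tags to coarse landuse classes. The tag keys follow the real OSM schema ("building", "amenity", "shop"), but the class taxonomy and the mapping itself are purely illustrative assumptions, not the one used in the thesis.

```python
# Hypothetical mapping from (OSM tag key, value) pairs to coarse landuse
# classes; a real study would define its own taxonomy and tag coverage.
TAG_TO_CLASS = {
    ("building", "residential"): "residential",
    ("building", "apartments"): "residential",
    ("building", "industrial"): "industrial",
    ("amenity", "school"): "public_services",
    ("amenity", "hospital"): "public_services",
    ("shop", "supermarket"): "commercial",
}

def landuse_label(tags: dict) -> str:
    """Return a coarse landuse class for an object's OSM tags, or 'unknown'."""
    for key, value in tags.items():
        cls = TAG_TO_CLASS.get((key, value))
        if cls is not None:
            return cls
    return "unknown"
```

For example, `landuse_label({"building": "apartments", "name": "Rivierstaete"})` would yield `"residential"`, turning a freely available vector annotation into a training label for the corresponding street-level pictures.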

The thesis is organized around four technical chapters. The first (Chapter 2) presents a method that uses several ground viewpoints of an urban object, as defined in OSM, to train a model that characterizes landuse. The second (Chapter 3) explores whether top-view (aerial/satellite) imagery enhances the performance of the landuse classification model developed in Chapter 2. A multi-source (or multi-modal) CNN model was developed over the region of Ile-de-France. It was also shown that the trained model could be applied to another, structurally similar city (Nantes) without any further tuning. In the third part (Chapter 4), I explore the possibility of predicting multiple land usages per building, which leads to a more realistic map, where one urban object can be associated with several activities. This approach was trained and tested over the city of Amsterdam. In the fourth and final part (Chapter 5), I studied model updates to multiple tasks as a way to update land maps (e.g. with building footprints) where elements are missing: I approached this as a problem of “Catastrophic Forgetting”, a known issue that affects CNNs trained sequentially on various tasks. Chapter 5 therefore focuses on lifelong learning with a network-pruning-based approach and applies it to a challenging multi-city setting involving three different segmentation datasets from the DeepGlobe 2018 Challenge.
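The pruning idea behind such lifelong learning schemes can be sketched in a few lines of NumPy: after training on one task, only the largest-magnitude weights of a layer are kept (and frozen) for that task, freeing the remaining capacity for subsequent tasks. This is a simplified, generic magnitude-pruning illustration, not the exact procedure used in the thesis.

```python
import numpy as np

def prune_mask(weights: np.ndarray, keep_fraction: float) -> np.ndarray:
    """Boolean mask selecting the largest-magnitude weights for the current task.

    Masked weights are frozen to preserve the current task's performance;
    the weights outside the mask can be re-initialized and trained on the
    next task, mitigating catastrophic forgetting.
    """
    k = int(round(keep_fraction * weights.size))
    if k == 0:
        return np.zeros(weights.shape, dtype=bool)
    # Threshold at the k-th largest absolute value.
    threshold = np.partition(np.abs(weights).ravel(), -k)[-k]
    return np.abs(weights) >= threshold

# Example: keep the top 50% of a layer's weights for the first task,
# freeing the remainder for subsequent segmentation tasks.
w = np.array([[0.9, -0.1], [0.05, -0.7]])
mask = prune_mask(w, keep_fraction=0.5)
# mask retains 0.9 and -0.7; the two small weights are freed.
```

The design choice here is the key trade-off of pruning-based lifelong learning: each task gets a disjoint, frozen subset of the network's weights, so adding tasks costs capacity rather than accuracy on earlier tasks.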

In the end, this thesis successfully demonstrates the feasibility of automatic map generation using multiple data sources and deep learning models, thereby opening new research opportunities at the interface between remote sensing, GIScience, and computer vision.

1. www.fao.org/3/x3810e/x3810e04.htm

2. www.openstreetmap.org/