Detecting spatial-temporal events based on Twitter data using Named Entity Recognition

Organiser Laboratory of Geo-information Science and Remote Sensing

Wed 9 April 2014 13:30 to 14:00

Location Gaia, building number 101
Droevendaalsesteeg 3
6708 PB Wageningen
+31 317 48 16 00
Room 1

by Yue Liu (China)


As social network services become part of daily life, they provide a huge collection of volunteered data that can contribute to many kinds of research and inspire scientists to think about acquiring data from a new perspective. In this thesis, an experiment was carried out to apply social network data in the field of geo-information science.

This thesis aimed to detect Spatial-Temporal Events from people’s tweeting behaviour. A raw dataset containing the text and time of about fifteen million tweets was collected from the Twitter stream as the data source for this research. Unlike traditional geo-targeting methods, the spatial component of the tweets was obtained through Named Entity Recognition and Geo-coding. The Stanford Named Entity Recognizer model was used to extract location names from the text of the tweets, which proved to provide enough geographic information for detecting Spatial-Temporal Events. Geo-coding was then applied to organise the location names into a hierarchy of administrative levels. This thesis points out possible inaccuracies produced by the Stanford Named Entity Recognizer model, but they were not validated because of limited time and human resources.
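
The two steps described above (pulling location names out of tweet text, then arranging them by administrative level) can be illustrated with a minimal sketch. It does not use the Stanford Named Entity Recognizer or a real geocoding service: a small hand-made lookup table, `GAZETTEER`, stands in for both, and the function names, sample tweets, and hierarchy entries are all hypothetical.

```python
from collections import Counter

# Hypothetical toy gazetteer: place name -> (country, province, city).
# Assumption: this stands in for both the NER model and the geocoder.
GAZETTEER = {
    "wageningen": ("Netherlands", "Gelderland", "Wageningen"),
    "arnhem": ("Netherlands", "Gelderland", "Arnhem"),
    "utrecht": ("Netherlands", "Utrecht", "Utrecht"),
}


def extract_locations(text):
    """Return the hierarchy of every gazetteer name found as a token."""
    tokens = [t.strip(".,!?#@:;\"'()").lower() for t in text.split()]
    return [GAZETTEER[t] for t in tokens if t in GAZETTEER]


def count_by_level(tweets, level):
    """Count mentions per administrative unit.

    level 0 = country, 1 = province, 2 = city.
    """
    counts = Counter()
    for text in tweets:
        for hierarchy in extract_locations(text):
            # Truncate the hierarchy so mentions aggregate at `level`.
            counts[hierarchy[: level + 1]] += 1
    return counts


# Made-up sample tweets, for illustration only.
tweets = [
    "Traffic jam near Wageningen again!",
    "Concert tonight in Arnhem, who is coming?",
    "Lovely weather in #Utrecht today",
    "Nothing to see here",
]
province_counts = count_by_level(tweets, level=1)
```

Counting at the province level merges the Wageningen and Arnhem mentions into one Gelderland total, which is the kind of aggregation a hierarchy of administrative levels makes possible. Simple token matching like this misses ambiguous or multi-word names, which is one reason a trained recognizer is used in practice.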

After location name extraction and geo-coding, the tweets were fed into a time-series decomposition and seasonal adjustment model based on local regression. Five patterns of interest were then detected by manual inspection assisted by statistical methods. Three Spatial-Temporal Events were identified from these five patterns. Online news reports were used to confirm that these three Spatial-Temporal Events did occur at the detected times and locations. As an experiment, this thesis successfully detected Spatial-Temporal Events from raw tweet data and pointed out directions for developing this topic in the future.
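
To make the decomposition idea concrete, the sketch below splits a series of daily mention counts into trend, seasonal, and remainder components and flags unusually large remainders. It is a simplified stand-in, not the model used in the thesis: it uses a classical moving-average decomposition rather than local regression, and the z-score rule only approximates the statistically assisted manual checking. The data, the weekly period, and the threshold are assumptions.

```python
from statistics import mean, stdev


def decompose(series, period):
    """Classical additive decomposition: series = trend + seasonal + remainder.

    The trend is a centred moving average, so it is undefined (None)
    for the first and last `period // 2` points.
    """
    n = len(series)
    half = period // 2
    trend = [None] * n
    for i in range(half, n - half):
        window = series[i - half : i + half + 1]
        if period % 2 == 0:
            # Even period: give the two end points half weight.
            total = sum(window) - 0.5 * (window[0] + window[-1])
        else:
            total = sum(window)
        trend[i] = total / period

    # Seasonal component: mean detrended value per phase, centred on zero.
    phase_means = []
    for p in range(period):
        values = [
            series[i] - trend[i]
            for i in range(n)
            if trend[i] is not None and i % period == p
        ]
        phase_means.append(mean(values))
    offset = mean(phase_means)
    seasonal = [phase_means[i % period] - offset for i in range(n)]

    remainder = [
        None if trend[i] is None else series[i] - trend[i] - seasonal[i]
        for i in range(n)
    ]
    return trend, seasonal, remainder


def flag_outliers(remainder, threshold=3.0):
    """Indices whose remainder is more than `threshold` std devs from the mean."""
    values = [r for r in remainder if r is not None]
    mu, sigma = mean(values), stdev(values)
    return [
        i
        for i, r in enumerate(remainder)
        if r is not None and abs(r - mu) > threshold * sigma
    ]


# Made-up daily counts for one location: five identical weeks plus one burst.
week = [10, 12, 11, 13, 12, 20, 18]
counts = week * 5
counts[17] += 40  # hypothetical event day

trend, seasonal, remainder = decompose(counts, period=7)
flagged = flag_outliers(remainder)
```

Days flagged this way are only candidate patterns. As in the thesis, each one would still need to be inspected and checked against an external source such as online news before being called an event.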

Keywords: Stanford Named Entity Recognizer; Detecting Spatial-Temporal Events; Time-series decomposition; Tweet data.