A new data infrastructure to collect high-quality methane sensor data from commercial farms in (near) real time
This showcase links to a large field study to develop methane emission mitigation strategies for the Dutch dairy industry.
Problem statement / motivation
To develop methane mitigation strategies, several sensors measuring methane are installed on multiple dairy farms. One of these sensors is deployed to monitor methane at cow level. The usual procedure to collect these methane data is to connect the methane sensor to a data logger. Data are collected from this logger once a week, through a Wi-Fi connection.
This is, however, not an optimal method of data collection: it hinders large-scale implementation, and thus the monitoring of methane emissions on many dairy farms. Moreover, since data are collected weekly and checked for quality even less frequently, there is a greater risk that technical failures are detected (too) late, resulting in unnecessary and unwanted data loss. Solving these two issues will enable large-scale monitoring of methane emissions while keeping data loss to a minimum.
Solution / approach
To enable large-scale collection of methane data from commercial dairy farms, we set up a flexible data infrastructure. The infrastructure is flexible because it is easy to add more farms, and also more sensors if desired. At five farms, the methane sensor was connected to an Arduino, which was programmed to send small packages of data every three minutes to the Microsoft Azure cloud through an Internet of Things (IoT) connection.
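To illustrate what such a small package might contain, the sketch below bundles a few raw sensor voltages with a device identifier and a timestamp into one JSON message. It is written in Python for readability only, not as Arduino firmware, and the function name build_package and the field names device_id, sent_at and voltages are hypothetical; the actual message format used in the study is not described here.

```python
import json
from datetime import datetime, timezone


def build_package(device_id, voltages, sent_at=None):
    """Bundle raw sensor voltages into one small JSON message.

    All field names are assumptions made for this illustration.
    """
    # Fall back to the current UTC time when no timestamp is given.
    sent_at = sent_at or datetime.now(timezone.utc)
    return json.dumps(
        {
            "device_id": device_id,           # identifies the Arduino, and thereby the farm
            "sent_at": sent_at.isoformat(),   # time at which the package was sent
            "voltages": list(voltages),       # raw readings since the previous package
        }
    )


if __name__ == "__main__":
    print(build_package("arduino-01", [0.52, 0.55, 0.61]))
```

Including the device identifier in every package is what would allow the receiving side to tell the farms apart without any further configuration on the device.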
As soon as data arrived at Microsoft Azure, Azure Stream Analytics was programmed to unpack the data, identify the Arduino (and thereby the farm), and translate the sensor data (measured in volts) into methane concentrations. Furthermore, Stream Analytics was programmed to perform a data quality check as soon as the data arrived in Azure: for example, if packages from the Arduino were empty, or if data were outside predefined ranges, an e-mail was sent to the researcher reporting the abnormalities. Data were also visualised in (near) real time with Power BI.
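The processing steps described above (unpack, identify the farm, convert voltage to concentration, check quality) can be sketched as follows. This is a Python illustration of the logic only; the study implemented these steps in Stream Analytics, not in Python. The device-to-farm table, the linear calibration constants, the valid range and the message field names are all assumptions chosen for the example, and the e-mail notification is reduced to a returned list of anomaly messages.

```python
import json

# Hypothetical lookup table and calibration values, for illustration only.
DEVICE_TO_FARM = {"arduino-01": "farm-A"}
PPM_PER_VOLT = 400.0             # assumed linear calibration slope
PPM_OFFSET = 0.0                 # assumed linear calibration intercept
VALID_PPM_RANGE = (0.0, 2000.0)  # assumed plausible concentration range


def process_package(raw):
    """Unpack one package, convert voltages to ppm and flag anomalies."""
    message = json.loads(raw)
    farm = DEVICE_TO_FARM.get(message.get("device_id"), "unknown")
    voltages = message.get("voltages") or []

    anomalies = []
    if not voltages:
        anomalies.append("empty package")

    # Translate each raw voltage into a methane concentration.
    concentrations = [PPM_OFFSET + PPM_PER_VOLT * v for v in voltages]

    # Flag every value that falls outside the predefined range.
    low, high = VALID_PPM_RANGE
    for ppm in concentrations:
        if not low <= ppm <= high:
            anomalies.append(f"value out of range: {ppm:.1f} ppm")

    return {"farm": farm, "methane_ppm": concentrations, "anomalies": anomalies}


if __name__ == "__main__":
    result = process_package('{"device_id": "arduino-01", "voltages": [1.0, 10.0]}')
    print(result["farm"], result["methane_ppm"], result["anomalies"])
```

In a set-up like the one described, a non-empty anomalies list would be the trigger for the e-mail to the responsible researcher.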
Using Arduinos to collect and send data proved to be extremely easy. Anomalies in the data were detected instantly, and e-mails were generated and sent to the responsible researcher, so that technical errors were resolved quickly.
(Expected) impact of the approach
With this flexible infrastructure we are able to collect methane data in (near) real time, independent of Wi-Fi. This makes it possible to collect data from virtually all dairy farms in the Netherlands (and beyond). Moreover, collecting data in (near) real time opens up new options for data analysis. The data quality check and the visualisation of data allow for quick detection of data anomalies. The set-up of the infrastructure was presented at the scientific symposium FAIR data for Green Life Sciences (Wageningen, the Netherlands) in December 2018.
The next step for this study is to make use of the flexibility of the infrastructure. This includes collecting methane data at more farms, and also collecting data from methane sensors other than the single one used so far. It would be of great interest to extend the infrastructure to allow for direct collection of cow ID. Next steps will also involve visualising (near) real-time methane data for different groups of end users (e.g., policy makers, farmers, or advisors).