Reduce administrative pressure on farmers with AI

- dr. HCJ (Hans) Vrolijk
- Head Statutory Tasks Unit Economic Information
With farmers having to manage more and more data, the Centre for Economic Information (CEI) is investigating whether AI could support them. A digital assistant that can extract data from existing invoices could ease their administrative burden. The first pilot is promising.
Data collectors from the Centre for Economic Information (CEI) often simply sit at the farmer's kitchen table going through paper records. “Every farm is different,” says Hans Vrolijk, until recently the head of the centre, which is part of Wageningen Social & Economic Research. “Some businesses file everything in a ring binder. In others, we can work with data from their accounting firm.” The CEI collects data from 1,500 reference companies. Together, these are representative of the entire Dutch agriculture and horticulture sector.
That data is not just about their finances. The CEI also collects data on plant protection products, antibiotic use or how much manure is applied, for example. “The Dutch government and the EU use our reports to evaluate their agricultural policies, including sustainability. Environmental performance such as reducing greenhouse gas emissions or nutrient surplus is becoming increasingly important, as is the question of whether sustainable farming pays off.” This could well increase the administrative pressure for farmers.

In an initial pilot led by Vrolijk, the CEI therefore investigated whether artificial intelligence could offer a solution. Can an AI assistant learn to read invoices, collect data and label them correctly? The answer: increasingly so.
From raw data to indicators
Training an AI requires data. In fact, the CEI is always looking for suitable and reliable data streams. The centre converts these into relevant indicators for the state of the Dutch agricultural and horticultural sectors, the fishing industry and forestry, such as farm yields, manure production or antibiotic use. To calculate family income from the agricultural sector, the centre looks at revenues, expenses and labour hours, for example. An indication of the use of plant protection products can be derived from the invoices of the suppliers of those products. The CEI calculates and breaks down the data by farm type and size.
Farms are increasingly sharing data, and not just with the Dutch government. The EU requires sustainability reports, and quality labels also want to know how much impact businesses have on the environment. At the same time, a growing share of farm administration is becoming digital. Invoices are sent by email and, increasingly, stored centrally. “That's where we saw an opportunity,” says Vrolijk. “Couldn't a digital data collector unlock the information from those invoices? That would relieve farmers of the administrative burden, and for the CEI it would also mean a step forward in efficiency.”
Learning to read invoices
“Invoices don’t just contain financial information,” Vrolijk explains. “They record what enters and leaves the business: products sold, waste disposed of, medicines purchased. So that rich information flow already exists, and it is very useful. No one needs to do extra work to create these documents.” Invoices are therefore the preferred source to train an AI with. During the development, the researchers worked with e-invoices, PDFs or even scans of paper documents. “Those can all be made sufficiently readable.”

The model had to learn to perform two tasks: recognise the data on the invoice and then label it correctly. “Not everything on an invoice is relevant. Think of the address of the supplier or the name of the sender. That was the first thing the AI had to learn: to distinguish relevant information from the rest of the document. It then also needs to categorise that information. Is this financial data, is it related to crop protection, or is this about fertilisation?”
Interpreting information
It was that second step, correctly labelling the recognised information, that the researchers had to explicitly train the model for. “Finding the product lines on an invoice works very well, but interpreting that information can be trickier,” Vrolijk says. Take an invoice for the disposal of tomato plants, for example. “That is not a sale, but a waste stream. We paid extra attention to this in the training process, because the AI did not naturally make that distinction. After all, there are also tomatoes on the invoice.”
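The tomato example shows why the product name alone is not enough to label a line. The toy sketch below makes that point with hand-written rules. It is not the CEI's method: the pilot used a trained model, and the cue words, label names and the function `label_line` are all assumptions chosen for illustration.

```python
# Toy stand-in for the labelling step. The real pilot used a trained model;
# these keyword lists and label names are hypothetical.
DISPOSAL_CUES = ("disposal", "waste", "removal")
CROP_TERMS = ("tomato",)


def label_line(description: str) -> str:
    """Label an invoice line, letting disposal context override the crop name."""
    text = description.lower()
    # Check the context first: a crop name on a disposal invoice is waste, not a sale.
    if any(cue in text for cue in DISPOSAL_CUES):
        return "waste_stream"
    if any(term in text for term in CROP_TERMS):
        return "product_sale"
    return "other"


print(label_line("Disposal of tomato plants"))
print(label_line("Tomatoes class I, 400 kg"))
```

Both lines mention tomatoes, yet only the second is a sale. A model that looks at the product name without its context would put them in the same category.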
There were also practical challenges. Invoices are not standardised and every company displays relevant information differently. “Suppose a farmer has purchased 30 litres of plant protection product. The invoice might only show six units of five litres. In such a case, the model itself still needs to do the arithmetic, and possibly also convert the total to the standard unit of the central system.”
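The arithmetic in this example can be written out as a short sketch. The field names, the `InvoiceLine` structure and the conversion table are assumptions made for illustration; the article does not describe the central system's actual units or data model.

```python
from dataclasses import dataclass

# Hypothetical conversion table: unit on the invoice -> (standard unit, factor).
TO_STANDARD = {
    "l": ("l", 1.0),
    "ml": ("l", 0.001),
    "kg": ("kg", 1.0),
    "g": ("kg", 0.001),
}


@dataclass
class InvoiceLine:
    description: str
    units: int        # number of packages listed on the invoice
    pack_size: float  # contents of one package
    pack_unit: str    # unit of pack_size, e.g. "l" or "ml"


def total_in_standard_unit(line: InvoiceLine) -> tuple[float, str]:
    """Multiply packages by package size and convert to the standard unit."""
    standard_unit, factor = TO_STANDARD[line.pack_unit.lower()]
    return line.units * line.pack_size * factor, standard_unit


# Six units of five litres, as in the example above.
line = InvoiceLine("plant protection product", units=6, pack_size=5, pack_unit="l")
print(total_in_standard_unit(line))  # (30.0, 'l')
```

A unit that is missing from the table raises a `KeyError` rather than being guessed, which fits the idea of passing doubtful cases to a human.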
With stricter sample sets, sufficient built-in checks, and doubtful cases still being submitted to a human data collector, the likelihood of errors can be steadily reduced. “Every check by a specialist is also feedback that the model can use to improve itself.”
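Submitting doubtful cases to a human and reusing the corrections could look roughly like the sketch below. It assumes the model reports a confidence score per labelled line. The threshold value and the names `Prediction`, `route` and `record_correction` are hypothetical and are not taken from the pilot.

```python
from dataclasses import dataclass

REVIEW_THRESHOLD = 0.85  # hypothetical cut-off; would need tuning in practice


@dataclass
class Prediction:
    line_text: str
    label: str
    confidence: float  # assumed to lie between 0 and 1


def route(predictions, threshold=REVIEW_THRESHOLD):
    """Split predictions into accepted ones and ones a data collector should check."""
    accepted, needs_review = [], []
    for p in predictions:
        (accepted if p.confidence >= threshold else needs_review).append(p)
    return accepted, needs_review


def record_correction(prediction, corrected_label, training_examples):
    """Store the specialist's label so it can be used in a later training round."""
    training_examples.append((prediction.line_text, corrected_label))


predictions = [
    Prediction("Fungicide 6 x 5 l", "crop_protection", 0.97),
    Prediction("Disposal of tomato plants", "product_sale", 0.55),
]
accepted, needs_review = route(predictions)

training_examples = []
for p in needs_review:
    # In practice the corrected label comes from the human data collector.
    record_correction(p, "waste_stream", training_examples)
```

Here the confident fungicide line is accepted automatically, while the uncertain tomato line goes to a specialist, whose correction becomes a new training example.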
Working more efficiently
How soon will farmers reap the benefits of a digital data collector? Vrolijk cannot yet go into details about that. “Particularly because it also depends on how you would want to implement this technology. For example, the model could be connected directly to the data flow at the suppliers, who prepare the invoices. You could also make it a central service, where farmers upload their own documents so that they can be processed. Each of those approaches deserves its own investigation.”
For the centre itself, the situation is different. “If we implement the model in its current form, we could already start working more efficiently internally,” argues Vrolijk. “Compared to the commercial software we currently use to extract information from invoices, I think the digital data collector is actually more promising. Within one or two years, I therefore expect the CEI to be working with artificial intelligence to bring out the facts faster and more accurately.”