Analysis of automatic image classification methods for Urticaceae pollen classification

Li, Chen; Polling, Marcel; Cao, Lu; Gravendeel, Barbara; Verbeek, Fons J.


Pollen classification is considered an important task in palynology. In the Netherlands, two genera of the Urticaceae family, Parietaria and Urtica, have highly similar pollen morphology but induce allergic reactions to very different degrees. Distinguishing between these two genera is therefore very important. Within this group, Urtica membranacea is the only species whose pollen can be recognized easily under the microscope. For the research presented in this study, we built a dataset of 6472 pollen images, and our aim was to find the best possible classifier for this dataset by analysing different classification methods, both machine learning-based and deep learning-based. For the machine learning-based methods, we measured both texture and moment features from images of the pollen grains. Various feature selection techniques and classifiers, as well as a hierarchical strategy, were implemented for pollen classification. For the deep learning-based methods, we compared the performance of six popular Convolutional Neural Networks: AlexNet, VGG16, VGG19, MobileNet V1, MobileNet V2 and ResNet50. Results show that, among the machine learning-based methods, a hierarchical strategy yielded the highest accuracy (94.5%), outperforming flat classification models. Among the deep learning-based methods, ResNet50 achieved an accuracy of 99.4%, slightly outperforming the other neural networks investigated. In addition, we investigated how performance changed when the size of the image dataset was reduced to 1000 and 500 images, respectively. Results demonstrated that ResNet50 still achieved the best classification performance on these smaller datasets. An ablation study was carried out to help understand why the deep learning-based methods outperformed the other models investigated.
Using Urticaceae pollen as an example, our research provides a strategy for selecting a classification model for datasets of highly similar pollen grains. This strategy can support palynologists and could potentially be applied to other image classification tasks.
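
To make the contrast between a flat and a hierarchical strategy concrete, the following is a minimal sketch, not the method used in this study. It assumes a two-stage hierarchy in which the easily recognized class (here taken to be Urtica membranacea) is split off first and the remaining classes are separated in a second stage. The nearest-centroid classifier, the class names `NearestCentroid` and `HierarchicalClassifier`, and the two-dimensional feature values are all hypothetical and chosen only for illustration.

```python
from math import dist
from statistics import fmean


def centroid(rows):
    """Mean feature vector of a list of equal-length feature vectors."""
    return [fmean(column) for column in zip(*rows)]


class NearestCentroid:
    """Toy stand-in classifier: predicts the label of the closest class centroid."""

    def fit(self, X, y):
        groups = {}
        for features, label in zip(X, y):
            groups.setdefault(label, []).append(features)
        self.centroids = {label: centroid(rows) for label, rows in groups.items()}
        return self

    def predict_one(self, features):
        return min(self.centroids, key=lambda label: dist(features, self.centroids[label]))


class HierarchicalClassifier:
    """Stage 1 splits off the easy class; stage 2 separates the remaining classes.

    The two-stage layout is an assumption made for this sketch.
    """

    def __init__(self, easy_label):
        self.easy_label = easy_label

    def fit(self, X, y):
        # Stage 1 sees only two coarse labels: the easy class versus "rest".
        coarse = [label if label == self.easy_label else "rest" for label in y]
        self.stage1 = NearestCentroid().fit(X, coarse)
        # Stage 2 is trained only on the samples that are not the easy class.
        hard = [(f, label) for f, label in zip(X, y) if label != self.easy_label]
        self.stage2 = NearestCentroid().fit([f for f, _ in hard], [label for _, label in hard])
        return self

    def predict_one(self, features):
        if self.stage1.predict_one(features) == self.easy_label:
            return self.easy_label
        return self.stage2.predict_one(features)


# Hypothetical 2-D feature vectors (e.g. one texture and one moment feature).
X_train = [
    [0.1, 0.1], [0.2, 0.0], [0.0, 0.2],   # Urtica membranacea
    [5.0, 5.0], [5.2, 4.8], [4.8, 5.2],   # Urtica
    [5.0, 6.0], [5.2, 6.2], [4.8, 5.8],   # Parietaria
]
y_train = ["Urtica membranacea"] * 3 + ["Urtica"] * 3 + ["Parietaria"] * 3

model = HierarchicalClassifier(easy_label="Urtica membranacea").fit(X_train, y_train)
for sample in ([0.1, 0.3], [5.1, 5.1], [4.9, 5.9]):
    print(sample, "->", model.predict_one(sample))
```

A flat model would instead train a single classifier on all three labels at once. The hierarchical variant lets each stage focus on one decision, which is one plausible reason such a strategy can help when two classes are much harder to separate than the third.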