Food products are often composed of multiple ingredients that are, in addition, generally heavily processed, which makes it very challenging to determine their ingredient composition. Traditional molecular biological techniques, such as specific PCR followed by Sanger sequencing or TaqMan PCR, are most frequently applied to identify species/varieties in food/feed products. In the last decade, next generation sequencing (NGS) technologies have been developed and widely applied in medical science and in other areas, such as agricultural and environmental sciences. The aim of this thesis was to use detailed genetic differences to identify species/varieties in food/feed products by means of advanced analytical NGS-based strategies. The study focused on the identification of two target groups: (a) endangered species and (b) genetically modified organisms (GMOs). Elucidating the genetic composition was subdivided into three main topics: enrichment, NGS-based strategy and identification. For both applications, novel molecular assays were developed and coupled to an appropriate NGS technology, and data analysis was performed with dedicated bioinformatics pipelines that were developed for the specific needs of each application.
With respect to endangered species identification, chapter 2 showed that no dedicated method was available to identify endangered plant and animal species in real-life samples. To address this issue, a multi-locus DNA metabarcoding approach was developed in chapter 3 by comparing 12 plant and animal barcode and mini-barcode markers, and the method was validated across 16 laboratories. The results showed that the approach was sensitive enough to identify species present at 1%, and consistent and reproducible results were observed across the laboratories for all analysed experimental mixtures and real-life samples. The combination of multiple barcodes enabled the identification of all species used in the experimental mixtures and additionally increased the quality assurance of detection. Furthermore, in chapter 4 the applicability of the multi-locus DNA metabarcoding approach was evaluated on 18 traditional medicines (TMs) belonging to different matrices. An adequate DNA clean-up procedure proved necessary to remove impurities from real-life samples, and in the metabarcoding analysis of the TMs the taxa were identified mainly by the mini-barcode markers. Only a few of the species declared on the labels could be identified across the TMs, whereas many undeclared species were identified, including the endangered species Ursus arctos. The conclusion of the first part of the thesis was that a combination of universal plant and animal barcode and mini-barcode markers can provide high resolution for species detection, without being limited by the matrix, DNA integrity or species composition of a sample.
With respect to the identification of GMOs, the AM-SEQ NGS-based GMO screening approach was developed and evaluated (chapter 6). The results of the NGS-based screening were compared to those of the currently applied two-step TaqMan PCR-based GMO screening. This comparison showed that high-abundance targets were detected similarly by both methods, whereas low-abundance targets were not always detected by both methods. With the broader NGS-based screening strategy, more GMOs and related targets could be identified than with the more limited two-step TaqMan PCR-based GMO screening. Additionally, some of the identified low-abundance targets could not be explained, which might indicate the presence of unknown GMOs (UGMOs) or, alternatively, of the donor organism. To identify the unknown sequence of a UGMO, a genome walking (GW) approach is necessary. In chapter 5 the available GW approaches were summarised, and from this literature review it was concluded that, at that moment, no GW method was available that fulfilled the requirements of UGMO identification, such as a 0.1% detection limit and enrichment of UGMO targets in a background of GMOs. To address these issues, the Amplification of Linearly-enriched Fragments (ALF) approach was developed in chapter 7 and combined with PacBio SMRT NGS technology. The ALF approach was subsequently evaluated on samples mimicking real-life conditions, in which sequences related to GMOs present at 1% could be identified. The longest enriched fragment was around 2.5 kbp, and a data analysis model was used to distinguish the sequences belonging to known GMOs from the unknown sequences through a series of mapping steps. With the data analysis model, previously unknown sequence information of a GMO was obtained, showing that the ALF approach can be used to identify the unknown sequence of a UGMO in real-life samples.
For the second part of the thesis it was concluded that NGS-based GMO screening is an accurate and reliable screening method for GMOs. Additionally, the combination of a genome walking approach and NGS is sensitive enough to identify previously unknown sequences of GMOs present at low abundance.
In general, it can be concluded that NGS-based screening methods can provide accurate and reliable information on the detailed genetic differences of species/varieties present in complex food/feed products. By enriching known targets, well-known species as well as known and unknown GM sequences could be identified, irrespective of the complexity of a sample. The results of this thesis show that NGS-based approaches have the potential to be used effectively for food composition screening, and that the developed methods can aid Customs, regulatory agencies and food industries in monitoring food and feed samples.