A computational approach to study the transcriptional response of enterocytes exposed to luminal factors

Venkatasubramanian, Prashanna Balaji


The connection between food and human health needs no introduction, but apart from food, multiple other luminal factors can, alone or in combination, affect enterocyte function and thus play a role in health. These interactions may occur at different levels, but primarily at the molecular level. The intestinal epithelial cell line Caco-2 has been used extensively as a model to investigate molecular interactions between different luminal factors and enterocytes. More recently, attention has been given to interactions with more complex food extracts, with the aim of gaining more insight into the combined effects of food components, as would occur in vivo.

In Chapter 1, the importance of studying enterocytes and food interactions, and the use of Caco-2 as an enterocyte model system, are discussed. Previous research on Clostridium difficile toxins and their interaction with enterocytes, as well as systems biology and related approaches, is described in detail. The chapter ends with a brief introduction to the key research questions and the content of the chapters that follow.

In this thesis, we generated a data compendium by collecting transcriptome data (largely from microarray experiments) on Caco-2 cells exposed to luminal factors, drawn from in-house experiments and public databases (Chapter 2). Initially, the data compendium was used to develop a Caco-2-specific protein-protein interaction network. We then addressed the issue of identifying pathway-specific reporter genes for experimental validation by qPCR. To this end, we developed a statistical method called differential expression correlation analysis (DECA), which is designed to mine knowledge from the Caco-2 data compendium. The method combines the differential expression values of genes in the compendium with limited knowledge of the pathways of interest to identify reporter genes. The method was further used to predict genes belonging to the AhR- and Nrf2-mediated stress response pathways and was experimentally validated using Caco-2 cells exposed to coffee extracts.
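The abstract does not give the DECA statistic itself, so the following is only a minimal sketch of the general idea, under stated assumptions: candidate genes are ranked by the mean Pearson correlation between their differential expression profile across compendium experiments and the profiles of a few known pathway genes. The function names, the gene labels (`seed_1`, `gene_a`, ...) and all numbers are hypothetical and are not taken from the thesis.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / sqrt(vx * vy)


def rank_reporter_candidates(log_fc, seed_genes):
    """Rank non-seed genes by mean correlation with the seed genes.

    log_fc maps gene -> list of log fold changes, one per experiment.
    seed_genes are genes already known to belong to the pathway.
    This scoring rule is an assumption, not the published DECA method.
    """
    seeds = [g for g in seed_genes if g in log_fc]
    if not seeds:
        raise ValueError("no seed gene is present in the compendium")
    scores = {}
    for gene, profile in log_fc.items():
        if gene in seeds:
            continue
        total = sum(pearson(profile, log_fc[s]) for s in seeds)
        scores[gene] = total / len(seeds)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy compendium: four experiments, two seeds, two candidates.
toy_log_fc = {
    "seed_1": [1.0, 2.0, 0.0, -1.0],
    "seed_2": [0.8, 2.2, 0.1, -0.9],
    "gene_a": [0.9, 1.8, 0.2, -1.1],
    "gene_b": [-1.0, 0.1, 1.5, 0.3],
}
ranked = rank_reporter_candidates(toy_log_fc, ["seed_1", "seed_2"])
```

In this toy input, `gene_a` tracks the seed genes across experiments and is ranked first, while `gene_b` is anti-correlated and is ranked last.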

In Chapter 4, the Caco-2 data compendium was used to identify food substances that may mitigate the effects of C. difficile toxins on small intestinal enterocytes. The compendium was combined with microarray data obtained from Caco-2 cells exposed to Clostridium difficile toxins (toxin A and toxin B) and toxoids. Potentially beneficial foods were identified using multivariate techniques such as principal component analysis (PCA). Blackcurrant of the Ben Finlay cultivar was found to be the most beneficial of the food substances in the compendium, and this was verified experimentally. It helped to maintain the epithelial barrier and to prevent the translocation of C. difficile toxins from the apical side to the basolateral side. We also tested the efficacy of strawberry (Sabrina), yellow onion, white onion and galacto-oligosaccharides (GOS) and found, in accordance with the PCA results, that strawberry and yellow onion were moderately effective against toxin translocation, whereas white onion and GOS had almost no effect.

We further investigated the impact of Clostridium difficile toxins on miRNA expression in colonic enterocytes and probed the role of miRNAs in regulating toxin-induced changes in mRNA expression (Chapter 3). miRNA-mRNA interactions were studied using the public database miRWalk 2.0 and the network analysis tool Cytoscape. Pathway analysis of the resulting data indicated a role for miRNAs in several pathways that are affected by Clostridium difficile toxins in Caco-2 cells.

Finally, to enable data fusion of experiments that have low and varying sample sizes, we developed a batch effect mitigation protocol (Chapter 5). The method combines ratio-based methods with median rank scores. It was tested on the controls-only data derived from the Caco-2 compendium and was shown to mitigate batch effects. It was further tested on arthritis patient sample data and applied to the Caco-2 compendium data, and was shown to be effective at batch effect mitigation.
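The abstract does not describe how the ratio-based and median rank score components are combined. As a rough illustration of the ratio-based part only, the sketch below re-expresses every sample in a batch relative to the mean of that batch's own control samples (a subtraction on the log2 scale, which corresponds to a ratio on the raw scale). The function name, sample labels and values are hypothetical, and the median rank score step is not covered.

```python
from statistics import mean


def ratio_based_correction(samples, control_ids):
    """Subtract each gene's mean log2 value over the batch's own controls.

    samples maps sample id -> {gene: log2 expression} for ONE batch.
    control_ids lists the control samples of that same batch.
    This is an assumed, simplified form of a ratio-based correction.
    """
    genes = list(next(iter(samples.values())).keys())
    reference = {g: mean(samples[c][g] for c in control_ids) for g in genes}
    return {
        sid: {g: values[g] - reference[g] for g in genes}
        for sid, values in samples.items()
    }


# Hypothetical toy data: batch_2 equals batch_1 plus a constant offset of 2.0,
# standing in for a batch effect.
batch_1 = {
    "ctrl_1": {"g1": 5.0, "g2": 7.0},
    "ctrl_2": {"g1": 5.2, "g2": 6.8},
    "trt_1": {"g1": 6.1, "g2": 6.9},
}
batch_2 = {
    "ctrl_1": {"g1": 7.0, "g2": 9.0},
    "ctrl_2": {"g1": 7.2, "g2": 8.8},
    "trt_1": {"g1": 8.1, "g2": 8.9},
}
corrected_1 = ratio_based_correction(batch_1, ["ctrl_1", "ctrl_2"])
corrected_2 = ratio_based_correction(batch_2, ["ctrl_1", "ctrl_2"])
```

Because each batch is referenced to its own controls, the constant offset between the two toy batches cancels and the treated samples become directly comparable. This only works when every batch contains controls, which may be one reason a controls-only subset is a natural first test case.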

In Chapter 6, the results of the thesis are discussed, including their limitations and future perspectives for advances in the field of foodomics.