Thesis subject

Comparative transcriptomics between species

For over 15 years, researchers have measured mRNA levels in a wide variety of organisms, initially with microarrays and now increasingly by sequencing transcripts directly with next-generation sequencing (NGS). As the genomes of more and more (closely) related organisms become available, the question arises in many fields whether the transcriptomes of different organisms can be meaningfully compared [1].

The goal of this project is to develop tools to functionally interpret the commonalities and differences between the transcriptomes of various organisms. Such tools depend on three components: relating the protein complements of the organisms through homology; developing quantitation methods to compare transcript levels; and annotating gene sets based on shared functionality, e.g. co-occurrence in metabolic or signaling pathways or shared gene ontology annotation. As individual transcriptome datasets are often obtained using different measurement devices, normalization is a key concern. Datasets of closely related species (4 tomato accessions) and of more dissimilar organisms (5 prokaryotic/eukaryotic industrial microorganisms) are available for developing the methods. The desired outcome is a method for comparative transcriptomics between any set of organisms.
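As a rough illustration of how homology mapping and normalization fit together, the sketch below compares two species on toy data. It is not the method to be developed in the project: the function names, the gene and ortholog-group identifiers, and the choice of log2 counts-per-million as normalization are all assumptions made for the example, and it only handles one-to-one orthologs.

```python
import math


def log_cpm(counts, pseudocount=1.0):
    """Convert raw read counts for one sample to log2 counts-per-million.

    Scaling by the library total removes differences in sequencing depth;
    the pseudocount (an assumed value) avoids taking log2 of zero.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {gene: math.log2(c / total * 1e6 + pseudocount)
            for gene, c in counts.items()}


def compare_orthologs(counts_a, counts_b, orthologs):
    """Return {ortholog group: log2 ratio of species B over species A}.

    `orthologs` maps a group identifier to a (gene in A, gene in B) pair.
    Groups without a measured gene in both species are skipped.
    """
    expr_a = log_cpm(counts_a)
    expr_b = log_cpm(counts_b)
    ratios = {}
    for group, (gene_a, gene_b) in orthologs.items():
        if gene_a in expr_a and gene_b in expr_b:
            ratios[group] = expr_b[gene_b] - expr_a[gene_a]
    return ratios


# Hypothetical toy data: species B was sequenced twice as deeply as species A,
# but the relative expression of the shared genes is identical.
counts_a = {"a1": 100, "a2": 300, "a3": 600}
counts_b = {"b1": 200, "b2": 600, "b3": 1200}
orthologs = {
    "OG1": ("a1", "b1"),
    "OG2": ("a2", "b2"),
    "OG4": ("a9", "b3"),  # "a9" was not measured in species A, so OG4 is dropped
}

diffs = compare_orthologs(counts_a, counts_b, orthologs)
for group, ratio in sorted(diffs.items()):
    print(group, round(ratio, 3))
```

In this toy case the depth difference is removed by the scaling, so the remaining ortholog groups show no expression difference. Real data would need more than this: many-to-many homology relations, differences in transcript length between orthologs, and platform-specific biases are not addressed by per-sample scaling.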

[1] P. Golby et al. (2007) Comparative transcriptomics reveals key gene expression differences between the human and bovine pathogens of the Mycobacterium tuberculosis complex. Microbiology 153(10):3323-36.
[2] D. Koenig et al. (2013) Comparative transcriptomics reveals patterns of selection in domesticated and wild tomato. PNAS 110(28):E2655-62.

Used skills: Programming, statistics.

Requirements: INF-22306 Programming in Python; BIF-30806 Advanced bioinformatics; and one of MAT-20306 Advanced statistics, BRD-31806 Parameter estimation and model structure ident., or ABG-30806 Modern statistics for the life sciences.