Proteomic and genomic data mining with applications in plant science
Analyzing ever-growing volumes of molecular biological data has been a central endeavour in bioinformatics for decades. This thesis develops bioinformatics approaches to analyze and manage large-scale genome, proteome and variome data for functional genomics and population genetics studies in plant science. I begin with the mining and comparative genomic analysis of proteome and genome data across multiple species, and explore the biological significance of amino acid repeats, which are widespread in protein sequences across the kingdoms of life. Using sorghum genomics data, I then combine multiple bioinformatics approaches with population genetics and experimental methods to study a key sorghum gene underlying a vital breeding trait, revealing the functional and evolutionary roles of crucial genes during sorghum domestication and improvement.