Noncoding RNA (ncRNA) genes have been shown to have critical roles across all domains of life. We are interested in the structure and function of diverse ncRNA genes. Our studies have included the computational discovery and accurate prediction of ncRNA genes, the RNAs they produce, the functions of these RNAs, and their evolution over short and long timeframes.
In a few model organisms, many ncRNAs have been annotated on the genome sequence. However, for most genomes only a few types of RNA (e.g. tRNA, rRNA) are automatically annotated by current pipelines (e.g. NCBI, JGI, Ensembl), with a few other classes inconsistently and poorly described. Major challenges in predicting ncRNA genes are that their functional parts may be small and that their evolutionary conservation is more difficult to detect than that of protein-coding genes.
Most archaea and about half of bacteria use a recently discovered adaptive immune system that is ncRNA-based. This CRISPR-Cas (clustered regularly interspaced short palindromic repeats and CRISPR-associated proteins) system likely evolved as a defense against foreign genetic elements (notably viruses or bacteriophages). The CRISPR-Cas9 system has recently been repurposed as a precision genetic targeting tool in many organisms. We have developed a suite of tools to investigate the CRISPR adaptive immune system of prokaryotes (CRISPRSuite). The bioinformatic challenges of developing these tools and applying them to ~40,000 prokaryotic genomes will be described. Insights gained into the distribution, structure, function, and evolution of these systems will be presented.