Oral Annual Conference of the Genetics Society of Australasia with the NZ Society for Biochemistry & Molecular Biology

Genetic hitchhikers: what species are hiding in your sequencing data? (711)

Rachael Ashby 1,2, Hayley Baird 1, Rudiger Brauning 1, Alan McCulloch 1, Chris Brown 3, Shannon Clarke 1, Neil Gemmell 2
  1. AgResearch, Invermay
  2. Department of Anatomy, University of Otago, Dunedin
  3. Department of Biochemistry, University of Otago, Dunedin

High-throughput sequencing is now routinely used to generate draft genomes, transcriptomes and high-density marker panels for non-model organisms. The demand for next-generation sequencing (NGS) data in non-model species has also diversified the types of samples being sequenced, with wild samples and whole samples routinely collected, extracted and sequenced. The result is that vast quantities of data are being generated for species we know little about, often without any strong reference genome for comparison. Unfortunately, DNA extraction and NGS are not specific to the target organism; low levels of bacterial, viral and human genetic material are frequently identified and filtered out by optimized bioinformatics pipelines. However, in addition to these contaminants, genetic hitchhikers such as parasites, endophytes and commensal species can be sequenced along with the target species. These hitchhikers are often ignored or viewed as contamination, but we contend that these data could offer valuable insight into the target species and its environment. We support this view using data from the Greenshell™ Mussel, an endemic species of economic importance to the New Zealand aquaculture industry. We identify hitchhiker species present in the transcriptome and genotyping-by-sequencing (GBS) data of this filter-feeding species. We also show that these data, analysed appropriately, provide valuable insights into the mussel’s biology. We argue that data from non-target organisms, once identified, should not always be discarded or ignored as contamination, and examine the potential applications of these data in other systems such as plant endophytes and insect parasites.