Chloroplast genomes contain large (≈25,000 bp), almost perfect inverted repeats (IRs). During de novo assembly, the two repeat copies cannot be resolved unless the paired-read insert size is larger than the repeat unit, so a complete circular plastome cannot be assembled directly from short-read data alone. However, paired-read data, combined with identification of the repeat and truncated-repeat boundaries, allows the complete circular plastome to be reconstructed.
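The reconstruction idea can be illustrated with a minimal Python sketch. It is not part of the Geneious workflow and makes several assumptions: that assembly has already produced three contigs (a large single-copy region, one collapsed copy of the IR, and a small single-copy region), that their orientations are known, and that the plastome follows the common LSC, IR, SSC, reverse-complemented IR layout. The function names and toy sequences are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def insert_spans_repeat(insert_size: int, repeat_length: int) -> bool:
    """True only if a read pair could bridge the whole repeat unit."""
    return insert_size > repeat_length


def build_plastome(lsc: str, ir: str, ssc: str) -> str:
    """Join contigs as LSC + IR + SSC + reverse-complemented IR.

    Assumption: the assembler collapsed both IR copies into one contig,
    so the second copy is regenerated as the reverse complement.
    The returned string is a linear representation of the circular
    molecule, opened at the start of the LSC.
    """
    return lsc + ir + ssc + reverse_complement(ir)


# Toy sequences (hypothetical, far shorter than real regions).
lsc = "ATGGCCTTAA"
ir = "GGGATC"
ssc = "TTTAAACG"

plastome = build_plastome(lsc, ir, ssc)

# A typical short-read insert of a few hundred bp cannot span a ~25,000 bp IR.
can_resolve = insert_spans_repeat(insert_size=500, repeat_length=25_000)
```

In practice the orientation of the SSC relative to the LSC also has to be decided, since both orientations are consistent with a collapsed IR contig; the sketch simply takes the given orientation as correct.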
Geneious contains all of the tools required for rapid and accurate de novo assembly of chloroplast genomes from short-read NGS data. The NGS data may be derived from DNA extracted from purified chloroplasts, or “skimmed” from whole-genome sequence data of total DNA extracted from chloroplast-rich leaf material. In this poster, we take a short-read NGS data set, available for download from the NCBI Sequence Read Archive (SRA), and describe how to use Geneious to reconstruct a complete, circular, annotated chloroplast genome.