Sequence Alignment Software

The DNA sequence alignment tools in RTG Investigator sequence analysis software offer a significant speed and accuracy advantage over other sequence alignment programs in short read mapping and sequence database search. RTG's greater alignment sensitivity delivers more accurate results from downstream variant and metagenomic analysis pipelines.

At default settings, RTG accurately maps 98% of reads to the reference in the same time that BWA maps 93%. A read mapping benchmark compares the speed and sensitivity of the RTG map command with BWA, using 25M real human 108bp reads from the 1000 Genomes Project. A parameter sweep over word and step sizes highlights the tradeoff between speed and sensitivity.

Accurate short read alignment without sacrificing speed

Short Read Alignment Sensitivity

Higher tolerance to indels, errors, and mutations during sequence alignment produces more accurate data for downstream analysis. In tests with simulated 100bp short reads over a typical range of error rates, RTG correctly mapped over 99% of reads at error rates of 0.5% and 1.0%. At 2.0% error the mapping percentage was still over 97%, and at 5.0% error it was over 94%. This extra sensitivity can be used in novel studies with emerging sequencing platforms or in cross-species mapping.
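To give a feel for why higher error rates make mapping harder, the following sketch estimates how many errors a 100bp read carries and how likely a single index word is to be error-free. It is a simplified model, not RTG's method: it assumes independent per-base errors, and the word size of 18 and the function names are hypothetical choices made only for illustration.

```python
def expected_errors(read_length: int, error_rate: float) -> float:
    """Expected number of erroneous bases in a read (independent errors assumed)."""
    return read_length * error_rate


def error_free_word_probability(word_size: int, error_rate: float) -> float:
    """Probability that one word of `word_size` bases contains no error."""
    return (1.0 - error_rate) ** word_size


if __name__ == "__main__":
    WORD_SIZE = 18  # hypothetical word size, not an RTG default
    for rate in (0.005, 0.01, 0.02, 0.05):
        errors = expected_errors(100, rate)
        clean = error_free_word_probability(WORD_SIZE, rate)
        print(f"{rate:.1%} error: {errors:.1f} expected errors per read, "
              f"{clean:.2f} chance a {WORD_SIZE}bp word is error-free")
```

Under this model, the chance of an exact word match falls as the error rate rises, which is why an aligner that tolerates more mismatches and indels keeps mapping reads that exact-seed approaches would miss.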

[Figure: Gapped Alignment Sensitivity]

Read Mapping Speed

RTG's unique hash table index structure outperforms even the fastest sequence alignment algorithms built on FM-index structures (BWA and Bowtie). When tested with 108bp read data from the 1000 Genomes Project, the RTG map command ran 6.7x faster than BWA when tuned to accurately map 93% of reads. Conversely, this speed allows short read mapping with greater sensitivity in the same amount of time.
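The general idea behind a hash table index with word and step size parameters can be sketched as follows. This is a generic illustration of hash-based seeding, not RTG's actual index structure, which is not described here; the function names `build_index` and `candidate_positions` are hypothetical.

```python
from collections import defaultdict


def build_index(reference: str, word_size: int, step: int) -> dict[str, list[int]]:
    """Hash every `step`-th word of length `word_size` in the reference."""
    index: dict[str, list[int]] = defaultdict(list)
    for pos in range(0, len(reference) - word_size + 1, step):
        index[reference[pos:pos + word_size]].append(pos)
    return index


def candidate_positions(read: str, index: dict[str, list[int]],
                        word_size: int) -> dict[int, int]:
    """Count word hits per implied reference start position of the read."""
    votes: dict[int, int] = defaultdict(int)
    for offset in range(len(read) - word_size + 1):
        word = read[offset:offset + word_size]
        for ref_pos in index.get(word, ()):
            # A hit at ref_pos for a word at `offset` implies this read start.
            votes[ref_pos - offset] += 1
    return dict(votes)


if __name__ == "__main__":
    reference = "ACGTTGCAAGGCTTACGGATCCA"
    read = reference[5:15]
    index = build_index(reference, word_size=4, step=2)
    print(candidate_positions(read, index, word_size=4))
```

In a scheme like this, a larger step stores fewer words, so the index is smaller and lookups yield fewer hits, at the cost of fewer chances to seed a read that contains errors. That is the kind of speed versus sensitivity tradeoff a word and step size sweep explores.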


Next Generation Sequencing Platforms

RTG sequence alignment software supports NGS data analysis of short read data from the major next generation sequencing platforms, including Illumina, Complete Genomics, Ion Torrent, and Roche 454. It imports data in FASTA, FASTQ, or Complete Genomics formats. Base quality recalibration is computed during mapping for use in downstream analyses.
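For readers unfamiliar with the FASTQ format mentioned above, here is a minimal reader sketch. It is not RTG's importer: it assumes simple four-line records and Phred+33 quality encoding, and the `Read` type and `parse_fastq` function are hypothetical names used only for illustration.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class Read(NamedTuple):
    name: str
    sequence: str
    qualities: list[int]


def parse_fastq(handle: TextIO) -> Iterator[Read]:
    """Yield reads from four-line FASTQ records (Phred+33 qualities assumed)."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        if not header.strip():
            continue  # skip blank lines between or after records
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record near {header.strip()!r}")
        yield Read(header[1:].strip(), sequence, [ord(c) - 33 for c in quality])


if __name__ == "__main__":
    sample = io.StringIO("@read1\nACGT\n+\nIIII\n")
    for read in parse_fastq(sample):
        print(read.name, read.sequence, read.qualities)
```

Each record carries a per-base quality score alongside the sequence, which is the information a mapper can use when recalibrating base qualities.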
