Complete Genomics Analysis

The Complete Genomics sequence alignment and variant detection features in RTG Investigator enable variant analysis of DNA sequence from whole human genomes. RTG Investigator processes sequence data from Complete Genomics as a standard read type, so it integrates fully into the variant detection pipeline and allows independent validation and assessment of Complete Genomics sequence data. RTG Investigator is the only means for a researcher to obtain additional results from Complete Genomics data using in-house compute and bioinformatics resources. With RTG Investigator, you can extend the value of your Complete Genomics data investment.

Expand an investment in Complete Genomics

Integrating Complete Genomics data

The variably gapped read structure used by Complete Genomics presents a unique and difficult challenge for mapping and alignment. The RTG engine was adapted to handle this special read structure directly, and has been tuned to provide fast, accurate mappings.

Outputs from the RTG Investigator Complete Genomics mapper are in the same format as for other platforms, meaning that the same downstream processing tools can be applied. RTG Investigator allows great flexibility in the way Complete Genomics reads can be processed and analyzed, which leads to novel research possibilities that cannot be obtained from the outputs provided by Complete Genomics alone.

Within the RTG Investigator variant detection framework, Complete Genomics reads are treated like reads from any platform and are seamlessly integrated and processed within variant detection pipelines. This indifference to read type allows cross-platform experimentation all within a single variant detection pipeline.

Multi-platform SNP calling

[Figure: Combined Complete Genomics and Illumina SNP call accuracy]

Combining reads from different sequencing technologies delivers higher variant call accuracy. The RTG Investigator snp command applies platform-specific base quality information from Complete Genomics and other sequencing platforms to improve variant call scoring and genotyping accuracy.

In an experiment using reads from the Complete Genomics and Illumina sequencing platforms for the 1000 Genomes Project's Yoruba trio sample NA19240, SNPs were called using each platform's reads independently and also by combining the reads from both. The ROC plot shows how the quality of the SNP calls improves when the reads from both platforms are combined. In the curve for each platform, SNP calls are ordered by their Bayesian posterior score, with higher scores toward the lower left. True SNP call counts increase vertically and false call counts increase horizontally.
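The way such a curve is built can be sketched in a few lines of Python. This is only an illustration of the ordering described above, not RTG Investigator code: the function name roc_points, the (score, is_true) input format, and the example scores are all hypothetical, and it assumes each call has already been labeled true or false against some reference set.

```python
from typing import Iterable, List, Tuple


def roc_points(calls: Iterable[Tuple[float, bool]]) -> List[Tuple[int, int]]:
    """Return cumulative (false_count, true_count) points, best score first.

    Each input item is (posterior_score, is_true). The truth label is an
    assumption of this sketch; it must come from an external reference set.
    """
    # Highest-scoring calls come first, so they land nearest the origin.
    ordered = sorted(calls, key=lambda call: call[0], reverse=True)

    points = [(0, 0)]
    false_count = 0
    true_count = 0
    for _score, is_true in ordered:
        if is_true:
            true_count += 1   # a true call moves the curve up
        else:
            false_count += 1  # a false call moves the curve right
        points.append((false_count, true_count))
    return points


# Hypothetical scores and labels, for illustration only.
example_calls = [(9.5, True), (8.1, True), (7.3, False), (6.0, True), (2.2, False)]
example_curve = roc_points(example_calls)
```

A curve that climbs steeply before moving right indicates that the highest-scoring calls are mostly true, which is the behavior the combined-platform curve is described as improving.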

Variant call intersections

The RTG Investigator variant call pipeline provides a set of variant calls, made on the same set of reads, that is independent of the calls provided by Complete Genomics. Comparing the variants called by both methods increases confidence in the calls that both methods agree on, and highlights locations where more investigation is needed.
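As a rough sketch of this kind of comparison, two call sets can be treated as sets of variant keys and intersected. Everything named here is an assumption made for illustration: the Variant tuple, the compare_call_sets function, and the example records are hypothetical, and matching on exact chromosome, position, reference, and alternate allele is a simplification that ignores differences in how variants are represented.

```python
from typing import Dict, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical minimal key identifying one variant call."""
    chrom: str
    pos: int
    ref: str
    alt: str


def compare_call_sets(a: Set[Variant], b: Set[Variant]) -> Dict[str, Set[Variant]]:
    """Split two call sets into shared calls and calls unique to each set."""
    return {
        "shared": a & b,   # called by both methods: higher confidence
        "only_a": a - b,   # called by the first method only
        "only_b": b - a,   # called by the second method only
    }


# Hypothetical call sets, for illustration only.
calls_a = {
    Variant("chr1", 1000, "A", "G"),
    Variant("chr1", 2000, "C", "T"),
    Variant("chr2", 500, "G", "A"),
}
calls_b = {
    Variant("chr1", 1000, "A", "G"),
    Variant("chr2", 500, "G", "A"),
    Variant("chr3", 42, "T", "C"),
}
comparison = compare_call_sets(calls_a, calls_b)
```

Calls in the shared group are the concordant ones, while the two "only" groups are the candidates for further investigation.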

RTG Investigator calls novel variants and also "rescues" variants not found in the Complete Genomics outputs.  

 

Through their Early Access Program (EAP), Real Time Genomics enabled VIB to independently assess and validate the results from Complete Genomics.  Looking forward, RTG software gives us the flexibility to perform novel bioinformatics analysis with existing and new Complete Genomics read data.

- Diether Lambrecht, Group leader of Complex Genetics