Release Notes

Release Notes for RTG Investigator 2.4.1 (2012-02-13)
=====================================================
This is a bugfix release only.
 
* cgmap: The --sex flag was not being correctly obeyed.
 
* sdf2fastq: Fix for incorrect sequence output from SDFs containing
variable length reads.
 
* coverage: Fixed a case where 0 coverage could result in a NaN in
the output file.
 
Previous releases:
 
RTG Investigator 2.4 (2011-11-23)
---------------------------------
 
Major features of this release:
 
* mapx now supports variable length reads and reads longer than
189nt. Bear in mind that, because mapx currently performs global
alignment, longer reads will be less likely to have a high scoring
match - you may need to adjust alignment thresholds appropriately.
 
* The snp module for calling SNPs, MNPs, and indels now supports
haploid calling, and is faster (almost 2x faster for Complete
Genomics data).
 
* End to end handling of sex chromosomes in human variant
calling. After creating a one-off chromosome specification file for
your reference genome, mapping and variant calling commands allow
you to specify the sex of each individual being processed.
 
* Improved SNP calling accuracy for Ion Torrent, largely as a result
of better handling of large indels during initial mapping and
realignment during variant calling.
 
* New somatic variant caller (commercial licensees only). As with the
singleton variant caller, this module is also able to utilize the
chromosome specification file to automatically produce appropriate
haploid/diploid calling on sex chromosomes.
 
* New pedigree-aware family variant caller (commercial licensees
only). This caller performs joint calling of all members of a family
(mother, father, and any number of sons/daughters). This
particularly improves the accuracy of variant calling when coverage
of each individual is low. As with the singleton caller, this module
is also able to utilize the chromosome specification file to
automatically produce appropriate haploid/diploid calls on sex
chromosomes.
 
Changes by command:
 
* family/somatic: These modules now implement complex calling and have
had many other improvements.
 
* family: Now produces a QUAL score.
 
* mapx: Much improved handling of variable length read sets -
previously read sets with more than a few nt deviation in length
were not supported (if attempted, mapping performance would degrade
with shorter reads). Variable length reads are now fully supported.
 
* mapx: Initial support for reads longer than 189nt.
 
* mapx: Handling of the --max-alignment-score for percentage based
thresholds was incorrect in that it was calculated based on the
pre-translated read length. This is now fixed and the flag
description has been updated.
 
* snp: Improvements to Ion Torrent snp calling (as determined by the
read group platform field being set to IONTORRENT).
 
* snp: Added new flag --ploidy to allow specifying whether to perform
haploid or diploid variant calling.
 
* snp: Switched to a new internal architecture to more readily allow
multithreading. There is no longer a limit on the number of input
SAM files; however, input files must now be tabix-indexed SAM (or
indexed BAM).
 
* snp: Fixed the handling of calling near boundaries of user-specified
--region locations (previously mappings overlapping the region
border were not being supplied to the snp caller).
 
* snp: CG snp calling speed is approximately 2x faster.
 
* map/snp/somatic/family: Added support for sex specific mapping and
variant calling by defining a reference configuration file and using
the appropriate --sex flag during mapping and snp calling. See the
user manual for more details.
 
* sdfstats: New option --sex to list the reference sequences along
with their ploidy for each sex.
 
* map/snp/somatic/family: Improvements have been made to the
calibration files produced during mapping to allow snp calling
coverage filters to handle coverage variations per sequence (e.g.
due to varying ploidy on sex chromosomes). You can generate new
calibration files for existing mappings with the calibrate module.
 
* snp: New VCF filter RCEQUIV denotes when a variant is equivalent to
a previous variant (these typically occur at either end of
homopolymer regions).
 
* snp: New output file regions.bed.gz containing extra information
regarding the calling. Currently it lists the regions that were
called using complex calling.
 
* snp: QUAL scores for extremely confident calls were being capped at
1000000, but the cap was also being applied to all scores above
about 3000. QUAL scores are now more accurately output in the VCF.
 
* The .bz2 decompression library could not handle multi-member
files. This has been extended to support these files.
 
* extract: Fixed a bug when extracting VCF/coverage from a file
containing a single reference when no region was specified.
 
* extract: Bug fix for when the specified region contained an invalid
range.
 
* all: Updated the bundled JVM to 1.6.0_29.
 
* windows: Fixed a problem when RTG was installed to a location
containing spaces in the path name.
 
Release Notes for RTG Investigator 2.3.2 (2011-10-06)
=====================================================
 
The following are the changes in this release:
 
* format: Added the ability to perform base-quality read trimming,
  using BWA-style "best quality sum" length determination. Trimming
  low quality ends off reads can significantly improve the quality of
  Ion Torrent mappings. E.g. --trim-threshold 15.
 
* map/mapf: Improved mapping defaults for Ion Torrent data.
 
* map/mapf/mapx/format: Added the ability to accept input read data in
  SAM/BAM format, by supplying --format sam-se or --format sam-pe, for
  single or paired-end data respectively. The input SAM/BAM file must
  be sorted by queryname.
 
* mapf: reduced memory usage, particularly with large numbers of
  reference sequences.
 
* mapx: add a warning when the selected parameters will result in a
  large number of indexes, and are thus likely to give poor speed.
 
* coverage: fix an exception when encountering third-party SAM records
  with IH attribute set to 0 and NH greater than 0.
 
* sam2bam: this is a new module that specifically converts
  coordinate-sorted SAM to BAM.
 
* sammerge: updated the default behaviour to not perform filtering of
  records marked as unmapped or PCR duplicates (the flag
  --include-unmapped has been replaced by --exclude-unmapped, and the
  flag --include-duplicates has been replaced by --exclude-duplicates).
 
* sammerge: when the output file ends in ".bam", sammerge will produce
  BAM rather than SAM.
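
For readers unfamiliar with the BWA-style "best quality sum" trimming
mentioned for the format command above, the following Python sketch
illustrates the general idea. It is modeled on the approach used by
BWA and is not taken from RTG itself. It trims only the 3' end, and
the function name, the min_length parameter and the example quality
values are assumptions made for illustration.

```python
def trimmed_length(quals, threshold, min_length=0):
    """Return how many leading bases to keep after 3' quality trimming.

    quals is a list of integer base qualities (hypothetical input).
    Walking back from the 3' end, accumulate (threshold - quality) and
    cut at the position where that running sum is largest.
    """
    running = 0
    best = 0
    keep = len(quals)  # default: keep the whole read
    for pos in range(len(quals) - 1, min_length - 1, -1):
        running += threshold - quals[pos]
        if running < 0:
            # Enough good-quality bases seen; stop scanning.
            break
        if running > best:
            best = running
            keep = pos  # bases at index pos and beyond are trimmed
    return keep


# A read whose last three bases are of low quality, threshold 15.
print(trimmed_length([30, 30, 30, 30, 10, 5, 2], 15))  # 4
```

A read made up entirely of high quality bases is returned at full
length, because the running sum goes negative at the first step.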
 
 
NOTE: When performing snp calling with --region on a partial
chromosome, you should currently enlarge your region by a read length
on each end to ensure all supporting evidence is seen near the
boundaries. This will be addressed in a subsequent release.
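
As a rough illustration of the enlargement described in the note
above, the following Python sketch pads a region by one read length
on each side. The function name is hypothetical, and the use of
1-based inclusive coordinates and the clamping to the sequence length
are assumptions; check the user manual for the exact --region syntax
before building a command line from these values.

```python
def pad_region(start, end, read_length, sequence_length=None):
    """Enlarge a 1-based inclusive region by read_length on each end.

    The start is clamped to 1. If sequence_length is given, the end is
    clamped to it as well.
    """
    padded_start = max(1, start - read_length)
    padded_end = end + read_length
    if sequence_length is not None:
        padded_end = min(sequence_length, padded_end)
    return padded_start, padded_end


# Example: a 100nt read length around positions 1000 to 2000.
print(pad_region(1000, 2000, 100))  # (900, 2100)
```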
 
 
Previous releases:
 
RTG Investigator 2.3.1 (2011-09-12)
-----------------------------------
 
* map/cgmap: SAM flag 0x100 (alignment is secondary) is now set for
  all non-uniquely mapped/mated records.
 
* map/cgmap: SAM flag 0x8 (mate is unmapped) in unmated and unmapped
  SAM files now indicates whether the mate is globally unmapped
  (however, mate position information is not available in these
  records). Previously this flag was always unset in order to avoid
  Picard warnings about not having position information supplied,
  however the SAM spec allows mate position to be unspecified and the
  information in the flag is useful nonetheless. These warnings will
  now be seen if you run the Picard validation tools.
 
* map: fix exception when using --top-random option.
 
* all: allow '=' in sequence names as long as it is not the first
  character.
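
For downstream scripts that need to interpret the two SAM flag bits
discussed above, a minimal Python sketch follows. The bit values
0x100 and 0x8 are those named in these notes; the constant and
function names are made up for illustration.

```python
SECONDARY = 0x100     # alignment is secondary
MATE_UNMAPPED = 0x8   # mate is unmapped


def describe_flag(flag):
    """Report whether the two bits discussed above are set in a FLAG."""
    return {
        "secondary": bool(flag & SECONDARY),
        "mate_unmapped": bool(flag & MATE_UNMAPPED),
    }


# 264 is 0x100 + 0x8, so both bits are set.
print(describe_flag(264))
```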
 
 
RTG Investigator 2.3 (2011-08-31)
---------------------------------
 
* cgmap: switch to a new aligner implementation that produces better
  alignments and results in a 20-30% improvement in execution
  time. The SAM extended attributes GC/GS/GQ containing CG specific
  information have been replaced by more expressive attributes
  XU/XR/XQ. See the user manual for more details.
 
* map/snp: Initial Ion Torrent support. Specifying the IONTORRENT
  platform in the read group information during mapping will alter
  default alignment penalties and thresholds to better handle the Ion
  Torrent indels and will propagate through variant calling.
 
* snp: ambiguity ratio (AR) and allele balance (AB) have been added to
  FORMAT output in VCF. Calls that are made using the complex
  realigning caller are now indicated as such with an XRX annotation.
 
* snp: summary statistics have been updated to contain more useful
  information in a more readable presentation.
 
* snp: removed --output-second flag which was a hangover from a
  previous output format and did not affect the VCF produced.
 
* many commands: now support reading .bz2 compressed FASTA/FASTQ
  files.
 
* mapx: now supports direct loading of reads from FASTA/FASTQ.
 
* coverage/species: now includes sequence lengths in output.
 
* coverage: produces additional coverage information regarding non-N
  regions.
 
* map/mapf: performance and memory improvements when mapping against
  very large numbers of reference sequences.
 
* map/cgmap/mapf: the value supplied to the --sam-rg flag may now be
  either the name of a file containing the read group information, or
  a string containing the read group information itself (tabs must be
  represented by the sequence \t rather than literal tab characters,
  see the documentation for more information).
 
* sdfsplit: now uses a disk-based SDF reader by default. The new
  --in-memory flag enables the older method (for faster processing
  if sufficient RAM is available).
 
* format: added the --allow-duplicate-names flag to disable the
  duplicate sequence name detection (this can save large amounts of
  memory when formatting extremely large datasets).
 
* sdfsplit: renamed the --disable-dupe-detection flag to
  --allow-duplicate-names for consistency with format.
 
* rtg wrapper script: rtg and the java that gets invoked now share the
  same unix process group so that signal handling works as expected
  within cluster scenarios.
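
To illustrate the string form of the --sam-rg value described above,
the following Python sketch shows how a read group written with the
two-character sequence \t might be expanded and split into fields.
This is not RTG's own parsing code. The function name and the example
ID and SM values are assumptions, and the sketch assumes the string
starts with @RG as in a SAM header line.

```python
def parse_read_group(value):
    """Split a read group string that uses the sequence \\t as separator.

    Returns a dict mapping each two-letter tag to its value.
    """
    # Turn each backslash-t pair into a real tab character.
    line = value.replace("\\t", "\t")
    fields = line.split("\t")
    if fields[0] != "@RG":
        raise ValueError("read group string must start with @RG")
    return dict(field.split(":", 1) for field in fields[1:])


# The doubled backslashes are Python escaping; the string itself
# contains a single backslash followed by the letter t.
example = "@RG\\tID:rg1\\tSM:sample1\\tPL:IONTORRENT"
print(parse_read_group(example))
```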
 
 
RTG Investigator 2.2.1 (2011-07-14)
-----------------------------------
 
* mapx: fixed an overflow problem when the number of reads times the
  --max-top-results setting exceeded Integer.MAX_VALUE (2^31-1).
 
* rtg wrapper script: added safety checks for malformed cfg files (for
  example, it is easy to forget to include quotes when a property
  needs spaces). Also, the default rtg.cfg sets RTG_JAVA_OPTS to
  disable the JVM use of the popcount instruction until Oracle bug
  #7063674 is fixed.
 
* many commands: included a workaround for a bug in gzip decompression
  that is present in many recent versions of the JRE. This allows us
  to include a no-JRE distributable, so we can now officially support
  MacOSX as a platform.
 
* EULA: permit investigators to use for evaluation; registration
  overview; non-competitive use only.
 
* snp/coverage: when supplying lists of SAM files via
  --input-list-file, the list files are now tolerant of extra
  whitespace surrounding the filenames and empty lines. Lines starting
  with the hash character '#' are now treated as comments and are
  ignored.
 
* map/cgmap: RTG mated SAM files contain records in pairs, but in very
  heavy repeat regions this would occasionally be violated and the
  resulting SAM file would contain a SAM record for one arm but not
  the other. This is now fixed.
 
* map/cgmap/mapf: Fixed rare crash that could occur when running
  map/cgmap with --all-hits option, or mapf.
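
The list file rules described above for --input-list-file can be
mimicked when preparing or checking such files in your own scripts.
The Python sketch below is an illustration only, not RTG's
implementation. The function name and file names are hypothetical,
and it assumes a comment is recognised after leading whitespace has
been removed.

```python
def read_file_list(lines):
    """Return the file names from an iterable of list file lines.

    Surrounding whitespace is removed, and empty lines and lines
    starting with '#' are skipped.
    """
    names = []
    for line in lines:
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        names.append(entry)
    return names


example = ["  a.sam.gz  \n", "\n", "# second batch\n", "b.sam.gz\n"]
print(read_file_list(example))  # ['a.sam.gz', 'b.sam.gz']
```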
 
 
RTG Investigator 2.2 (2011-06-08)
---------------------------------
 
Initial public release.
 
NOTE: Non-deterministic mapping results have been observed on modern
      CPUs with Java versions 1.6.0_18 and newer due to a bug in the
      use of the popcount instruction. If your CPU implements SSE4
      instructions, we recommend adding -XX:-UsePopCountInstruction to
      the RTG_JAVA_OPTS configuration setting to work around this. We
      have filed a bug with Oracle regarding this
      (http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7063674) but
      there is currently no resolution.