Partek Flow Documentation

...

Setting up the task (Figure 1) involves identifying the Genome Build used for variant detection and the Sample to validate within the project. Target specific regions specifies the target regions for this study, that is, the regions sequenced for all samples in the project. Benchmark target regions specifies the regions that were previously interrogated to identify the “gold standard” variant calls in the sample of interest. These parameters ensure that only overlapping regions are compared, which avoids reporting false positive or false negative variants in regions covered by only the project sample or only the “gold standard” sample. Both sections use a Gene/feature annotation file, which can be previously associated with Partek® Flow® via Library File Management or added on the fly. The Validated variants file is a single-sample VCF file containing the “gold standard” variant calls for the sample of interest; it can be previously associated with Partek® Flow® as a Variant Annotation Database via Library File Management or added on the fly.

 

Figure 1. Task dialog for Validate variants
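The restriction to overlapping regions described above can be illustrated with a small interval intersection. This is a minimal sketch, not Partek® Flow®'s implementation: the function name `intersect_regions`, the `(chromosome, start, end)` tuple layout, and the half-open BED-style coordinates are all assumptions made for the example, and each input list is assumed to contain no internally overlapping intervals.

```python
def intersect_regions(target, benchmark):
    """Return the regions covered by BOTH interval lists.

    Each interval is a hypothetical (chromosome, start, end) tuple with
    half-open coordinates, and intervals within one list do not overlap.
    """
    a = sorted(target)
    b = sorted(benchmark)
    i = j = 0
    shared = []
    while i < len(a) and j < len(b):
        chrom_a, start_a, end_a = a[i]
        chrom_b, start_b, end_b = b[j]
        if chrom_a != chrom_b:
            # Advance the list whose chromosome sorts first.
            if chrom_a < chrom_b:
                i += 1
            else:
                j += 1
            continue
        start, end = max(start_a, start_b), min(end_a, end_b)
        if start < end:
            shared.append((chrom_a, start, end))
        # Advance whichever interval ends first.
        if end_a <= end_b:
            i += 1
        else:
            j += 1
    return shared


# Illustrative coordinates only; they do not come from a real panel.
target_regions = [("chr1", 100, 200), ("chr1", 300, 400), ("chr2", 0, 50)]
benchmark_regions = [("chr1", 150, 350), ("chr2", 60, 80)]
comparable = intersect_regions(target_regions, benchmark_regions)
```

Only variants that fall inside `comparable` would be counted in this sketch, so a call in a region covered by just one of the two files is never scored as a false positive or false negative.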

...

  • No genotypes: the number of missing genotypes from the sample in the Flow project
  • Same as reference: the number of homozygous reference genotypes from the sample in the Flow project 
  • True positives: the number of variant genotypes from the sample in the Flow project that match the validated variants file
  • False positives: the number of variant genotypes from the sample in the Flow project that are not found in the validated variants file
  • True negatives: the number of loci that do not have variant genotypes in either the sample in the Flow project or the validated variants file
  • False negatives: the number of loci that do not have variant genotypes in the sample in the Flow project but do have variant genotypes in the validated variants file
  • Sensitivity: the proportion of variant genotypes in the validated variants file that are correctly identified in the sample in the Flow project (true positive rate)
  • Specificity: the proportion of non-variant loci in the validated variants file that are non-variant in the sample in the Flow project (true negative rate)
  • Precision: the number of true positive calls divided by the number of all variant genotypes called in the sample in the Flow project (positive predictive value)
  • F-measure: a measure of the accuracy of the calling in the Flow pipeline relative to the validated variants. It considers both the precision and the recall (sensitivity) of the test to compute the score. The best value is 1 (perfect precision and recall) and the worst is 0.
  • Matthews correlation: a measure of the quality of classification, taking into account true and false positives and negatives. The Matthews correlation is a correlation coefficient between the observed and predicted classifications, ranging from −1 to +1. A coefficient of +1 represents a perfect prediction, 0 a prediction no better than random, and −1 a completely wrong prediction.
  • Transitions: variant alleles that interchange two purines (A↔G) or two pyrimidines (C↔T) in the sample in the Flow project relative to the reference
  • Transversions: variant alleles that interchange a purine and a pyrimidine in the sample in the Flow project relative to the reference
  • Ti/Tv ratio: the ratio of transitions to transversions in the sample in the Flow project
  • Heterozygous/Homozygous ratio: the ratio of heterozygous to homozygous genotypes in the sample in the Flow project
  • Percentage of sites with depth < 5: the percentage of variant genotypes in the sample in the Flow project that have fewer than 5 supporting reads
  • Depth, 5th percentile: the sequencing depth at or below which 5% of the variant genotypes in the sample in the Flow project fall
  • Depth, 50th percentile: the sequencing depth at or below which 50% of the variant genotypes in the sample in the Flow project fall (the median depth)
  • Depth, 95th percentile: the sequencing depth at or below which 95% of the variant genotypes in the sample in the Flow project fall
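The summary statistics in the list above follow from the four counts using the standard textbook definitions. The sketch below is illustrative only and is not taken from Partek® Flow®: the function names are hypothetical, the F-measure is assumed to be the balanced F1 score (the text does not state a weighting), and the Ti/Tv helper assumes its input contains only single-nucleotide substitutions given as (reference, alternate) base pairs.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def _ratio(numerator, denominator):
    """Divide, returning NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def validation_metrics(tp, fp, tn, fn):
    """Derive the summary statistics from the four confusion-matrix counts."""
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate (recall)
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "precision": _ratio(tp, tp + fp),     # positive predictive value
        # Assumed balanced F1: harmonic mean of precision and recall.
        "f_measure": _ratio(2 * tp, 2 * tp + fp + fn),
        "matthews": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine substitutions."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def ti_tv_ratio(substitutions):
    """Ratio of transitions to transversions among (ref, alt) base pairs."""
    transitions = sum(1 for ref, alt in substitutions if is_transition(ref, alt))
    transversions = len(substitutions) - transitions
    return _ratio(transitions, transversions)


# Made-up counts and substitutions, used only to exercise the functions.
metrics = validation_metrics(tp=90, fp=10, tn=880, fn=20)
ratio = ti_tv_ratio([("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")])
```

Returning NaN for an empty denominator is a design choice for the sketch, so that a sample with no variant calls yields an undefined precision instead of raising an error.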

...