Partek Flow Documentation


For a homozygous genotype, you would expect nearly 100% of the observed bases to match the homozygous allele. For a heterozygous genotype, you would expect to observe nearly 50% for each of the two alleles. The observed base frequencies may deviate slightly from these numbers because of base-calling and alignment errors.

An example of the expected allele probabilities for an error probability of 0.01 is given below. If an error occurs (caused by base calling or mapping), assume that each of the 4 alleles is equally likely to be observed, each with probability $P_{error}$ (this includes alleles compatible with the genotype). $P_{hom}$ is the expected probability of observing the allele matching a homozygous genotype. $P_{het}$ is the probability of observing each of the two alleles of a heterozygous genotype.
$$P_{error} = 0.01 / 4 = 0.0025$$
$$P_{hom} = 1.0 - 3 \cdot P_{error} = 0.9925$$
$$P_{het} = 0.5 - P_{error} = 0.4975$$
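The arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `expected_allele_probabilities` and its `error_rate` parameter are hypothetical and not part of Partek Flow.

```python
def expected_allele_probabilities(error_rate):
    """Return (p_error, p_hom, p_het) for a given overall error probability.

    Hypothetical helper: the error probability is split evenly
    across the 4 possible bases, as described in the text.
    """
    p_error = error_rate / 4       # chance of seeing any one base due to an error
    p_hom = 1.0 - 3 * p_error      # matching allele of a homozygous genotype
    p_het = 0.5 - p_error          # each allele of a heterozygous genotype
    return p_error, p_hom, p_het


p_error, p_hom, p_het = expected_allele_probabilities(0.01)
print(f"{p_error:.4f} {p_hom:.4f} {p_het:.4f}")  # prints "0.0025 0.9925 0.4975"

# In both cases the four per-base probabilities sum to 1 (up to rounding).
hom_total = p_hom + 3 * p_error
het_total = 2 * p_het + 2 * p_error
```

Both totals come out to 1, which confirms that the homozygous and heterozygous cases each describe a complete probability distribution over the four bases.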
The likelihood of a homozygous genotype AA given an observed base frequency $F = \{F_A, F_C, F_G, F_T\}$ can be expressed as:
$$L(AA \mid F, P_{error}) = P_{hom}^{F_A} \cdot P_{error}^{(F_C + F_G + F_T)}$$
The likelihood of a heterozygous genotype CT given an observed base frequency F can be expressed as:
$$L(CT \mid F, P_{error}) = P_{het}^{(F_C + F_T)} \cdot P_{error}^{(F_A + F_G)}$$
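As a rough illustration of the two likelihood expressions, the sketch below evaluates them for any genotype. It assumes that the base frequencies are fractions of the reads at a position (summing to 1) and that they act as exponents on the expected allele probabilities. The names `allele_probabilities` and `genotype_likelihood` are hypothetical and do not come from Partek Flow.

```python
BASES = "ACGT"


def allele_probabilities(genotype, error_rate=0.01):
    """Expected probability of observing each base under a two-letter genotype."""
    p_error = error_rate / 4
    first, second = genotype
    if first == second:
        # Homozygous: the matching allele gets P_hom, every other base P_error.
        return {b: (1.0 - 3 * p_error if b == first else p_error) for b in BASES}
    # Heterozygous: each genotype allele gets P_het, the other two bases P_error.
    return {b: (0.5 - p_error if b in (first, second) else p_error) for b in BASES}


def genotype_likelihood(genotype, freqs, error_rate=0.01):
    """Product over bases of (expected probability) ** (observed frequency)."""
    probs = allele_probabilities(genotype, error_rate)
    likelihood = 1.0
    for base in BASES:
        likelihood *= probs[base] ** freqs.get(base, 0.0)
    return likelihood


# Usage: a position where every read shows A, and one split evenly between C and T.
l_aa = genotype_likelihood("AA", {"A": 1.0})
l_ct = genotype_likelihood("CT", {"C": 0.5, "T": 0.5})
```

With an error probability of 0.01, the first call reduces to the homozygous probability itself and the second to the heterozygous probability, because the bases that were not observed contribute a factor of 1.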
The genotype, G, is assigned using maximum likelihood, and a log (base 10) odds ratio is calculated to aid in sorting.
$$G_{max} = \arg\max_{G} \{ L(G \mid F, P_{error}) \}$$
$$\text{Log Odds} = \log_{10}\left( \frac{L(G_{max} \mid F, P_{error})}{1.0 - L(G_{max} \mid F, P_{error})} \right)$$
If the Log Odds are undefined because of machine numeric representation limitations, then the log odds are capped at 106.
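The genotype assignment and the log odds calculation could be sketched as follows. Several points are assumptions rather than documented behavior: the candidate set is taken to be all 10 unordered pairs of the four bases, the base frequencies are treated as fractions, the error probability must be positive, and the cap is passed in by the caller instead of being hard-coded. The function names are hypothetical.

```python
import math
from itertools import combinations_with_replacement

BASES = "ACGT"


def genotype_likelihood(genotype, freqs, error_rate=0.01):
    """Likelihood of a two-letter genotype given observed base frequencies."""
    p_error = error_rate / 4
    first, second = genotype
    # P_hom for a homozygous genotype, P_het for a heterozygous one.
    p_match = 1.0 - 3 * p_error if first == second else 0.5 - p_error
    likelihood = 1.0
    for base in BASES:
        p = p_match if base in (first, second) else p_error
        likelihood *= p ** freqs.get(base, 0.0)
    return likelihood


def call_genotype(freqs, cap, error_rate=0.01):
    """Return (best genotype, log10 odds), capping the odds when undefined."""
    if error_rate <= 0:
        raise ValueError("error_rate must be positive in this sketch")
    # Assumption: consider every unordered pair of bases (AA, AC, ..., TT).
    genotypes = ["".join(pair) for pair in combinations_with_replacement(BASES, 2)]
    best = max(genotypes, key=lambda g: genotype_likelihood(g, freqs, error_rate))
    l_max = genotype_likelihood(best, freqs, error_rate)
    if l_max >= 1.0:
        # 1.0 - l_max is zero, so the ratio is undefined: fall back to the cap.
        return best, cap
    return best, min(math.log10(l_max / (1.0 - l_max)), cap)


# Usage: all reads show A; the cap is the figure as printed in the text above.
genotype, log_odds = call_genotype({"A": 1.0}, cap=106.0)
```

Passing the cap as an argument keeps the sketch independent of the exact limit used by the software. The undefined case arises when the likelihood rounds to exactly 1.0 in floating point, for example with an extremely small error probability.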