PGS Documentation


Binding sites for the DNA-binding protein of interest appear as peaks of enriched sequencing read density in the ChIP sample relative to controls. To identify binding sites, reads must be converted into called peaks. How are peaks calculated from reads in Partek Genomics Suite?

For single-end reads, each read is extended in the 3' direction by the effective fragment length calculated by Cross Strand-Correlation, and overlapping extended reads are merged into single peaks. For paired-end reads, the distance between the two reads of a pair is used as the fragment length, and overlapping fragments are merged into peaks. For peak detection, the genome is divided into windows of a user-defined size, and the number of fragments whose mid-points fall within each window is counted. A model of expected read density (a zero-truncated negative binomial) is then used to determine which peaks are significantly enriched at a user-defined false discovery rate (FDR). See the ChIP-Seq white paper for more information on the peak-finding algorithm and for tips on setting the Fragment extension and window sizes.
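
The single-end steps described above (extension, merging, and mid-point counting) can be illustrated with a short Python sketch. This is not Partek Genomics Suite code. The Read type, the function names, and the coordinate convention (0-based, end-exclusive) are hypothetical and chosen only for illustration. The sketch also assumes that each read is extended from its 5' end to a total length equal to the effective fragment length, and it leaves out the zero-truncated negative binomial model and the FDR filtering.

```python
from collections import Counter
from typing import NamedTuple


class Read(NamedTuple):
    """Hypothetical aligned read: 0-based start, exclusive end, strand."""
    start: int
    end: int
    strand: str  # "+" or "-"


def extend_read(read, fragment_length):
    """Extend a read from its 5' end toward its 3' end.

    Assumption: the resulting fragment has total length fragment_length.
    """
    if read.strand == "+":
        # 5' end is the leftmost coordinate; extend to the right.
        return read.start, read.start + fragment_length
    # Reverse strand: 5' end is the rightmost coordinate; extend to the left.
    return max(0, read.end - fragment_length), read.end


def merge_overlapping(fragments):
    """Merge overlapping (start, end) fragments into candidate peak regions."""
    merged = []
    for start, end in sorted(fragments):
        if merged and start < merged[-1][1]:
            # Overlaps the previous region: grow it if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def window_counts(fragments, window_size):
    """Count fragments whose mid-point falls in each fixed-size window.

    Returns a dict mapping window index -> fragment count.
    """
    counts = Counter()
    for start, end in fragments:
        midpoint = (start + end) // 2
        counts[midpoint // window_size] += 1
    return dict(counts)


# Illustrative reads on one chromosome (made-up coordinates).
reads = [
    Read(100, 136, "+"),
    Read(150, 186, "+"),
    Read(364, 400, "-"),
    Read(1000, 1036, "+"),
]
fragments = [extend_read(r, fragment_length=200) for r in reads]
regions = merge_overlapping(fragments)
counts = window_counts(fragments, window_size=100)
```

With these made-up reads, the first three fragments overlap and collapse into one candidate region, while the fourth stays separate. The per-window counts are the values that a significance model would then test for enrichment.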
