PGS Documentation

...

Regions that contain a binding site for the DNA-binding protein of interest are indicated by enriched sequencing read density. Because each single-end read covers only one end of a larger sequenced fragment, enriched regions generally show two adjacent peaks, one on the forward strand and one on the reverse strand. To merge these peaks, each read is extended in the 3' direction by the effective fragment length, converting reads into estimated fragments. Overlapping estimated fragments are then merged into peaks. For peak detection, the genome is divided into bins of a user-defined size and the number of estimated fragments that fall in each bin is counted. A zero-truncated negative binomial distribution is fitted to the bin counts, and all regions that are significantly enriched at a user-defined false discovery rate (FDR) are called as peaks.
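The extension and merging steps described above can be illustrated with a minimal Python sketch. This is not Partek Genomics Suite code: the names `extend_read` and `merge_overlapping`, the `(start, end, strand)` read representation, and the 0-based half-open coordinates are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical read representation: (start, end, strand) on a single
# chromosome, using 0-based half-open coordinates.
Read = Tuple[int, int, str]
Interval = Tuple[int, int]


def extend_read(read: Read, fragment_length: int) -> Interval:
    """Extend a read in its 3' direction to the estimated fragment length."""
    start, end, strand = read
    if strand == "+":
        # Forward-strand read: the 5' end is `start`, so extend to the right.
        return (start, start + fragment_length)
    # Reverse-strand read: the 5' end is `end`, so extend to the left,
    # clamping at the start of the chromosome.
    return (max(0, end - fragment_length), end)


def merge_overlapping(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping intervals into single regions."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start < merged[-1][1]:
            # Overlaps the previous region: grow that region if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


reads = [(100, 136, "+"), (250, 286, "-"), (1000, 1036, "+")]
fragments = [extend_read(r, 200) for r in reads]
regions = merge_overlapping(fragments)
```

In this example the forward read at 100 and the reverse read ending at 286 extend toward each other, so their estimated fragments overlap and collapse into one region, while the isolated read at 1000 remains a separate region.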

 

Partek Genomics Suite extends each read in the 3' direction by the effective fragment length calculated by Cross Strand-Correlation and merges overlapping extended reads into a single peak. For paired-end reads, the distance between the paired reads is used as the fragment length, and overlapping fragments are merged into peaks. For peak detection, Partek Genomics Suite divides the genome into windows (bins) of a user-defined size and counts the number of fragments that fall within each bin. It then fits a zero-truncated negative binomial distribution to the bin counts and reports all regions that are significantly enriched at a user-defined false discovery rate (FDR). See the ChIP-Seq white paper for more information on the peak-finding algorithm and for tips on setting the Fragment extension and window sizes.
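As a rough illustration of the binning step and of the zero-truncated negative binomial model, the sketch below counts fragments per bin and evaluates the truncated distribution for given parameters. Several things here are assumptions rather than documented behavior: assigning each fragment to the bin that contains its midpoint, the `(r, p)` parameterization, and all function names. Parameter fitting and the FDR correction are not shown.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def count_fragments_per_bin(fragments: List[Tuple[int, int]],
                            bin_size: int) -> Dict[int, int]:
    """Count fragments per bin; a fragment is assigned to the bin holding
    its midpoint (an assumption made for this sketch)."""
    counts: Counter = Counter()
    for start, end in fragments:
        midpoint = (start + end) // 2
        counts[midpoint // bin_size] += 1
    return dict(counts)


def nb_pmf(k: int, r: float, p: float) -> float:
    """Negative binomial pmf with size r and success probability p."""
    log_pmf = (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
               + r * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)


def ztnb_pmf(k: int, r: float, p: float) -> float:
    """Zero-truncated pmf: renormalize after removing the mass at k = 0."""
    if k < 1:
        return 0.0
    return nb_pmf(k, r, p) / (1.0 - p ** r)


def ztnb_upper_tail(k: int, r: float, p: float) -> float:
    """P(X >= k) under the zero-truncated negative binomial."""
    if k <= 1:
        return 1.0
    return 1.0 - sum(ztnb_pmf(j, r, p) for j in range(1, k))


bin_counts = count_fragments_per_bin([(0, 200), (50, 250), (900, 1100)], 100)
tail = ztnb_upper_tail(5, 2.0, 0.5)
```

The zero truncation reflects that only bins containing at least one fragment are observed, so the probability mass at zero is removed and the remainder rescaled. A small upper-tail probability for a bin's count marks that bin as a candidate enriched region, which would then be subject to the FDR threshold.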

...