PGS Documentation

Figure: Results of Peak Detection. The Peaks spreadsheet lists regions with significant peak enrichment, with one row per region.

A few of the column contents merit clarification.

This spreadsheet is sorted by chromosome number and genomic location. Each row represents one genomic region of peak enrichment. The columns are:

Column 1. Chromosome gives the chromosome on which the region is located

Column 2. Start gives the start of the region (inclusive)

Column 3. Stop gives the end of the region (exclusive)

Column 4. Sample ID gives the sample containing the enriched region

Column 5. Interval length gives the length of the region, Stop - Start, in base pairs

Column 6. Maximum Extended Reads in Window gives the greatest number of extended reads in any of the windows of a region

Column 7. Reads per Million (RPM) gives the read count divided by the total number of aligned reads in the sample (in millions). This normalization helps you compare peaks across samples, especially when the number of aligned reads differs greatly between samples.

Column 8. Mann-Whitney p-value identifies the separation between forward and reverse peaks for single-end reads using the Mann-Whitney U-test. Lower p-values indicate better separation. This p-value can be used if there was no control sample or to eliminate regions called due to PCR bias. 

Columns 9-10. Total reads in region gives the total number of non-extended reads for each sample in the given genomic region. There is one column for each sample.

Column 11. p-value (Sample ID vs. mock) compares the sample specified in column 4 to the reference sample for this genomic region using a one-tailed binomial test. A low p-value means there are significantly more reads in the sample specified in column 4 than in the mock sample. This column is only included if a reference sample is specified.

Column 12. Scaled fold change (Sample ID vs. mock) compares the signal intensity of the sample specified in column 4 to that of the reference sample in the given genomic region. The fold change is scaled by the ratio of the number of reads for each sample (IP vs. control) on a per-chromosome basis. Scaled fold changes >1 indicate more enrichment in the IP sample than in the control sample. This column is only included if a reference sample is specified.

Columns 13-14. <Sample ID> overlap percent gives the percentage of the given genomic region that overlaps a called peak region from the indicated sample. For example, values of 100% in column 13 and 0% in column 14 indicate regions detected in the ChIP sample but not in the mock sample. Similarly, regions with a value of 100% in column 14 were detected in the mock sample.
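The arithmetic behind several of these columns can be illustrated with a short Python sketch. This is not the software's own code: the function names are hypothetical, the read count passed to the RPM calculation is left as a parameter because the exact count being normalized is not stated above, and the null probability for the one-tailed binomial test is supplied by the caller as an assumption.

```python
from math import comb


def interval_length(start: int, stop: int) -> int:
    """Length in base pairs of a region with inclusive start and exclusive stop."""
    return stop - start


def reads_per_million(read_count: int, total_aligned_reads: int) -> float:
    """Normalize a read count by the sample's aligned reads, in millions."""
    return read_count / (total_aligned_reads / 1_000_000)


def overlap_percent(region: tuple, peak: tuple) -> float:
    """Percentage of `region` covered by `peak`; both are (start, stop) pairs."""
    start, stop = region
    peak_start, peak_stop = peak
    shared = max(0, min(stop, peak_stop) - max(start, peak_start))
    return 100.0 * shared / (stop - start)


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """One-tailed binomial p-value: P(X >= k) for X ~ Binomial(n, p).

    Assumption: k is the read count in the sample of interest, n is the
    combined count (sample plus mock), and p is the expected share of reads
    from the sample under the null hypothesis. How the software chooses p
    is not described in this document.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    print(interval_length(100, 250))
    print(reads_per_million(50, 20_000_000))
    print(overlap_percent((100, 200), (150, 300)))
    print(binomial_upper_tail(3, 4, 0.5))
```

Because Stop is exclusive, the interval length needs no "+1" correction, and the overlap calculation treats peak regions with the same half-open convention.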

 
