Partek Flow Documentation


In ChIP-seq or ATAC-seq analysis, a major challenge after detecting enriched regions or peaks is to compare samples and identify differentially enriched regions. In order to compare samples, a common set of regions must be identified and the number of reads mapping to each region must be quantified. The Quantify regions task addresses this challenge by generating a union set of unique regions and reporting the number of reads from each sample mapping to each region.

To run Quantify regions:

  • Click a Peaks data node
  • Click the Quantification section in the toolbox
  • Click Quantify regions 

Quantify regions method

The Quantify regions task takes MACS2 results as input. In a typical ATAC-Seq or ChIP-Seq analysis, MACS2 is configured to output a set of enriched regions or peaks for each experimental sample or group individually. Quantify regions takes these sets of regions and identifies a union set of unique regions that it saves as a .bed file. Where regions from different samples or groups overlap, it combines the break points of the regions across all samples and splits the overlapping regions at those points. Each break point is either a start or stop location of a region.

For example, consider an experiment where MACS2 detected enriched regions for two samples, Sample A and Sample B. In Sample A, a region is detected on chromosome 1 from 100bp to 300bp, noted as chr1:100-300. In Sample B, a region is detected at chr1:160-360. The Quantify regions task will give the following union set of unique regions:

  • chr1:100-160 (region detected in Sample A only)
  • chr1:160-300 (region detected in both Sample A and Sample B)
  • chr1:300-360 (region detected in Sample B only)
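The break-point idea can be illustrated with a minimal Python sketch. It is not Partek Flow's implementation; the function name `union_unique_regions` and the input layout (a dictionary of sample name to a list of `(chrom, start, end)` tuples) are assumptions made for illustration. It reproduces the three regions obtained from chr1:100-300 in Sample A and chr1:160-360 in Sample B.

```python
from collections import defaultdict


def union_unique_regions(peaks_by_sample):
    """Split overlapping peaks at every start/stop break point.

    peaks_by_sample: {sample_name: [(chrom, start, end), ...]} (hypothetical layout).
    Returns a list of (chrom, start, end, [samples covering the segment]).
    """
    by_chrom = defaultdict(list)
    for sample, peaks in peaks_by_sample.items():
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end, sample))

    regions = []
    for chrom in sorted(by_chrom):
        intervals = by_chrom[chrom]
        # Every start or stop position of any peak is a break point.
        breaks = sorted({pos for start, end, _ in intervals for pos in (start, end)})
        for left, right in zip(breaks, breaks[1:]):
            # Samples with a peak that fully covers this segment.
            samples = sorted(
                {s for start, end, s in intervals if start <= left and right <= end}
            )
            if samples:  # skip gaps that no peak covers
                regions.append((chrom, left, right, samples))
    return regions


peaks = {
    "Sample A": [("chr1", 100, 300)],
    "Sample B": [("chr1", 160, 360)],
}
for chrom, start, end, samples in union_unique_regions(peaks):
    print(f"{chrom}:{start}-{end}", samples)
```

Segments that no peak covers are dropped, so only regions detected in at least one sample appear in the result.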

 


After generating a .bed file with the union set of unique regions, Quantify regions performs quantification using the same algorithm as Quantify to annotation model (Partek E/M), with the .bed file as the annotation model.

Configuring Quantify regions

The Quantify regions dialog includes configuration options for generating the union set of unique regions and for quantifying reads to the regions (Figure 1).

When regions from multiple samples are combined, a small offset in position between enriched regions in different samples can result in many very short unique regions in the union set. The Minimum region size option lets you filter out these very short regions. If a region is smaller than the specified cutoff, the region is excluded. By default, the cutoff is set to 50bp, but it may need to be adjusted depending on the size of regions you expect to see in your assay.
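The effect of the Minimum region size option can be sketched as a simple length filter. This is an illustrative sketch only: the function name `filter_short_regions` is hypothetical, and measuring region size as `end - start` is an assumption about how the cutoff is applied. The example shows two peaks offset by a few bases, chr1:100-300 and chr1:105-310, after they have been split at their break points.

```python
def filter_short_regions(regions, min_size=50):
    """Keep regions whose length in bp is at least min_size.

    Region size is taken as end - start (an assumption for this sketch).
    """
    return [
        (chrom, start, end)
        for chrom, start, end in regions
        if end - start >= min_size
    ]


# Segments from chr1:100-300 and chr1:105-310 split at their break points.
segments = [("chr1", 100, 105), ("chr1", 105, 300), ("chr1", 300, 310)]
kept = filter_short_regions(segments, min_size=50)
print(kept)  # [('chr1', 105, 300)]
```

The 5bp and 10bp fragments created by the offset are removed, leaving only the region shared by both peaks.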

Quantification options are the same as in the Quantify to annotation model (Partek E/M) dialog. The Percent of read length is set to 50% by default to account for small offsets in position between enriched regions in different samples.
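As a rough illustration of a percent-of-read-length threshold, the sketch below checks whether enough of a read falls inside a region. It assumes the option means that at least the given percentage of a read's bases must overlap the region. That interpretation, the function names, and the half-open coordinates are all assumptions, and the actual Partek E/M quantification involves more than this overlap test.

```python
def overlap_fraction(read_start, read_end, region_start, region_end):
    """Fraction of the read's length inside the region (half-open coordinates)."""
    overlap = max(0, min(read_end, region_end) - max(read_start, region_start))
    return overlap / (read_end - read_start)


def read_counts_toward(read, region, min_percent=50):
    """True if at least min_percent of the read overlaps the region."""
    return overlap_fraction(*read, *region) * 100 >= min_percent


region = (160, 300)
print(read_counts_toward((130, 180), region))  # 20 of 50 bases overlap -> False
print(read_counts_toward((140, 190), region))  # 30 of 50 bases overlap -> True
```

Under this reading, a 50% threshold lets a read that hangs partly over a region boundary still be assigned to the region, which is why a lower threshold tolerates small positional offsets.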

 

Figure 1. Quantify regions dialog

Quantify regions output

Quantify regions generates a Region counts data node with the number of counts of each region for each sample. This data node can be annotated with gene information using the Annotate regions task and analyzed using tasks that take counts data as input, such as normalization, PCA, and differential analysis. 

Similar to the Quantify to annotation model (Partek E/M) task report, the Quantify regions task report includes feature distribution information: a descriptive stats table, a distribution bar chart, a sample box plot, and a sample histogram (Figure 2).

Figure 2. Quantify regions task report

To download the .bed file with the union set of unique regions, click the Quantify regions task node, click Task details, click the regions.bed file in the Output files section, and click Download.
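Once downloaded, the file can be inspected outside Partek Flow. The sketch below reads the first three columns of a standard tab-separated BED file (chromosome, start, end). The helper name `read_bed` and the sample content are hypothetical, and any additional columns in the actual regions.bed file are not assumed here.

```python
import io


def read_bed(handle):
    """Parse chrom, start, end from a tab-separated BED file handle."""
    regions = []
    for line in handle:
        line = line.strip()
        # Skip blank lines, comments, and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, *_ = line.split("\t")
        regions.append((chrom, int(start), int(end)))
    return regions


# Stand-in for the downloaded file; the content is made up for illustration.
example = io.StringIO("chr1\t100\t160\nchr1\t160\t300\nchr1\t300\t360\n")
regions = read_bed(example)
for chrom, start, end in regions:
    print(chrom, start, end, end - start)
```

For a real file, pass an open file handle instead, for example `with open("regions.bed") as handle: regions = read_bed(handle)`.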

References

  1. Xing Y, Yu T, Wu YN, Roy M, Kim J, Lee C. An expectation-maximization algorithm for probabilistic reconstructions of full-length isoforms from splice graphs. Nucleic Acids Res. 2006; 34(10):3150-60.

