Partek Flow Documentation

The Peak calling task is used to detect enriched genomic regions on reads generated from nucleic acid enrichment experiments such as ChIP-seq, DNase-seq, and MeDIP-seq. Partek Flow provides the widely used MACS2 method (model-based analysis of ChIP-seq) [1] (http://liulab.dfci.harvard.edu/MACS/) to find peaks. It can be performed with or without a control sample.

MACS2 is used to demonstrate the task setup in this manual. However, Flow also provides the MACS3 task, which has the same interface. If you would like to make the switch, please talk to the Flow Admin or the tech support team.

MACS2 dialog

Selecting MACS2 from the context-sensitive menu will bring up the MACS2 task dialog. The interface will appear differently depending on the input aligned data node and whether there are sample attributes available in the Data tab.

If the selected aligned data node was imported, the reference assembly used during alignment needs to be specified. Choose the Assembly from the drop-down list within the MACS2 dialog (Figure 1). If the selected aligned data node was generated by Partek Flow, this option will not appear.
 

Figure 1. MACS2 dialog: manually add ChIP vs control pairs for peak detection



Effective genome size must be configured prior to running the peak caller. It refers to the size of the genomic regions that are actually mappable. This size is smaller than the actual size of an organism's whole genome because of the presence of repetitive features. It is typically about 70%-90% of the whole genome size. The MACS2 authors [1] have recommended presets available for four different species. Select from the drop-down menu the preset that best describes the genome you are working with. They are as follows:

  • Human (Homo sapiens), hs – 2.7 × 10^9
  • Mouse (Mus musculus), mm – 1.87 × 10^9
  • C. elegans, ce – 9 × 10^7
  • Fruit fly (Drosophila melanogaster), dm – 1.2 × 10^8

If none of these presets match your genome of interest, select Other... Then enter the effective genome size (Figure 2). The values are in base pairs (bps). Consult the MACS documentation for guidance on selecting the best effective genome size for your experiment.
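The choice between a preset and a manually entered value can be sketched in Python as follows. This is only an illustration of the logic described above, not Partek Flow's implementation: the function names, the "other" keyword, and the `other_size_bp` parameter are assumptions made for this example. The preset values are the ones quoted in this manual.

```python
# Effective genome size presets quoted in this manual (values in base pairs).
PRESET_GENOME_SIZES = {
    "hs": 2.7e9,   # human
    "mm": 1.87e9,  # mouse
    "ce": 9e7,     # C. elegans
    "dm": 1.2e8,   # fruit fly
}


def resolve_genome_size(choice, other_size_bp=None):
    """Return the effective genome size in bp for a preset or a custom value.

    `choice` is a preset code or the string "other" (hypothetical names used
    for illustration); `other_size_bp` is only consulted for "other".
    """
    if choice in PRESET_GENOME_SIZES:
        return PRESET_GENOME_SIZES[choice]
    if choice == "other":
        if other_size_bp is None or other_size_bp <= 0:
            raise ValueError("a positive size in bp is required for 'other'")
        return float(other_size_bp)
    raise ValueError(f"unknown genome size choice: {choice!r}")


def mappable_fraction(effective_size_bp, whole_genome_size_bp):
    """Fraction of the whole genome covered by the effective genome size."""
    return effective_size_bp / whole_genome_size_bp
```

For a genome without a preset, `mappable_fraction` gives a quick sanity check: a custom value that falls far outside the typical 70%-90% range is worth double-checking against the MACS documentation.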


Figure 2. Specify the effective genome size for other species by manually typing in the value



For data where no sample attributes are specified, the peak detection pairs need to be manually defined. In the example in Figure 1, there are only two samples. Under the Define pairs section, the left panel lists all the sample names uploaded to the project (H3K27 and Mock). Add one pair at a time by dragging the corresponding samples to either the IP panel on the top-right or the Control panel on the bottom-right. If no control samples are present in the experiment, leave the Control panel blank. If more than one ChIP or Control sample is added, the samples will be combined (or pooled) during the analysis. After defining a pair, click the Add pair button.
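For readers familiar with the standalone MACS2 command line, a manually defined pair corresponds roughly to one `macs2 callpeak` run. The sketch below assembles such a command in Python without executing it. How Partek Flow invokes MACS2 internally is not described in this manual, so treat this as an assumption-laden illustration: the helper name `build_macs2_command` and the file names are hypothetical, while `-t`, `-c`, `-g`, and `-n` are standard `macs2 callpeak` options.

```python
def build_macs2_command(ip_files, control_files, genome_size, name):
    """Assemble a `macs2 callpeak` argument list for one IP vs control pair.

    Several files listed after -t (or -c) are pooled by MACS2, which mirrors
    adding more than one sample to the IP (or Control) panel.
    """
    if not ip_files:
        raise ValueError("at least one IP file is required")
    cmd = ["macs2", "callpeak", "-t", *ip_files]
    if control_files:
        # The control is optional; omit -c when the Control panel is blank.
        cmd += ["-c", *control_files]
    cmd += ["-g", str(genome_size), "-n", name]
    return cmd


# Hypothetical file names matching the two-sample example (H3K27 and Mock).
example = build_macs2_command(["H3K27.bam"], ["Mock.bam"], "hs", "H3K27_vs_Mock")
print(" ".join(example))
```

Returning a list rather than a single string keeps each argument separate, so the result can be passed to `subprocess.run` without shell quoting issues.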


For data where the sample attributes are defined (Figure 3), you will have an additional option to add pairs based on the sample attributes. Figure 3 shows an example dataset with 4 samples and 2 time points, with one IP sample and one Input sample at each time point.


Figure 3. Example experiment data illustrating samples with two attributes: IP and Time



When running the MACS2 task, the default option is to use sample attributes to define multiple pairs at once (Figure 4). There is an IP-Input pair for each time point, so the Pair attribute is the Time attribute. The Control attribute is the attribute that differentiates between the Input and IP groups, and in this example, it is the ChIP attribute. Finally, the Control term is labeled as Input in the example.


Figure 4. Specify IP vs control pairs based on sample attributes


Click Generate pairs and the two pairs will be automatically added to the Pairs table at once (Figure 5).



Figure 5. Two IP vs input sample pairs are added in the Pairs table

If multiple pairs are added in the Pairs table, the peak detection is performed on each pair independently.
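The attribute-based pairing described above can be sketched in Python. This is a minimal illustration of the grouping logic, not Partek Flow's code: the function name `generate_pairs`, the sample names, and the time point labels are hypothetical, while the attribute names (Time, ChIP) and the control term (Input) follow the example in this section.

```python
from collections import defaultdict

# Hypothetical sample table: 4 samples, 2 time points, one IP and one Input each.
EXAMPLE_SAMPLES = [
    {"name": "IP_t1", "ChIP": "IP", "Time": "t1"},
    {"name": "Input_t1", "ChIP": "Input", "Time": "t1"},
    {"name": "IP_t2", "ChIP": "IP", "Time": "t2"},
    {"name": "Input_t2", "ChIP": "Input", "Time": "t2"},
]


def generate_pairs(samples, pair_attribute, control_attribute, control_term):
    """Group samples by the pair attribute and split each group into IP and control.

    A sample counts as a control when its control attribute equals the control
    term; every other sample in the group is treated as IP.
    """
    groups = defaultdict(lambda: {"ip": [], "control": []})
    for sample in samples:
        role = "control" if sample[control_attribute] == control_term else "ip"
        groups[sample[pair_attribute]][role].append(sample["name"])
    # Keep only groups that contain at least one IP sample.
    return [
        {"pair": key, "ip": group["ip"], "control": group["control"]}
        for key, group in groups.items()
        if group["ip"]
    ]


for pair in generate_pairs(EXAMPLE_SAMPLES, "Time", "ChIP", "Input"):
    print(pair)
```

Each resulting entry corresponds to one row of the Pairs table, and peak detection would then run once per entry.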

 


Peaks report

In the task report, each pair will generate a list of peaks displayed in a table (Figure 6). Use the drop-down menu next to Peaks detected for... to select the pair.


Figure 6. Peaks report on each IP vs control pair

In the report table, each row is a peak. Besides the genomic location of the peak region, the table includes the following information:

  • Absolute summit: base pair location of the peak summit
  • Pileup: pileup height at the peak summit
  • -log10(pvalue): negative log10 p-value for the peak summit
  • Fold enrichment: fold enrichment for the peak summit against a random Poisson distribution with local lambda
  • -log10(qvalue): negative log10 q-value at the peak summit

...

  • a peak name generated by the MACS2 algorithm

Click the browse to peak button to invoke chromosome view and zoom into that location.

Click the Download button at the lower-right corner to download the peaks in a text file.
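Because the report stores significance as -log10 values, larger numbers mean stronger evidence. The sketch below shows how these scores relate to ordinary p-values and q-values and how a list of peaks could be filtered by a q-value cutoff. The layout of the downloaded text file is not described here, so the dictionary key `neg_log10_qvalue`, the peak names, and the 0.05 threshold are assumptions chosen only for illustration.

```python
import math


def from_neg_log10(score):
    """Convert a -log10(p) or -log10(q) score back to a probability."""
    return 10.0 ** (-score)


def to_neg_log10(probability):
    """Convert a p-value or q-value to its -log10 score."""
    return -math.log10(probability)


def filter_peaks(peaks, max_qvalue=0.05, key="neg_log10_qvalue"):
    """Keep peaks whose q-value is at most `max_qvalue`.

    Peaks are plain dicts; `key` names the field holding -log10(qvalue).
    """
    threshold = to_neg_log10(max_qvalue)
    return [peak for peak in peaks if peak[key] >= threshold]


# Hypothetical peaks for illustration only.
peaks = [
    {"name": "peak_a", "neg_log10_qvalue": 1.0},  # q = 0.1
    {"name": "peak_b", "neg_log10_qvalue": 3.0},  # q = 0.001
]
print([peak["name"] for peak in filter_peaks(peaks)])
```

Comparing scores on the -log10 scale avoids converting every row back to a probability and keeps very small q-values from underflowing.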

References

  1. Zhang Y, Liu T, et al. Model-based Analysis of ChIP-Seq (MACS). Genome Biol. 2008;9(9):R137.

