Partek Flow Documentation

Differential Expression Detection

...

  • Differential gene expression (GSA)
  • Differential gene expression (ANOVA)
  • Transcript expression analysis (Cuffdiff) – this option is only available on a Cufflinks quantification data node

...

GSA dialog

...

  • Cell type
  • Time
  • Cell type, Time
  • Cell type, Cell type * Time
  • Time, Cell type * Time
  • Cell type * Time
  • Cell type, Time, Cell type * Time

...

  • Cell type, Cell type * Time
  • Time, Cell type * Time
  • Cell type * Time

...

  • Cell type
  • Cell type, Time
  • Cell type, Time, Cell type * Time

...

  • A vs B (Cell type)
  • 5 vs 0 (Time)

only the following two models will be computed:

  • Cell type, Time
  • Cell type, Time, Cell type * Time

...

  • A vs B (Cell type)
  • 5 vs 0 (Time)
  • A*5 vs B*5 (Cell type * Time)

then only one model will be computed:

  • Cell type, Time, Cell type * Time
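
The relationship between the chosen contrasts and the computed models can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Partek Flow's implementation: the candidate list is the three-model list shown above, and the rule "a model is computed only if it contains every term used by a contrast" is inferred from the two examples. The names CANDIDATE_MODELS and models_for_contrasts are hypothetical.

```python
# Hypothetical sketch: which candidate models can serve a set of contrasts.
# Each model is written as the set of terms it contains.
CANDIDATE_MODELS = [
    {"Cell type"},
    {"Cell type", "Time"},
    {"Cell type", "Time", "Cell type * Time"},
]


def models_for_contrasts(contrast_terms, candidates=CANDIDATE_MODELS):
    """Keep the models that contain every term referenced by a contrast.

    Assumption: this subset rule is inferred from the examples in the text.
    """
    required = set(contrast_terms)
    return [model for model in candidates if required <= model]


# Contrasts A vs B (Cell type) and 5 vs 0 (Time)
two_term = models_for_contrasts(["Cell type", "Time"])

# Adding A*5 vs B*5 (Cell type * Time)
three_term = models_for_contrasts(["Cell type", "Time", "Cell type * Time"])

print(len(two_term), len(three_term))
```

With the first pair of contrasts two models remain; adding the interaction contrast leaves only the full model, matching the lists above.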

...

  • If invoked from a Partek E/M method output, the data node contains raw read counts and the default normalization is:
    • Normalize to total count (RPM)
    • Add 0.0001 (offset)
  • If invoked from a Cufflinks method output, the data node contains FPKM and the default normalization is:
    • Add 0.0001 (offset)
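
As a rough illustration of these defaults, the Python sketch below applies the two steps in the order listed. It assumes that "Normalize to total count (RPM)" means dividing each count by the sample's total count and multiplying by one million; the function names are hypothetical and this is not Partek Flow's code.

```python
OFFSET = 0.0001  # default offset named in the text


def rpm_with_offset(raw_counts, offset=OFFSET):
    """Raw read counts of one sample -> reads per million, then add the offset.

    Assumption: RPM = count * 1e6 / total count of the sample.
    """
    total = sum(raw_counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return [count * 1_000_000 / total + offset for count in raw_counts]


def fpkm_with_offset(fpkm_values, offset=OFFSET):
    """FPKM values (already scaled) only receive the offset."""
    return [value + offset for value in fpkm_values]


sample = [500, 1500, 0, 8000]          # made-up raw counts for one sample
normalized = rpm_with_offset(sample)
print(normalized)
```

A practical effect of the offset is that features with zero reads end up with a small positive value instead of zero.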

...

  • Lowest average coverage: the computation will exclude a feature if its geometric mean across all samples is below the specified value
  • Lowest maximum coverage: the computation will exclude a feature if its maximum across all samples is below the specified value
  • Minimum coverage: the computation will exclude a feature if its sum across all samples is below the specified value
  • None: include all features in the computation
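
The four filter options can be expressed as a single decision function. The sketch below is a minimal Python illustration, assuming the filter is applied to one feature's values across all samples; the option keys and function names are hypothetical, and treating the geometric mean as zero when any value is zero is an assumption made here, not something the text states.

```python
import math


def geometric_mean(values):
    """Geometric mean; assumed to be 0 if any value is 0 (values are non-negative)."""
    if any(v <= 0 for v in values):
        return 0.0
    return math.exp(sum(math.log(v) for v in values) / len(values))


def exclude_feature(values, method, threshold=0.0):
    """Return True if the feature should be left out of the computation."""
    if method == "none":
        return False                      # None: include all features
    if method == "lowest_average":
        statistic = geometric_mean(values)
    elif method == "lowest_maximum":
        statistic = max(values)
    elif method == "minimum":
        statistic = sum(values)
    else:
        raise ValueError(f"unknown filter method: {method}")
    return statistic < threshold


feature = [1.0, 10.0, 100.0]   # made-up values: geometric mean 10, max 100, sum 111
print(exclude_feature(feature, "lowest_average", 20))   # excluded
print(exclude_feature(feature, "lowest_maximum", 50))   # kept
```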

...

  • Normal
  • Lognormal (the same as the ANOVA task)
  • Lognormal with shrinkage (the same as the limma-trend method [5])
  • Negative binomial
  • Poisson

...

  • Cell type
  • Cell type, Time
  • Cell type, Time, Cell type * Time

...

GSA report

...

  • Feature ID information: if transcript level analysis was performed, and the annotation file has both transcript and gene level information, both gene ID and transcript ID are displayed. Otherwise, the table shows only the available information.
  • Total reads: total number of raw reads across all the samples. Raw reads are retrieved from the quantification data node
  • For each contrast, the report shows the p-value, the FDR step-up p-value, the ratio and fold change in linear scale, and the LSmean of each group in the comparison in linear scale
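
To make the per-contrast columns more concrete, here is a minimal Python sketch of two of them. It assumes that the FDR step-up p-value is the Benjamini-Hochberg step-up adjustment [1], and that fold change is derived from the ratio by reporting ratios below 1 as negative reciprocals (so a ratio of 0.5 becomes -2). Both function names are hypothetical and the fold-change convention is an assumption, not taken from the text.

```python
def fdr_step_up(p_values):
    """Benjamini-Hochberg step-up adjusted p-values, returned in input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


def ratio_to_fold_change(ratio):
    """Assumed convention: ratio >= 1 is kept, ratio < 1 becomes -1/ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1 else -1.0 / ratio


p_values = [0.01, 0.04, 0.03, 0.005]   # made-up p-values for four features
print(fdr_step_up(p_values))
print(ratio_to_fold_change(0.5), ratio_to_fold_change(2.0))
```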

Click Optional columns to display additional annotation, such as genomic location or strand information, if it is available in the annotation model you specified for quantification.
On the right of each contrast header there is a volcano plot icon. Select it to display the volcano plot for the chosen contrast (Figure 11).
Figure 11: Volcano plot for the comparison A vs B. The X-axis represents fold change (linear scale), the Y-axis represents the negative log of the p-value (unadjusted), and each dot is a feature. The horizontal line represents a p-value of 0.05, and the two vertical lines represent fold changes of -2 and 2. The lower-left corner displays the number of features passing the fold-change and p-value criteria
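
The count shown in the lower-left corner of the volcano plot can be reproduced approximately with the Python sketch below. It assumes a base-10 logarithm on the Y-axis and inclusive cutoffs, neither of which is stated in the caption; the function names and the example values are made up for illustration.

```python
import math


def volcano_point(fold_change, p_value):
    """Plot coordinates: x = fold change (linear), y = -log10(p-value)."""
    return fold_change, -math.log10(p_value)


def passes_cutoffs(fold_change, p_value, fc_cutoff=2.0, p_cutoff=0.05):
    """True if the feature lies outside the vertical lines and above the horizontal line."""
    return abs(fold_change) >= fc_cutoff and p_value <= p_cutoff


# (fold change, unadjusted p-value) for four made-up features
features = [(3.1, 0.001), (-2.5, 0.01), (1.4, 0.0001), (-4.0, 0.2)]
n_passing = sum(passes_cutoffs(fc, p) for fc, p in features)
print(n_passing)
```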
The feature list filter panel is on the left of the table (Figure 12). Click the black triangle to collapse or expand the panel.
Select the check box of a field, specify the cutoff, and press Enter to apply the filter. After the filter has been applied, the total number of included features is updated at the top of the panel (Result).
Figure 12: Feature list filter panel
The filtered result can be saved into a filtered data node by selecting the Generate list button at the lower-left corner of the table. Selecting the Download button at the lower-right corner of the table downloads the table as a text file to the local computer.
If the lognormal with shrinkage method was selected for GSA, a shrinkage plot is generated in the report (Figure 13). The plot helps determine the threshold for low-expression features. The X-axis shows the log2 value of average coverage. If there is an increase before a monotone decreasing trend on the left side of the plot, set a higher threshold on the low expression filter. Detailed information on how to set the threshold can be found in the GSA white paper.
Figure 13: Shrinkage plot

Differential gene expression (ANOVA)

...

ANOVA advanced options

...

  • Use only reliable estimation results: there are situations when a model estimation procedure does not fail outright but still encounters some difficulties. In this case, it can still generate p-values and fold changes for the comparisons, but they are not reliable and can be misleading. It is therefore recommended to use only reliable estimation results, so the default for Use only reliable estimation results is Yes
  • Display p-value for effects: when No is chosen, only the p-values of the comparisons are displayed in the report. When Yes is chosen, type III p-values are displayed for all the non-random terms in the model, in addition to the comparisons' p-values

ANOVA report

...

Transcript expression analysis (Cuffdiff)

...

  • Class-fpkm: library size factor is set to 1, no scaling applied to FPKM values
  • Geometric: FPKMs are scaled via the median of the geometric means of the fragment counts across all libraries [7]. This is the default option
  • Quartile: FPKMs are scaled via the ratio of the upper-quartile (75th percentile) fragment count of each library to the average upper-quartile value across all libraries
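
The three options differ only in how a per-library scaling factor is obtained. The Python sketch below shows one common reading of each, under stated assumptions: the geometric option is written as the median-of-ratios size factor described in [7], and the quartile option uses the inclusive percentile definition from Python's statistics module. It is not Cuffdiff's code, and all function names are hypothetical.

```python
import math
from statistics import mean, median, quantiles


def classic_factors(libraries):
    """Class-fpkm: every library gets a factor of 1 (no scaling)."""
    return [1.0 for _ in libraries]


def geometric_factors(libraries):
    """Median over features of count / geometric mean of that feature across libraries.

    libraries: list of per-library fragment-count lists in the same feature order.
    Features with a zero count in any library are skipped (assumption, as in [7]).
    """
    ratios = [[] for _ in libraries]
    for feature_counts in zip(*libraries):
        if min(feature_counts) <= 0:
            continue
        log_mean = sum(math.log(c) for c in feature_counts) / len(feature_counts)
        geo_mean = math.exp(log_mean)
        for lib_index, count in enumerate(feature_counts):
            ratios[lib_index].append(count / geo_mean)
    return [median(r) for r in ratios]


def quartile_factors(libraries):
    """Upper quartile of each library divided by the average upper quartile."""
    upper = [quantiles(lib, n=4, method="inclusive")[2] for lib in libraries]
    average_upper = mean(upper)
    return [u / average_upper for u in upper]


# Made-up fragment counts: library B is sequenced twice as deeply as library A.
lib_a = [10, 20, 30, 40]
lib_b = [20, 40, 60, 80]
geo = geometric_factors([lib_a, lib_b])
quart = quartile_factors([lib_a, lib_b])
print(geo, quart)
```

In this toy example both methods give library B a factor twice that of library A, which is the expected outcome when one library is a uniformly deeper copy of the other.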

...

  • Fr-unstranded: reads from the left-most end of the fragment in transcript coordinates map to the transcript strand, and the right-most end maps to the opposite strand. E.g. standard Illumina
  • Fr-firststrand: reads from the left-most end of the fragment in transcript coordinates map to the transcript strand, and the right-most end maps to the opposite strand. The right-most end of the fragment is the first sequenced or only sequenced for single-end reads. It is assumed that only the strand generated during first strand synthesis is sequenced. E.g. dUTP, NSR, NNSR
  • Fr-secondstrand: reads from the left-most end of the fragment in transcript coordinates map to the transcript strand, and the right-most end maps to the opposite strand. The left-most end of the fragment is the first sequenced or only sequenced for single-end reads. It is assumed that only the strand generated during second strand synthesis is sequenced. E.g. Directional Illumina, standard SOLiD

...

  • NOTEST: not enough alignments for testing
  • LOWDATA: too complex or shallowly sequenced
  • HIGHDATA: too many fragments in locus
  • FAIL: when an ill-conditioned covariance matrix or other numerical exception prevents testing

...

[1] Benjamini, Y., Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. JRSS, B, 57, 289-300.
[2] Storey, J.D. (2003). The positive false discovery rate: a Bayesian interpretation and the q-value. Annals of Statistics, 31, 2013-2035.
[3] Auer (2011). A two-stage Poisson model for testing RNA-Seq.
[4] Burnham, Anderson (2010). Model selection and multimodel inference.
[5] Law, C. (2014). Voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biology, 15:R29.
[6] http://cole-trapnell-lab.github.io/cufflinks/cuffdiff/index.html#cuffdiff-output-files
[7] Anders, S., Huber, W. (2010). Differential expression analysis for sequence count data. Genome Biology.
Last revision: May 24, 2016
Copyright © 2016 by Partek Incorporated. All Rights Reserved. Reproduction of this material without express written consent from Partek Incorporated is strictly prohibited.

Partek Flow's powerful statistical analysis tools help identify differential expression patterns in the dataset. These can take into account a wide variety of data types and experimental designs.
