Partek Flow Documentation

...

When "Lognormal with shrinkage" is enabled, a separate shrinkage plot is displayed for each design (Figure 4). To begin, a lognormal linear model is fitted to each gene separately, and the standard deviations of the residual errors are obtained (green dots in the plot). Applying shrinkage adds two more steps. We look at how the errors change with the average gene expression and estimate the corresponding trend (black curve). Then the original error terms are adjusted (shrunk) towards the trend (red dots). The adjusted error terms are plugged back into the lognormal model to obtain the reported results, such as p-values.

 

Figure 4: Shrinkage plot for a two-group study with four observations per group. Blue arrows show how the error terms for transcripts ERCC-00046 and ERCC-00054 are adjusted up and down, respectively.
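The three steps described above (per-feature fit, trend estimation, adjustment towards the trend) can be illustrated with a minimal Python sketch. This is not GSA's implementation: the function names, the polynomial trend, and the fixed `weight` are assumptions chosen only to make the idea concrete, and the input is assumed to be log-transformed expression with a known design matrix.

```python
import numpy as np

def residual_sd(log_expr, design):
    """Per-feature residual standard deviation from a least-squares fit.

    log_expr: (n_features, n_samples) array of log-transformed expression.
    design:   (n_samples, n_params) design matrix shared by all features.
    """
    coef, *_ = np.linalg.lstsq(design, log_expr.T, rcond=None)
    resid = log_expr.T - design @ coef
    dof = design.shape[0] - np.linalg.matrix_rank(design)
    return np.sqrt((resid ** 2).sum(axis=0) / dof)

def shrink_sd(sd, mean_log_expr, weight=0.5, degree=2):
    """Estimate a trend of SD versus average expression and shrink towards it.

    ASSUMPTION: the trend is a low-degree polynomial in log(SD), and every
    feature is shrunk with the same fixed weight. GSA's actual trend
    estimator and weighting are not described in this section.
    """
    coeffs = np.polyfit(mean_log_expr, np.log(sd), degree)
    trend = np.exp(np.polyval(coeffs, mean_log_expr))
    # Combine on the variance scale: weight=0 keeps the original SD,
    # weight=1 replaces it with the trend value.
    shrunk = np.sqrt(weight * trend ** 2 + (1.0 - weight) * sd ** 2)
    return trend, shrunk
```

In terms of the plot, `sd` plays the role of the green dots, `trend` the black curve, and `shrunk` the red dots. Features whose original SD lies below the trend are adjusted up, and those above it are adjusted down.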

 

...

For instance, in Figure 5 it looks like a threshold of 2 would give us what we want. Since the x axis is on the log2 scale, the corresponding value for "Lowest average coverage" is 2^2 = 4 (Figure 6). After we set the filter that way and rerun GSA, the shrinkage plot takes the required form (Figure 7).

 

Figure 7: After resetting the "Average coverage" threshold to 4 (Figure 6), the left part of the shrinkage plot displays the desirable monotone decreasing trend. Note that the left boundary on the x axis becomes log2(4) = 2.
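The conversion from a cutoff read off the log2-scaled x axis to the value entered in the filter is a single exponentiation. The short Python sketch below shows the arithmetic. The helper names and the sample coverage values are hypothetical, and treating the filter as "keep features at or above the threshold" is an assumption about how the cutoff is applied.

```python
import math

def coverage_from_log2_cutoff(x_log2):
    """Convert a cutoff on the log2 x axis to the original coverage scale."""
    return 2.0 ** x_log2

def filter_by_average_coverage(avg_coverage, threshold):
    """Keep features whose average coverage is at or above the threshold.

    ASSUMPTION: the comparison is inclusive; the exact rule used by the
    "Lowest average coverage" filter is not stated in this section.
    """
    return {name: cov for name, cov in avg_coverage.items() if cov >= threshold}

threshold = coverage_from_log2_cutoff(2)    # cutoff of 2 on the plot
left_boundary = math.log2(threshold)        # new left edge of the x axis

# Hypothetical average coverages for three features.
example = {"feature_a": 1.5, "feature_b": 4.0, "feature_c": 37.2}
kept = filter_by_average_coverage(example, threshold)
```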

...

As for higher-expression features, GSA presently has no automatic method to separate "abnormal" features from "normal" ones, so the user has to inspect the shrinkage plot by eye. However, for the purpose of investigating standalone outliers, GSA can quantify the benefit of shrinkage in a well-grounded way. To do that, enable both Lognormal and Lognormal with shrinkage in Advanced Options (Figure 9).

 

Figure 9: To quantify the benefit of shrinkage for any particular feature, enable these two models in "Custom" mode.

 

Figure 10 contains a pie chart for the dataset whose shrinkage plot is displayed in Figure 4. Because of the small sample size (two groups with four observations each), shrinkage is beneficial overall: for an "average" feature, the Akaike weight of the feature-specific Lognormal model is 25%, whereas that of Lognormal with shrinkage is 75%.

Figure 10: Pie chart for the small-sample dataset from Figure 4
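For readers unfamiliar with Akaike weights, the sketch below shows the standard textbook formula: each model's weight is proportional to exp(-Δ/2), where Δ is its AIC minus the smallest AIC among the candidates. Whether GSA uses plain AIC or a small-sample correction is not stated here, so the function name and the AIC values are assumptions. The values were chosen only so that the output reproduces the 25% / 75% split quoted above.

```python
import numpy as np

def akaike_weights(aic):
    """Akaike weights for a set of candidate models fitted to the same data."""
    aic = np.asarray(aic, dtype=float)
    delta = aic - aic.min()              # difference from the best model
    rel_likelihood = np.exp(-0.5 * delta)
    return rel_likelihood / rel_likelihood.sum()

# Hypothetical AIC values: a gap of 2*ln(3) gives a 1:3 weight ratio.
aic_lognormal = 100.0 + 2.0 * np.log(3.0)
aic_lognormal_shrinkage = 100.0
weights = akaike_weights([aic_lognormal, aic_lognormal_shrinkage])
```

With these illustrative inputs the weights come out as 0.25 for Lognormal and 0.75 for Lognormal with shrinkage. A very large AIC gap in the other direction would drive the shrinkage weight towards zero.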


At the same time, if we look at ERCC-00046 specifically (Figure 11), we see that Lognormal with shrinkage fits so badly that its Akaike weight is virtually zero, despite having fewer parameters than the feature-specific Lognormal model.
Figure 11: For ERCC-00046, the feature-specific Lognormal model is dominant and shrinkage is the least likely to do well.

 

Using multimodel inference appears to be a better alternative to the ad hoc method in DESeq2, which switches shrinkage either fully on or fully off. Once again, it is both technically possible and tempting to automate the handling of abnormal features by enabling both Lognormal models in GSA and applying them to all of the transcripts. Unfortunately, that can make the results less reproducible overall, even though it is likely to yield more accurate conclusions about the drastically outlying features.

...