...

Figure 1. GSA Advanced Options

Currently, GSA is capable of considering the following five response distributions: Normal, Lognormal, Lognormal with shrinkage, Negative Binomial, Poisson (Figure 1). The GSA interface has an option to restrict this pool of distributions in any way, e.g. by specifying just one response distribution. The user also specifies the factors that may enter the model (Figure 2A) and comparisons for categorical factors (Figure 2B).

...

Figure 2.
A) Choosing factors (attributes) in GSA
B) Choosing comparisons in GSA

...

Figure 3. For each feature, Akaike weights and other statistics are available via the "View extra details report" button

Obtaining reproducible results in GSA

...

Figure 4. Shrinkage plot for a two-group study with four observations per group. Blue arrows show how the error terms for transcripts ERCC-00046 and ERCC-00054 are adjusted up and down, respectively.

All other things being equal, the comparison p-value goes up as the magnitude of the error term goes up, and vice versa. As a result, the "shrunken" p-value goes up (down) if the error term is adjusted up (down). Table 1 reports some results for the two features highlighted in Figure 4.
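The relationship between the error term and the p-value can be illustrated with a minimal sketch. It does not reproduce the test GSA actually performs: the function name, the effect size, and the use of a normal approximation are all assumptions made for illustration only.

```python
import math

def two_sided_p_normal(effect, error_term):
    """Two-sided p-value for effect / error_term under a normal approximation.

    Assumption: a plain z-test stands in for the model-based test used by GSA.
    """
    z = abs(effect) / error_term
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for the standard normal CDF Phi
    return math.erfc(z / math.sqrt(2))

effect = 1.0  # hypothetical difference between two groups on the log scale
for error_term in (0.25, 0.5, 1.0):
    # Same effect, larger error term: the p-value rises
    print(error_term, two_sided_p_normal(effect, error_term))
```

With the effect held fixed, adjusting the error term up moves the p-value up, and adjusting it down moves the p-value down, which is the direction of change reported for the two highlighted features.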

...

Table 1. After shrinkage is applied, p-values are adjusted in the same direction as the corresponding error terms.

For a large sample size, the amount of shrinkage is small (Figure 5), and the "Lognormal" and "Lognormal with shrinkage" p-values become virtually identical.

...

Figure 5. Shrinkage plot for a two-group study with about 40 samples per group. Thanks to the large sample size, the error terms have almost no adjustment (green and red dots almost coincide).

One important use of the shrinkage plot is to set a meaningful low expression threshold in the Low expression filter section (Figure 6). For features with low expression, the proportion of zero counts is high. Such features are less likely to be of interest in the study and, in any case, they cannot be modeled well by a continuous distribution such as Lognormal. Note that adding a positive offset to get rid of zeros does not help, because that does not affect the error term of a lognormal model much. A high proportion of zeros can ultimately result in a drop in the trend in the leftmost part of the shrinkage plot (Figure 5).

A rule of thumb suggested by limma authors is to set the low expression threshold to get rid of the drop and to obtain a monotone decreasing trend in the left-hand part of the plot.
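As a rough illustration of this rule of thumb, the sketch below scans a fitted trend from right to left and reports the smallest x value from which the trend never increases. The function name and the trend values are hypothetical; in practice the threshold is read off the shrinkage plot by eye.

```python
def smallest_monotone_threshold(x, trend):
    """Return the smallest x from which the trend is non-increasing.

    x     : x-axis positions in ascending order (log2 scale, hypothetical)
    trend : fitted trend values at those positions (hypothetical)
    """
    start = len(x) - 1
    # Walk right to left while each step to the left does not lower the trend
    for i in range(len(x) - 1, 0, -1):
        if trend[i - 1] >= trend[i]:
            start = i - 1
        else:
            break  # the trend drops to the left of this point
    return x[start]

# Hypothetical trend with a drop in the leftmost part of the plot
x = [0, 1, 2, 3, 4, 5]
trend = [0.5, 0.8, 1.0, 0.7, 0.4, 0.3]
print(smallest_monotone_threshold(x, trend))
```

For this made-up trend the drop lies to the left of x = 2, so filtering out everything below that point would leave a monotone decreasing trend.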

...

Figure 6. A meaningful value for the "Lowest average coverage" threshold can be easily determined based on the shrinkage plot

For instance, in Figure 5 it looks like a threshold of 2 can get us what we want. Since the x axis is on the log2 scale, the corresponding value for "Lowest average coverage" is 2^2 = 4 (Figure 6). After we set the filter that way and rerun GSA, the shrinkage plot takes the required form (Figure 7).
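The conversion between the log2 axis of the shrinkage plot and the filter value is simple arithmetic; a minimal sketch follows. The function names are hypothetical helpers and are not part of Partek Flow.

```python
import math

def coverage_threshold_from_plot(x_log2):
    """Convert a position on the log2 x axis into an average coverage value."""
    return 2 ** x_log2

def plot_boundary_from_threshold(average_coverage):
    """Convert an average coverage threshold into the left boundary of the x axis."""
    return math.log2(average_coverage)

print(coverage_threshold_from_plot(2))   # value to enter in the filter
print(plot_boundary_from_threshold(4))   # left boundary after rerunning GSA
```

The second function goes in the opposite direction: after the filter is applied, the left boundary of the x axis sits at log2 of the chosen threshold.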

...

Figure 7. After resetting the "Average coverage" threshold to 4 (Figure 6), the left part of the shrinkage plot displays the desirable monotone decreasing trend. Note that the left boundary on the x axis becomes log2(4) = 2

Note that it is possible to achieve a similar effect by increasing a threshold of "Lowest maximal coverage", "Minimum coverage", or any similar filtering option (Figure 6). However, using "Average coverage" is the most straightforward: the shrinkage procedure uses log2(Average coverage) as an independent variable to fit the trend, so the x axis in the shrinkage plot is always log2(Average coverage) regardless of the filtering option chosen in Figure 6.

...

Figure 8.
A) Average expression threshold can be raised to get rid of low expression features with abnormal error terms, circled in blue
B) Six low expression features (circled in blue) account for a very sharp increase in the trend which can have an unduly large effect on overall results

...

Figure 9. To quantify the benefit of shrinkage for any particular feature, enable these two models in "Custom" mode.

Figure 10 contains a pie chart for the dataset whose shrinkage plot is displayed in Figure 4. Because of a small sample size (two groups with four observations each) we see that, overall, shrinkage is beneficial: for an "average" feature, Akaike weight for feature-specific Lognormal is 25%, whereas Lognormal with shrinkage weighs 75%.
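For readers unfamiliar with Akaike weights, the sketch below shows the standard way they are derived from information criterion values: each model's weight is proportional to exp(-delta/2), where delta is its distance from the best (lowest) value. The criterion values are hypothetical, chosen only so that the weights come out close to the 25% and 75% quoted above, and whether GSA uses plain AIC or a corrected variant is not assumed here.

```python
import math

def akaike_weights(criterion_values):
    """Compute Akaike weights from a dict of model name -> information criterion."""
    best = min(criterion_values.values())
    # Relative likelihood of each model compared with the best one
    relative = {
        name: math.exp(-0.5 * (value - best))
        for name, value in criterion_values.items()
    }
    total = sum(relative.values())
    return {name: r / total for name, r in relative.items()}

# Hypothetical criterion values for one "average" feature
weights = akaike_weights({
    "Lognormal": 102.2,
    "Lognormal with shrinkage": 100.0,
})
for name, weight in weights.items():
    print(f"{name}: {weight:.2f}")
```

The weights always sum to one, so they can be read as the relative support for each model within the chosen pool.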

...

Figure 10. Pie chart for the small sample dataset from Figure 4

At the same time, if we look at ERCC-00046 specifically (Figure 11), we see that Lognormal with shrinkage fits so poorly that its Akaike weight is virtually zero, despite having fewer parameters than feature-specific Lognormal.

...

Figure 11. For ERCC-00046, the feature-specific Lognormal model is dominant and shrinkage is the least likely to do well.

Using multimodel inference appears to be a better alternative to the ad hoc method in DESeq2, which switches shrinkage either fully on or fully off. Once again, it is both technically possible and emotionally tempting to automate the handling of abnormal features by enabling both Lognormal models in GSA and applying them to all of the transcripts. Unfortunately, that can make the results less reproducible overall, even though it is likely to yield more accurate conclusions about the drastically outlying features.

...
