Partek Flow Documentation

...

The estimated abundance is derived from the fragments per kilobase of transcript per million aligned reads (FPKM) or reads per kilobase of transcript per million aligned reads (RPKM) values delivered by the quantification algorithm. The concordance between known and estimated abundances is measured by R², a value ranging from 0% to 100%, where 100% corresponds to perfect estimation of abundance.
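As an illustration of the concordance measure, the following minimal sketch computes R² as the squared Pearson correlation between known and estimated abundances, expressed as a percentage. The function name `r_squared` and the sample values are hypothetical, and the source does not state whether the values are transformed (for example, to a log scale) before the comparison, so any such transformation is left to the caller.

```python
def r_squared(known, estimated):
    """Return the squared Pearson correlation of two sequences, in percent.

    For a simple linear regression of one variable on the other, this
    equals the coefficient of determination (R^2).
    """
    if len(known) != len(estimated) or len(known) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(known)
    mean_k = sum(known) / n
    mean_e = sum(estimated) / n
    # Sums of cross-products and squared deviations around the means.
    cov = sum((k - mean_k) * (e - mean_e) for k, e in zip(known, estimated))
    var_k = sum((k - mean_k) ** 2 for k in known)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    if var_k == 0 or var_e == 0:
        raise ValueError("R^2 is undefined when one sequence is constant")
    return 100.0 * cov * cov / (var_k * var_e)


# Hypothetical abundances: estimates exactly proportional to the known values.
known = [1.0, 2.0, 3.0, 4.0]
estimated = [2.0, 4.0, 6.0, 8.0]
print(r_squared(known, estimated))
```

A perfectly linear relationship gives 100%, matching the interpretation of 100% as perfect estimation of abundance.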

The studies of Roberts et al. (4) and Li et al. (10) measured the importance of sequence and position bias correction based on qPCR, using a set of genes from the Microarray Quality Control (MAQC) project. Roberts and colleagues reported that, when both sequence and position bias are accounted for, the R² goes up by about 5 percentage points, from 75.3% to 80.7%. The impact of position bias is reported to be small relative to that of sequence bias.

Li and Dewey (10) showed that the RSEM quantification package, which implements only the position bias correction, delivers an R² of 69%. For comparison, running Cufflinks on the same dataset with the position bias correction yielded an R² of only 71%. When both sequence and position bias corrections were enabled in Cufflinks, the R² went up to 79%. We can conclude that, for real datasets, the impact of position bias is negligible, and the impact of sequence bias is about 5-8%.

...

Quantification was performed by Partek Flow with the EM algorithm, and the plots of estimated vs. expected transcript abundance for Mix A and Mix B are shown in Figures 1 and 2, respectively. As we can see, the R² is about 97%, so there is very little room for improvement in abundance estimation.

...

We started with the full model containing four covariates and performed model selection based on two criteria: adjusted R² (computed for all possible models) and stepwise regression (with a cutoff p-value of 0.15). The combined approach allowed us to consider both practical and statistical significance of the covariates.
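The first criterion, adjusted R² computed for all possible models, can be sketched as follows. This is only an illustration under stated assumptions, not the procedure actually used for the analysis: the covariate names and data are hypothetical, the fit is ordinary least squares with an intercept solved through the normal equations, and the stepwise regression step is not shown.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix (collinear covariates?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def adjusted_r2(columns, y):
    """Fit y ~ intercept + columns by least squares; return adjusted R^2."""
    n, p = len(y), len(columns)
    if n <= p + 1:
        raise ValueError("need more observations than fitted parameters")
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = p + 1
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * v for bj, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    # Penalize R^2 for the number of covariates in the model.
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def rank_models(covariates, y):
    """Score every non-empty subset of covariates; best adjusted R^2 first."""
    results = []
    names = list(covariates)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            score = adjusted_r2([covariates[name] for name in subset], y)
            results.append((score, subset))
    return sorted(results, reverse=True)


# Hypothetical data: the response depends mainly on "expected".
covariates = {
    "expected": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "length": [1.0, 0.0, 1.0, 0.0, 1.0, 0.0],
}
estimated = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9]
for score, subset in rank_models(covariates, estimated):
    print(subset, round(100 * score, 2))
```

Because adjusted R² penalizes each additional covariate, a larger model ranks above a smaller one only when the extra covariate explains enough variance to offset the penalty, which is why differences between models can be compared for practical significance.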

The full model for Mix A (Table 1) has the highest adjusted R², but it was only 0.01% better than the model consisting of the expected abundance and length only. The latter was also pointed to by the stepwise selection, so we nominated it as the best model (Table 2). While the length effect in the best model was statistically significant, the adjusted R² was only 0.44% higher than that of the benchmark model (Table 3). Therefore, we found little evidence of the practical significance of the length bias.

...