Partek Flow Documentation

Introduction

The purpose of scaling is to remove the variation in the response that is explained by certain nuisance experimental factors; for this reason, scaling can also be called removal of unwanted variation. To distinguish scaling from similarly named procedures such as RUV normalization [1], it is important to keep two points in mind. First, the experimental factors participating in scaling are always known (observed) before the model is fitted, whereas RUV estimates the factors of unwanted variation from the data.
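As a rough illustration of what removing the variation explained by observed nuisance factors can look like, the sketch below fits an ordinary least squares model per feature and subtracts the fitted nuisance component. This is only a minimal sketch: the linear model, the function name `remove_nuisance`, and the cells-by-features layout are assumptions made for the example, not a description of the model that Partek Flow actually fits.

```python
import numpy as np

def remove_nuisance(expr, nuisance):
    """Subtract the part of each feature that a linear model attributes
    to observed nuisance covariates (hypothetical helper, linear model assumed).

    expr     : array of shape (n_cells, n_features)
    nuisance : array of shape (n_cells,) or (n_cells, n_covariates),
               numeric and already dummy-coded if categorical
    """
    expr = np.asarray(expr, dtype=float)
    nuisance = np.asarray(nuisance, dtype=float)
    if nuisance.ndim == 1:
        nuisance = nuisance[:, None]

    # Design matrix: intercept column followed by the nuisance covariates.
    design = np.column_stack([np.ones(expr.shape[0]), nuisance])

    # One least squares fit per feature (the columns of expr are solved jointly).
    beta, *_ = np.linalg.lstsq(design, expr, rcond=None)

    # Remove only the nuisance contribution; the intercept (baseline level) is kept.
    return expr - design[:, 1:] @ beta[1:, :]


# Usage: two batches of cells with an artificial shift in the second batch.
rng = np.random.default_rng(0)
batch = np.repeat([0.0, 1.0], 50)
raw = rng.normal(size=(100, 3)) + 5.0 * batch[:, None]
scaled = remove_nuisance(raw, batch)
```

Note that the nuisance covariate is supplied by the user up front, which reflects the first point above: nothing about the nuisance factor is estimated from the expression data.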

Second, it is important to understand why scaling needs to be performed as a separate step. In the context of single cell analysis, scaled data are meant to be used for subpopulation identification. We assume that the response variance is explained by an unobserved (latent) factor of interest that identifies the subpopulations and by one or more observed nuisance factors. If we perform clustering on unscaled data, the data may cluster by the nuisance factor rather than by the factor of interest. Compare this to a bulk RNA-Seq experiment, where both the factor of interest and the nuisance factors are known and the goal is to find features that are differentially expressed with respect to the factor of interest. In that case, a separate scaling task is not necessary: we can simply include all of the factors in the model and specify the contrasts only with respect to the factor of interest. Therefore, if we assume that all of the factors are observed, we should skip scaling and apply a bulk RNA-Seq type of differential expression analysis.
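The bulk-type alternative described above, where every factor is observed, can be sketched as a single model that contains both the factor of interest and the nuisance factor, with a contrast that picks out only the factor of interest. The example is an assumption-laden illustration: it uses ordinary least squares on one feature, and the name `contrast_t_stat` and the simulated design are hypothetical rather than taken from Partek Flow.

```python
import numpy as np

def contrast_t_stat(y, design, contrast):
    """Estimate a linear contrast of OLS coefficients and its t-statistic
    for a single feature (hypothetical helper, ordinary least squares assumed).

    y        : array of shape (n_samples,)
    design   : full-rank array of shape (n_samples, n_coefficients)
    contrast : array of shape (n_coefficients,)
    """
    y = np.asarray(y, dtype=float)
    design = np.asarray(design, dtype=float)
    contrast = np.asarray(contrast, dtype=float)
    n, p = design.shape

    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    sigma2 = resid @ resid / (n - p)  # residual variance estimate

    # Variance of the contrast estimate: sigma^2 * c' (X'X)^{-1} c
    var_c = sigma2 * (contrast @ np.linalg.inv(design.T @ design) @ contrast)
    estimate = contrast @ beta
    return estimate, estimate / np.sqrt(var_c)


# Usage: treatment is the factor of interest, batch is the nuisance factor.
rng = np.random.default_rng(0)
treatment = np.tile([0.0, 1.0], 20)
batch = np.repeat([0.0, 1.0], 20)
y = 2.0 * treatment + 3.0 * batch + rng.normal(scale=0.1, size=40)

design = np.column_stack([np.ones(40), treatment, batch])
# The contrast selects the treatment coefficient only; batch is adjusted for
# by being in the model, so no separate scaling step is needed.
estimate, t_stat = contrast_t_stat(y, design, [0.0, 1.0, 0.0])
```

Because the nuisance factor sits in the design matrix, its effect is accounted for during fitting, which is why a separate scaling step adds nothing in this setting.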

Note also that after the k cell types have been identified, one can add the corresponding factor with k levels to the data, and the new factor can be treated as observed in downstream analysis [3].

References

[1] Risso et al., 2014, Normalization of RNA-Seq data using factor analysis

[2] Seurat pipeline, “Scaling the data and removing unwanted sources of variation” step

[3] Seurat pipeline, “Finding differentially expressed genes (cluster biomarkers)” step

 

Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.
