
Library size normalization is the simplest form of scaling normalization. However, composition biases arise whenever unbalanced differential expression exists between samples. Removing composition biases is a well-studied problem in bulk RNA sequencing data analysis, but bulk normalization methods can perform poorly on single-cell data because of the dominance of low and zero counts[1]. To overcome this, Partek Flow wraps the calculateSumFactors() function from the R package scran. The method pools counts from many cells to increase the size of the counts for accurate size factor estimation. Pool-based size factors are then “deconvolved” into cell-based factors for normalization of each cell’s expression profile[1].
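The pooling and deconvolution idea can be illustrated with a small NumPy sketch. This is a simplified illustration, not scran's implementation and not the code Partek Flow runs: the function name `pooled_size_factors`, the sliding-window pooling over cells in their given order, and the pool sizes are all assumptions made for the example. Each pool's size factor is estimated as the median ratio of the pooled counts to an average reference profile, and the per-cell factors are then recovered by solving the resulting linear system with least squares.

```python
import numpy as np

def pooled_size_factors(counts, pool_sizes=(3, 5)):
    """Simplified pooling/deconvolution sketch (hypothetical helper).

    counts: genes x cells array. Returns one size factor per cell,
    rescaled to have mean 1.
    """
    counts = np.asarray(counts, dtype=float)
    n_cells = counts.shape[1]

    # Average pseudo-cell used as the reference profile.
    reference = counts.mean(axis=1)
    keep = reference > 0  # ignore genes with no counts at all

    rows, pool_factors = [], []
    for size in pool_sizes:
        if size > n_cells:
            continue  # a pool cannot contain more cells than exist
        for start in range(n_cells):
            # Sliding window over cells, wrapping around at the end.
            members = [(start + k) % n_cells for k in range(size)]
            pooled = counts[:, members].sum(axis=1)

            # Pool-level size factor: median ratio to the reference.
            pool_factors.append(np.median(pooled[keep] / reference[keep]))

            # One equation per pool: the sum of the member cells'
            # factors equals the pool-level factor.
            indicator = np.zeros(n_cells)
            indicator[members] = 1.0
            rows.append(indicator)

    # "Deconvolve" pool-level factors into cell-level factors.
    system = np.vstack(rows)
    factors, *_ = np.linalg.lstsq(system, np.array(pool_factors), rcond=None)
    return factors / factors.mean()
```

Because the pooled counts are larger than the counts of any single cell, the median ratio is less affected by zeros. A real implementation also needs to handle cases this sketch ignores, such as negative estimates for very low-quality cells.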

Latent semantic indexing (LSI) was first introduced for the analysis of scATAC-seq data by Cusanovich et al. 2018[1]. LSI combines term frequency-inverse document frequency (TF-IDF) normalization with a subsequent singular value decomposition (SVD). Partek Flow wraps Signac's TF-IDF normalization for single-cell ATAC-seq datasets. It is a two-step normalization procedure that normalizes across cells to correct for differences in cellular sequencing depth, and across peaks to give higher values to rarer peaks[2].
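The two normalization steps, followed by the SVD, can be sketched in NumPy as below. This is a minimal illustration rather than Signac's or Partek Flow's code: it assumes the log-scaled TF × IDF variant with a scale factor of 10,000, and the function names `tfidf` and `lsi` are hypothetical. The input is assumed to be a peaks x cells count matrix with no empty rows or columns.

```python
import numpy as np

def tfidf(counts, scale_factor=1e4):
    """TF-IDF sketch for a peaks x cells count matrix (hypothetical helper)."""
    counts = np.asarray(counts, dtype=float)

    # Term frequency: divide each cell (column) by its total counts,
    # which corrects for differences in sequencing depth.
    tf = counts / counts.sum(axis=0, keepdims=True)

    # Inverse document frequency: number of cells divided by each
    # peak's total counts, so rarer peaks receive larger weights.
    idf = counts.shape[1] / counts.sum(axis=1, keepdims=True)

    # Log-scaled product; the scale factor is an assumed default.
    return np.log1p(tf * idf * scale_factor)

def lsi(counts, n_components=2):
    """LSI sketch: TF-IDF followed by a truncated SVD."""
    normalized = tfidf(counts)
    u, s, vt = np.linalg.svd(normalized, full_matrices=False)
    # Cell embeddings: cells x components, scaled by singular values.
    return vt[:n_components].T * s[:n_components]
```

In scATAC-seq analyses the first LSI component often correlates strongly with sequencing depth, so it is worth checking before using the components downstream.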

TF-IDF normalization in Flow can be invoked from the Normalization and scaling section by clicking any single-cell counts data node (Figure 1).

Figure 1. Scran deconvolution task in the Normalization and scaling section of Flow.

...
