To analyze scATAC-seq data, Partek Flow introduced a new technique: LSI (latent semantic indexing) [1]. LSI combines term frequency-inverse document frequency (TF-IDF) normalization with singular value decomposition (SVD), returning a reduced-dimension representation of a matrix. Although SVD and principal components analysis (PCA) are two different techniques, they are closely connected, because PCA is simply an application of the SVD. Users who are more familiar with scRNA-seq can think of the SVD output as analogous to the output of PCA. Similarly, the singular values can be interpreted statistically in terms of the variance in the data explained by the various components: the singular values produced by the SVD are ordered from largest to smallest and, when squared, are proportional to the amount of variance explained by the corresponding singular vector.
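The connection between SVD and PCA can be checked numerically. The following is a minimal NumPy sketch, not Partek Flow's implementation: the toy matrix `X` and all variable names are assumptions made for illustration. It shows that, for column-centered data, the squared singular values divided by (n - 1) match the eigenvalues of the covariance matrix that PCA reports as component variances.

```python
import numpy as np

# Toy data (assumption): 50 observations x 8 features with unequal spread.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8)) * np.array([5, 4, 3, 2, 1, 1, 1, 1])

# PCA route: eigenvalues of the covariance matrix of the centered data.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)                 # normalizes by n - 1
eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]

# SVD route: singular values of the same centered matrix.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
var_from_svd = s**2 / (X.shape[0] - 1)

# Fraction of total variance explained by each singular vector.
explained_ratio = s**2 / np.sum(s**2)
```

Note that the exact equality with variance relies on centering the columns first. When SVD is applied to an uncentered matrix, the squared singular values still rank the components but are only loosely interpretable as variance.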
The SVD task in Flow can be invoked from the Exploratory analysis section by clicking any single cell counts data node (Figure 1). We recommend running SVD on normalized data, particularly the TF-IDF normalized counts for scATAC-seq analysis.
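For readers who want to see what TF-IDF normalization followed by SVD looks like outside of Flow, here is a rough sketch in NumPy. It is an illustration under stated assumptions, not the formula Flow uses: the helper names `tfidf` and `lsi`, the cells-by-peaks orientation, the toy matrix, and the particular TF-IDF variant (per-cell frequency multiplied by `log(1 + n_cells / cells_with_peak)`) are all chosen for the example, and several TF-IDF variants exist in practice.

```python
import numpy as np

def tfidf(counts):
    """TF-IDF for a cells x peaks count matrix (one common variant, assumed here)."""
    counts = np.asarray(counts, dtype=float)
    # Term frequency: each cell's counts divided by that cell's total.
    cell_totals = counts.sum(axis=1, keepdims=True)
    tf = counts / np.maximum(cell_totals, 1.0)   # guard against empty cells
    # Inverse document frequency: down-weight peaks open in many cells.
    n_cells = counts.shape[0]
    cells_with_peak = (counts > 0).sum(axis=0)
    idf = np.log1p(n_cells / np.maximum(cells_with_peak, 1))
    return tf * idf

def lsi(counts, n_components):
    """TF-IDF followed by a truncated SVD; returns cell embedding and singular values."""
    normalized = tfidf(counts)
    U, s, Vt = np.linalg.svd(normalized, full_matrices=False)
    # Scale the left singular vectors by the singular values, as PCA scores would be.
    embedding = U[:, :n_components] * s[:n_components]
    return embedding, s[:n_components]

# Toy binary accessibility matrix (assumption): 30 cells x 200 peaks.
rng = np.random.default_rng(1)
toy_counts = (rng.random((30, 200)) < 0.1).astype(int)
embedding, singular_values = lsi(toy_counts, n_components=5)
```

The resulting `embedding` has one row per cell and one column per retained component, which is the shape that downstream steps such as clustering or batch correction typically expect. A dense SVD is used here only for brevity; real scATAC-seq matrices are large and sparse, so a truncated sparse solver would be the more practical choice.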
To run the SVD task,
...
The task report for SVD is similar to that for PCA. Its output will be used for downstream analysis and visualization, including Harmony (Figure 3).
References
- Cusanovich, D., Reddington, J., Garfield, D. et al. The cis-regulatory dynamics of embryonic development at single-cell resolution. Nature 555, 538–542 (2018). https://doi.org/10.1038/nature25981
...