Partek Flow Documentation


To analyze scATAC-seq data, Partek Flow introduced a new technique: LSI (latent semantic indexing). LSI combines term frequency-inverse document frequency (TF-IDF) normalization with singular value decomposition (SVD), which returns a reduced-dimension representation of a matrix. Although SVD and principal components analysis (PCA) are two different techniques, they are closely connected: PCA can be computed by applying the SVD to mean-centered data. Users who are more familiar with scRNA-seq can think of the SVD output as analogous to the output of PCA. The statistical interpretation of the singular values is similar as well: they are returned in order from largest to smallest and, when squared, are proportional to the amount of variance explained by the corresponding singular vector.
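The connection between SVD and PCA described above can be checked numerically. The following is a minimal sketch using NumPy on made-up random data; it is not Partek Flow's implementation. It assumes the matrix is mean-centered first, which is what makes the squared singular values match the PCA variances exactly. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))  # toy matrix: 50 cells x 8 features (made-up data)

# PCA route: eigenvalues of the covariance matrix of the centered data.
Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / (Xc.shape[0] - 1)
eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]  # reorder to largest first

# SVD route: singular values are returned sorted from largest to smallest.
singular_values = np.linalg.svd(Xc, compute_uv=False)

# Squared singular values, divided by (n - 1), equal the PCA variances.
variance_from_svd = singular_values**2 / (Xc.shape[0] - 1)

# Fraction of total variance captured by each singular vector.
explained_ratio = singular_values**2 / np.sum(singular_values**2)
```

In an LSI workflow the TF-IDF matrix is typically not centered, so the "variance explained" reading of the squared singular values should be treated as an analogy rather than an exact identity.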


The SVD task in Flow can be invoked from the Exploratory analysis section of the toolbox after clicking any single cell counts data node (Figure 1). We recommend running SVD on normalized data, particularly TF-IDF normalized counts for scATAC-seq analysis.

Figure 1. SVD task in Flow


To run the SVD task:

  • Click a single cell counts data node
  • Click the Exploratory analysis section in the toolbox
  • Click SVD

The SVD dialog only asks for the number of singular values to compute (Figure 2). By default, 100 singular values are computed instead of all of them. The number can be adjusted manually or typed in directly. Click the Finish button to run the task with the default setting.
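To illustrate what choosing the number of singular values means, here is a minimal NumPy sketch of a truncated SVD. The function name `truncated_svd`, its `n_components` parameter, and the toy count matrix are hypothetical and do not correspond to Partek Flow's internals; the default of 100 simply mirrors the dialog's default.

```python
import numpy as np

def truncated_svd(matrix, n_components=100):
    """Return the top singular triplets of `matrix` (cells x features)."""
    # A matrix has at most min(rows, cols) singular values, so cap the request.
    k = min(n_components, min(matrix.shape))
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    # Singular values are already sorted largest first, so slicing keeps the top k.
    return u[:, :k], s[:k], vt[:k, :]

rng = np.random.default_rng(1)
counts = rng.poisson(0.3, size=(200, 500)).astype(float)  # made-up toy data

u, s, vt = truncated_svd(counts, n_components=100)
embedding = u * s  # reduced representation: one row per cell, one column per component
```

This sketch computes the full decomposition and then slices it, which is only practical for small matrices. For large sparse single-cell data, an iterative solver such as `scipy.sparse.linalg.svds` would be the more usual choice.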

Figure 2. Interface of SVD task in Partek Flow.



The output of TF-IDF normalization is a new data node that has been normalized by log(TF x IDF). We can then use this new normalized matrix for downstream analysis and visualization (Figure 2).
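As a rough illustration of a log(TF x IDF) normalization, the sketch below uses one common variant of the formula. The specific definitions of TF and IDF, the `scale_factor` parameter, and the use of `log1p` (to keep zero counts finite) are assumptions of this example, not Partek Flow's documented formula. The function name `tf_idf` and the toy matrix are hypothetical.

```python
import numpy as np

def tf_idf(counts, scale_factor=1e4):
    """Normalize a cells x features matrix of non-negative counts."""
    counts = np.asarray(counts, dtype=float)
    # Term frequency: each count divided by the total counts in its cell.
    cell_totals = counts.sum(axis=1, keepdims=True)
    tf = counts / np.maximum(cell_totals, 1.0)
    # Inverse document frequency (assumed variant): cells / total counts per feature.
    feature_totals = counts.sum(axis=0, keepdims=True)
    idf = counts.shape[0] / np.maximum(feature_totals, 1.0)
    # log1p maps zero counts to zero instead of producing -inf.
    return np.log1p(tf * idf * scale_factor)

toy = np.array([[2, 0, 1],
                [0, 3, 1],
                [1, 1, 0]])
normalized = tf_idf(toy)
```

The resulting matrix has the same shape as the input and could then be passed to an SVD step to complete an LSI-style workflow.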









Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.
