Partek Flow Documentation

To analyze scATAC-seq data, Partek Flow introduced a new technique: LSI (latent semantic indexing). LSI combines term frequency-inverse document frequency (TF-IDF) normalization with singular value decomposition (SVD), which returns a reduced-dimension representation of a matrix. Although SVD and principal components analysis (PCA) are two different techniques, SVD has a close connection to PCA: PCA is simply an application of the SVD. Users who are more familiar with scRNA-seq can think of the SVD output as analogous to the output of PCA. Similarly, the singular values are interpreted statistically in terms of the variance in the data explained by the various components. The singular values produced by the SVD are ordered from largest to smallest and, when squared, are proportional to the amount of variance explained by the corresponding singular vector.
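The LSI steps described above can be sketched in a few lines of NumPy. This is a minimal illustration, not Partek Flow's implementation: the function names `tf_idf` and `lsi`, the toy count matrix, and the specific TF-IDF formula (one of several variants in use) are all assumptions made for the example.

```python
import numpy as np

def tf_idf(counts):
    """Weight a cells x features count matrix with one common TF-IDF variant.

    Assumption: TF = count / total counts in the cell,
    IDF = log(1 + n_cells / number of cells containing the feature).
    """
    counts = np.asarray(counts, dtype=float)
    # Term frequency: scale each cell (row) by its total count.
    cell_totals = counts.sum(axis=1, keepdims=True)
    tf = counts / np.maximum(cell_totals, 1.0)
    # Inverse document frequency: features seen in fewer cells weigh more.
    n_cells = counts.shape[0]
    cells_with_feature = (counts > 0).sum(axis=0)
    idf = np.log(1.0 + n_cells / np.maximum(cells_with_feature, 1))
    return tf * idf

def lsi(counts, n_components):
    """TF-IDF normalization followed by SVD (hypothetical helper)."""
    weighted = tf_idf(counts)
    # Singular values come back sorted from largest to smallest.
    u, s, vt = np.linalg.svd(weighted, full_matrices=False)
    # Reduced-dimension coordinates for each cell.
    embedding = u[:, :n_components] * s[:n_components]
    # Squared singular values as a share of their total.
    share = s**2 / np.sum(s**2)
    return embedding, s, share

# Toy data: 20 cells x 50 features of small random counts.
rng = np.random.default_rng(0)
toy_counts = rng.poisson(0.5, size=(20, 50))
embedding, singular_values, share = lsi(toy_counts, n_components=5)
```

Here `embedding` plays the role that PC scores play after PCA, and `share` gives each component's squared singular value as a fraction of the total.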

If read quantification (i.e. mapping to a transcript model) was performed by the Partek® E/M algorithm, PCA can be invoked on a quantification output data node (Gene counts or Transcript counts) or, after normalization, on a Normalized counts data node. Select a node on the canvas and then select PCA in the Exploratory analysis section of the context-sensitive menu.

There are two options for how features contribute (Figure 1):

equally: all the features are standardized to a mean of 0 and a standard deviation of 1. This option gives all the features equal weight in the analysis. It is the default option for e.g. bulk RNA-seq data.

by variance: the analysis gives more emphasis to the features with higher variances. This is the default option for e.g. single cell RNA-seq data.

If the input data node is in linear scale, you can apply a log transformation as part of the PCA calculation.
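The effect of the two options, and of the optional log transformation, can be illustrated with a small NumPy sketch. This is not Partek Flow's code: the function `pca`, its parameter names, and the log base and offset (`log2(x + 1)`) are assumptions chosen for the example.

```python
import numpy as np

def pca(data, contribute="equally", log_transform=False):
    """PCA via SVD on a samples x features matrix (hypothetical helper).

    contribute="equally":     features are centered and scaled to SD 1.
    contribute="by variance": features are centered only.
    """
    x = np.asarray(data, dtype=float)
    if log_transform:
        # Assumed transform for linear-scale input; base and offset are illustrative.
        x = np.log2(x + 1.0)
    x = x - x.mean(axis=0)
    if contribute == "equally":
        sd = x.std(axis=0, ddof=1)
        # Leave constant features untouched to avoid division by zero.
        x = x / np.where(sd > 0, sd, 1.0)
    elif contribute != "by variance":
        raise ValueError("contribute must be 'equally' or 'by variance'")
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    eigenvalues = s**2 / (x.shape[0] - 1)  # variance explained per PC
    scores = u * s                         # PC projections of the samples
    return scores, eigenvalues

# Toy data: four features on very different scales.
rng = np.random.default_rng(1)
data = rng.normal(size=(30, 4)) * np.array([1.0, 5.0, 10.0, 50.0])
scores_equal, eig_equal = pca(data, contribute="equally")
scores_var, eig_var = pca(data, contribute="by variance")
```

With `equally`, the eigenvalues sum to the number of features because every feature has unit variance. With `by variance`, they sum to the total variance of the data, so the high-variance feature dominates the first PC.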


The SVD task in Flow can be invoked from the Normalization and scaling section after selecting any single cell counts data node (Figure 1). We recommend running SVD on normalized data, particularly TF-IDF normalized counts for scATAC-seq analysis.

Figure 1. PCA setup dialog / SVD task in Flow


The PCA task creates a new task node. To open it and see the result, do one of the following: select the PCA task node, proceed to the context-sensitive menu, and go to Task result; or double-click the PCA task node. The report contains eigenvalues, PC projections, component loadings, and mapping error information for the first three PCs.

...