Partek Flow Documentation

Principal components analysis (PCA) is an exploratory technique used to describe the structure of high-dimensional data by reducing its dimensionality. It is a linear transformation that converts n original variables (typically genes or transcripts) into n new variables, called principal components (PCs), which have three important properties:

  • PCs are ordered by the amount of variance explained
  • PCs are uncorrelated
  • PCs explain all variation in the data

PCA is a principal axis rotation of the original variables that preserves the variation in the data. Therefore, the total variance of the original variables is equal to the total variance of the PCs.
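The three properties above, and the preservation of total variance, can be checked numerically. The following is a minimal NumPy sketch on made-up toy data. It illustrates the mathematics only and is not Partek Flow's implementation; the variable names and the choice of an eigen-decomposition of the covariance matrix are assumptions made for this example.

```python
import numpy as np

# Toy data (assumption): 50 samples (rows) x 5 variables (columns),
# scaled so that the variables have clearly different variances.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5)) * np.array([5.0, 3.0, 2.0, 1.0, 0.5])

# Center each variable at zero.
Xc = X - X.mean(axis=0)

# Eigen-decomposition of the covariance matrix of the variables.
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order

# Reorder so that PC1 explains the most variance.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# Rotate the centered data onto the principal axes to get PC scores.
scores = Xc @ eigvecs

pc_var = scores.var(axis=0, ddof=1)       # variance of each PC
pc_cov = np.cov(scores, rowvar=False)     # covariance between PCs
explained_ratio = pc_var / pc_var.sum()   # fraction of variance per PC

print("Variance explained per PC:", np.round(explained_ratio, 3))
print("Total variance (original):", X.var(axis=0, ddof=1).sum())
print("Total variance (PCs):     ", pc_var.sum())
```

Keeping only the first few columns of `scores` gives the reduced-dimension view of the data, at the cost of the variance carried by the discarded PCs.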

To analyze scATAC-seq data, Partek Flow introduced a new technique, LSI (latent semantic indexing), which performs singular value decomposition (SVD) on the TF-IDF matrix.
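As a rough illustration of the idea, the sketch below computes a term frequency-inverse document frequency (TF-IDF) weighting of a tiny, made-up binary accessibility matrix and then applies SVD. The exact weighting used by Partek Flow is not stated here, so the TF and IDF formulas, the toy matrix, and the number of retained components `k` are all assumptions chosen for the example.

```python
import numpy as np

# Toy binary accessibility matrix (assumption): 5 features (peaks) x 4 cells.
counts = np.array([
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 0, 1],
], dtype=float)

n_cells = counts.shape[1]

# Term frequency: normalize each cell (column) by its total count.
tf = counts / counts.sum(axis=0, keepdims=True)

# Inverse document frequency: down-weight features open in many cells.
# log(1 + n_cells / feature_total) is one common variant (assumption).
idf = np.log(1.0 + n_cells / counts.sum(axis=1))

tfidf = tf * idf[:, np.newaxis]

# Singular value decomposition of the TF-IDF matrix.
U, s, Vt = np.linalg.svd(tfidf, full_matrices=False)

# Keep the first k components as a low-dimensional cell embedding.
k = 2
cell_embedding = Vt[:k].T * s[:k]   # shape: (cells, k)

print("Singular values:", np.round(s, 3))
print("Cell embedding shape:", cell_embedding.shape)
```

Each row of `cell_embedding` is one cell described by `k` LSI components, which play a role analogous to PC scores in PCA.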


If read quantification (i.e., mapping to a transcript model) was performed by the Partek® E/M algorithm, PCA can be invoked on a quantification output data node (Gene counts or Transcript counts) or, after normalization, on a Normalized counts data node. Select a node on the canvas and then select PCA in the Exploratory analysis section of the context-sensitive menu.

...