Partek Flow Documentation

The Descriptive statistics task can be invoked on any matrix data node, e.g. a Gene counts or Normalized counts data node in a bulk RNA-seq analysis pipeline, or a Single cell counts data node. It calculates measures of central tendency and variability on the observations or features of the matrix data.

...

  • Click on a counts data node
  • Choose Descriptive statistics in the Statistics section of the toolbox (Figure 1)

Figure 1. Descriptive statistics menu


This will invoke the configuration dialog; use it to specify which calculation(s) will be performed on cells (or samples, for a bulk analysis data node) or on features (Figure 2).


Figure 2. Select to calculate descriptive statistics on samples/cells or features



The available statistics are listed in the left panel. Suppose $x_1, x_2, \ldots, x_n$ represent an array of numbers:

  • Coefficient of variation (CV): $CV = s / \bar{x}$, where $s$ represents the standard deviation and $\bar{x}$ the mean

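  • Geometric mean: $g = \sqrt[n]{x_1 x_2 \cdots x_n}$

  • Max: $\max(x_1, x_2, \ldots, x_n)$

  • Mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
  • Median: with the values sorted in ascending order as $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$, the median is $x_{((n+1)/2)}$ when $n$ is odd and $\frac{1}{2}\left(x_{(n/2)} + x_{(n/2+1)}\right)$ when $n$ is even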

  • Median absolute deviation: $\mathrm{MAD} = \mathrm{median}(|x_i - \tilde{x}|)$, where $\tilde{x} = \mathrm{median}(x_1, x_2, \ldots, x_n)$

  • Min: $\min(x_1, x_2, \ldots, x_n)$

  • Number of cells: Available when Calculate for is set to Features. Reports the number of cells whose value satisfies the chosen comparison ([<, <=, =, !=, >, >=], selected from the drop-down list) against the cutoff value entered in the text box. The cutoff is applied to the values present in the input data node, i.e. if the task is invoked on a non-normalized data node, the values are raw counts. For instance, use this option to find the number of cells in which each feature was detected; possible filter: Number of cells whose value > 0.0
  • Percent of cells: Available when Calculate for is set to Features. Reports the fraction of cells whose value satisfies the chosen comparison ([<, <=, =, !=, >, >=], selected from the drop-down list) against the cutoff value entered in the text box.
  • Number of features: Available when Calculate for is set to Cells. Reports the number of features whose value satisfies the chosen comparison ([<, <=, =, !=, >, >=], selected from the drop-down list) against the cutoff value entered in the text box. The cutoff is applied to the values present in the input data node, i.e. if the task is invoked on a non-normalized data node, the values are raw counts. For example, use this option to find the number of detected genes in each cell; possible filter: Number of features whose value > 0.0
  • Percent of features: Available when Calculate for is set to Cells. Reports the fraction of features whose value satisfies the chosen comparison ([<, <=, =, !=, >, >=], selected from the drop-down list) against the cutoff value entered in the text box.
  • Q1: 25th percentile

  • Q3: 75th percentile

  • Range: $x_{\max} - x_{\min}$
  • Standard deviation: Image Added where Image Added
  • Sum: $\sum_{i=1}^{n} x_i$
  • Variance: $s^2$, the square of the standard deviation
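
As a rough illustration of the measures listed above, the following Python sketch computes them for a single array using only the standard library. It is not Partek Flow's implementation: the function name `describe` is hypothetical, and both the sample standard deviation (n - 1 denominator) and the "inclusive" quartile interpolation are assumptions made for this example, since other conventions exist.

```python
import math
import statistics


def describe(values):
    """Return the listed descriptive statistics for one array of numbers.

    Hypothetical helper for illustration only.
    """
    xs = sorted(values)
    mean = statistics.mean(xs)
    median = statistics.median(xs)
    # Assumption: sample standard deviation (n - 1 denominator).
    sd = statistics.stdev(xs)
    # Assumption: "inclusive" quartile interpolation; other methods differ.
    q1, _, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    return {
        "cv": sd / mean,
        # The geometric mean is only defined for positive values.
        "geometric_mean": statistics.geometric_mean(xs),
        "max": xs[-1],
        "mean": mean,
        "median": median,
        # Median of the absolute deviations from the median.
        "mad": statistics.median([abs(x - median) for x in xs]),
        "min": xs[0],
        "q1": q1,
        "q3": q3,
        "range": xs[-1] - xs[0],
        "sd": sd,
        "sum": sum(xs),
        "variance": sd ** 2,
    }


stats = describe([1, 2, 4, 8])
print(stats["mean"], stats["median"])  # prints: 3.75 3.0
```

Raw count data often contains zeros, for which the geometric mean is undefined (or zero), so a real pipeline would need to decide how to handle such values.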


Left-click a measurement and drag it to the right panel, one at a time, or mouse over a measurement and click the green plus button to move it to the right panel. When Sample (Cell) is selected, the calculation is performed across all the features in the input matrix for each sample (or cell). When Feature is selected, the calculation is performed across all the samples (or cells) in the input matrix for each feature.
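
The difference between calculating per sample (cell) and per feature, together with the cutoff-based counts such as Number of cells and Number of features, can be sketched as follows. This is a minimal illustration, not the product's code: the toy matrix `counts`, its orientation (rows as cells, columns as features), and the helper `count_matching` are all assumptions made for the example.

```python
import operator

# Hypothetical toy matrix: rows are cells (or samples), columns are features.
counts = [
    [0, 3, 5],  # cell 1
    [2, 0, 1],  # cell 2
    [0, 0, 4],  # cell 3
]

# The six comparisons offered in the drop-down list.
COMPARATORS = {
    "<": operator.lt, "<=": operator.le, "=": operator.eq,
    "!=": operator.ne, ">": operator.gt, ">=": operator.ge,
}


def count_matching(values, op, cutoff):
    """Count the values that satisfy `value <op> cutoff`."""
    compare = COMPARATORS[op]
    return sum(1 for v in values if compare(v, cutoff))


# Calculate for cells: one result per row, computed across all features.
features_per_cell = [count_matching(row, ">", 0.0) for row in counts]

# Calculate for features: one result per column, computed across all cells.
columns = list(zip(*counts))
cells_per_feature = [count_matching(col, ">", 0.0) for col in columns]

# Fraction of cells per feature (multiply by 100 for a percentage).
fraction_of_cells = [c / len(counts) for c in cells_per_feature]

print(features_per_cell)  # prints: [2, 2, 1]
print(cells_per_feature)  # prints: [1, 1, 3]
```

Because the cutoff is compared with whatever values the matrix holds, running the same filter on raw counts and on normalized values can give different results.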

In addition, when Feature is selected, there is an extra Group by option (Figure 3).


Figure 3. Choose a categorical attribute to calculate the statistics on each subgroup

  • Click Finish to run

...

Group by



From the drop-down list, choose a categorical attribute to calculate the descriptive statistics on all the subgroups for each feature.
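
To make the Group by behaviour concrete, here is a small Python sketch that computes a per-feature statistic separately within each subgroup of a categorical attribute. The data, the attribute `cell_type`, and the helper `grouped_feature_stat` are hypothetical and only illustrate the idea; they do not reflect how the task is implemented.

```python
import statistics
from collections import defaultdict

# Hypothetical data: rows are cells, columns are two features.
counts = [
    [1.0, 4.0],
    [3.0, 6.0],
    [5.0, 0.0],
    [7.0, 2.0],
]
# Hypothetical categorical attribute, one label per cell.
cell_type = ["T cell", "T cell", "B cell", "B cell"]


def grouped_feature_stat(matrix, groups, stat):
    """Apply `stat` to each feature separately within each subgroup."""
    rows_by_group = defaultdict(list)
    for row, group in zip(matrix, groups):
        rows_by_group[group].append(row)
    # zip(*rows) turns the rows of one subgroup into per-feature columns.
    return {
        group: [stat(col) for col in zip(*rows)]
        for group, rows in rows_by_group.items()
    }


mean_by_group = grouped_feature_stat(counts, cell_type, statistics.mean)
print(mean_by_group)
# prints: {'T cell': [2.0, 5.0], 'B cell': [6.0, 1.0]}
```

Each feature thus gets one value per subgroup instead of a single value computed across all cells.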

The output of the task is a matrix: Cell stats (result of Calculate for Cells) or Feature stats (result of Calculate for Features) (Figure 4). The results can be visualized in the Data Viewer.



Figure 4. The Descriptive statistics task produces either a Cell stats (calculation per cell) or a Feature stats (calculation per feature) data node


Additional assistance



...