Partek Flow Documentation

Overview of Trajectory Analysis

Cells undergo changes to transition from one state to another as part of development, disease, and throughout life. Since those changes can be gradual, trajectory analysis attempts to describe progress through a biological process as a position along a path. Because biological processes are often complex, trajectory analysis builds branching trajectories where different paths can be chosen at different points along the trajectory. The progress of a cell along a trajectory, measured from the starting point (the root), can be quantified as a numeric value called pseudotime.

In Partek Flow, we use tools from Monocle 3 (1) to build trajectories, identify states and branch points, and calculate pseudotime values. The output of the Trajectory analysis task includes an interactive 2D/3D visualization for viewing the trajectory trees and setting the root states (starting points of the trajectories). From the Trajectory analysis report, you can run a second task, Calculate pseudotime, which adds a numeric cell-level attribute, Pseudotime, calculated using the chosen root states.

Prerequisites for the Analysis

Trajectory analysis by Monocle 3 requires data normalization and preprocessing. For normalization, we suggest first using the Normalization and scaling section of the toolbox to normalize by counts per million (CPM) and add an offset of 1. After that, launch Trajectory analysis on the Normalized counts node.
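To make the suggested normalization concrete, here is a minimal sketch of CPM with an offset of 1, written in plain Python. It is an illustration only, not the code Partek Flow runs; the function name cpm_plus_offset and the toy matrix are hypothetical, and the matrix is assumed to be arranged as cells (rows) by genes (columns).

```python
def cpm_plus_offset(counts, offset=1.0):
    """Scale each cell (row) to counts per million, then add an offset.

    counts: list of rows, one row per cell, one column per gene (assumed layout).
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        if total == 0:
            # A cell with no counts has no defined CPM; only the offset remains.
            normalized.append([offset for _ in cell])
            continue
        normalized.append([value * 1_000_000 / total + offset for value in cell])
    return normalized


# Hypothetical toy data: two cells, three genes.
counts = [
    [10, 0, 90],  # 100 counts in total
    [5, 5, 0],    # 10 counts in total
]
normalized = cpm_plus_offset(counts)
print(normalized)
```

Because of the offset, every value is at least 1, so a later log2 transformation is defined for all entries.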

According to the Monocle 3 authors, you may want to filter in the top 5,000 genes with the highest variance (2,000 genes for datasets with fewer than 5,000 cells, and 300 genes for datasets with fewer than 1,000 cells) (1). Those numbers should be used as guidance for a first-pass analysis and may need to be optimized, depending on the project at hand and the biological question.
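The rule of thumb above can be sketched as follows. This is a hedged illustration in plain Python, not a Partek Flow or Monocle 3 function: suggested_gene_count and top_variance_genes are hypothetical names, the matrix is assumed to be cells by genes, and population variance is an arbitrary choice made for the example.

```python
from statistics import pvariance


def suggested_gene_count(n_cells):
    """First-pass number of high-variance genes to keep, per the guidance above."""
    if n_cells < 1000:
        return 300
    if n_cells < 5000:
        return 2000
    return 5000


def top_variance_genes(matrix, gene_names, n_top):
    """Return the names of the n_top genes with the highest variance.

    matrix: list of rows, one row per cell, one column per gene (assumed layout).
    """
    scored = []
    for j, name in enumerate(gene_names):
        column = [row[j] for row in matrix]
        scored.append((pvariance(column), name))
    # Highest variance first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:n_top]]


# Hypothetical toy data: four cells, three genes.
matrix = [
    [1, 0, 1],
    [1, 10, 2],
    [1, 0, 3],
    [1, 10, 4],
]
genes = ["geneA", "geneB", "geneC"]
kept = top_variance_genes(matrix, genes, n_top=2)
print(suggested_gene_count(len(matrix)), kept)
```

In a real project the cutoff returned by suggested_gene_count would only be a starting point to tune.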

Setting up Trajectory Analysis

To run the Trajectory analysis tool, select the Normalized counts data node (or equivalent) and go to the toolbox: Exploratory analysis > Trajectory analysis.

The configuration dialog presents the following options.

  1. Dimensionality of reduced space. This option specifies the number of UMAP dimensions that the original data are reduced to in order to learn the trajectory tree (the dimensionality of the original data equals the number of genes). The default is 2, meaning that the trajectory plot will be drawn in two dimensions. To get a 3D trajectory plot, increase this option to 3.
  2. Scaling. Normalized expression values can be further transformed by scaling to unit variance and zero mean (i.e. converting to Z score). The use of this option is recommended (1).
  3. Data is logged. Select this option if the data have already been log-transformed upstream. When selected, Monocle 3 will skip the log2 step on the input data (see below).

Under the hood, Monocle 3 will perform a log2 transformation of the gene count matrix (if Data is logged is not selected), scale the matrix (if Scaling is selected), and project the gene count matrix onto the top 50 principal components. Next, dimensionality reduction is performed by UMAP (using the default settings of the reduce_dimension command).
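The first two steps of that sequence (optional log2 transformation, optional scaling to zero mean and unit variance per gene) can be sketched in plain Python. This is an illustration under stated assumptions, not Monocle 3 code: log2_and_scale is a hypothetical helper, the input is assumed to be a cells-by-genes matrix of positive values, and the population standard deviation is used. The PCA and UMAP steps are left out because they require numerical libraries.

```python
import math


def log2_and_scale(matrix, data_is_logged=False, scaling=True):
    """Optionally log2-transform, then optionally z-score each gene (column)."""
    rows = [list(row) for row in matrix]
    if not data_is_logged:
        # Assumes all values are positive, e.g. after adding an offset of 1.
        rows = [[math.log2(value) for value in row] for row in rows]
    if not scaling:
        return rows
    n_cells = len(rows)
    n_genes = len(rows[0])
    for j in range(n_genes):
        column = [row[j] for row in rows]
        mean = sum(column) / n_cells
        sd = math.sqrt(sum((value - mean) ** 2 for value in column) / n_cells)
        for row in rows:
            # A constant gene has zero variance; map it to 0 instead of dividing by 0.
            row[j] = (row[j] - mean) / sd if sd > 0 else 0.0
    return rows


# Hypothetical toy data: three cells, two genes (the second gene is constant).
matrix = [
    [1, 8],
    [4, 8],
    [16, 8],
]
scaled = log2_and_scale(matrix)
print(scaled)
```

The two keyword arguments mirror the Data is logged and Scaling options in name only; how the dialog maps onto Monocle 3 parameters is not shown here.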

Trajectory Analysis Result

The result of running Trajectory analysis in Partek Flow is the Trajectory result data node. Double-clicking the node opens a Data Viewer window with the trajectory plot (Figure 1). The cell trajectory graph shows the position of each cell (blue dot) with respect to the UMAP coordinates (axes). Cell trajectories (one or more, depending on the data set) are depicted as black lines. Gray circles are trajectory nodes (i.e. cell communities).


Figure 1. Cell trajectory graph. Blue dots are individual cells (total count is displayed in the upper right). Black line represents the structure of the trajectory graph. Gray circles are nodes or leaves. The axes represent UMAP coordinates.

To show or hide the cell trajectory tree and the trajectory nodes, use the Extra data option on the Content card (Figure 2).


Figure 2. Extra data card enables the user to turn the trajectory tree and the trajectory nodes on or off

Pseudotime Analysis

To perform pseudotime analysis, you need to point to the cells at the beginning of the biological process you are interested in, for example, cells at the earliest stage of a differentiation sequence. To start, select the root cell nodes (gray circles) by left-clicking them. If the trajectory result consists of more than one trajectory tree, you can specify more than one root node, e.g. one root node per trajectory tree (Ctrl + click). If no root node is specified for a tree, that tree will not be included in the pseudotime calculation. Figure 3 shows an example where seven root nodes were identified.


Figure 3. Identification of root nodes for pseudotime analysis. The selected nodes are in dark gray

Once you have identified all the root nodes, push the Calculate pseudotime button in the Selection panel (Figure 4).


Figure 4. Once the root cell nodes are selected, use the Calculate pseudotime button to start the calculation. In this example, seven root nodes were specified

As a result, the cells will be annotated by pseudotime, using a green-to-red gradient (start and end, respectively) (Figure 5). If no root node has been identified for a particular tree, the cells of that tree will be omitted from the pseudotime calculation and colored gray (not shown).
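One way to picture the calculation is as a distance along the trajectory graph from the nearest selected root. The sketch below is a simplified illustration of that idea, not Monocle 3's implementation: it assigns values to graph nodes rather than to individual cells, the function name pseudotime_from_roots and the edge lengths are hypothetical, and trees without a root are returned as None to mirror the gray, omitted cells described above.

```python
import heapq


def pseudotime_from_roots(edges, roots):
    """Distance from each node to its nearest root along the graph.

    edges: list of (node_a, node_b, length) tuples describing an undirected graph.
    roots: nodes chosen as starting points.
    Returns a dict mapping node -> distance, or None if no root is reachable.
    """
    graph = {}
    for a, b, length in edges:
        graph.setdefault(a, []).append((b, length))
        graph.setdefault(b, []).append((a, length))
    for root in roots:
        graph.setdefault(root, [])

    distance = {node: None for node in graph}
    # Multi-source Dijkstra: every root starts at distance 0.
    queue = [(0.0, root) for root in roots]
    heapq.heapify(queue)
    while queue:
        dist, node = heapq.heappop(queue)
        if distance[node] is not None:
            continue  # already settled with a shorter or equal distance
        distance[node] = dist
        for neighbor, length in graph[node]:
            if distance[neighbor] is None:
                heapq.heappush(queue, (dist + length, neighbor))
    return distance


# Hypothetical example: two separate trees, a root chosen only in the first.
edges = [
    ("n1", "n2", 1.0),
    ("n2", "n3", 2.0),
    ("n2", "n4", 0.5),
    ("m1", "m2", 1.0),
]
result = pseudotime_from_roots(edges, roots=["n1"])
print(result)
```

Selecting a root in the second tree as well (for example roots=["n1", "m1"]) would give its nodes numeric values too, which corresponds to choosing one root node per trajectory tree.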


Figure 5. Cells annotated by pseudotime, from start (green) to end (red)

Following pseudotime calculation, three types of cell nodes will be annotated on the plot (in addition to the intermediate nodes from the previous step).

  1. Root node (white). Root nodes are start points of the pseudotime and were defined by the user in the previous step (e.g. node 7 in Figure 6).
  2. Branch node (black). Branch nodes indicate where the trajectory tree forks out; i.e. each branch represents a different cell fate or different trajectory (e.g. nodes 1 and 4 in Figure 6).
  3. Leaf (light gray). Leaves correspond to different cell fates / different trajectory outcomes (e.g. nodes 3, 8, and 11 in Figure 6). The leaves correspond to cell states of Monocle 2.


Figure 6. Following pseudotime analysis, four types of nodes can be identified on the trajectory plot. Gray circles - intermediate nodes, white circles - root nodes (beginning of pseudotime), black circles - branch nodes (splitting of differentiation pathway), and light gray nodes - leaves (outcome of differentiation pathway)
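The four node types can be thought of in terms of graph structure. The sketch below labels nodes by how many edges touch them, which is an assumption made for illustration and not a description of how Partek Flow or Monocle 3 assigns the labels. The function name classify_nodes is hypothetical, and the edge list is invented; it only reuses the node numbers mentioned in the list above.

```python
def classify_nodes(edges, roots):
    """Label each node as root, branch, leaf, or intermediate.

    edges: list of (node_a, node_b) pairs of an undirected trajectory graph.
    roots: set of user-selected root nodes.
    """
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1

    labels = {}
    for node, deg in degree.items():
        if node in roots:
            labels[node] = "root"          # chosen by the user
        elif deg >= 3:
            labels[node] = "branch"        # the tree forks here
        elif deg == 1:
            labels[node] = "leaf"          # end of a path
        else:
            labels[node] = "intermediate"  # lies along a single path
    return labels


# Invented topology for illustration only.
edges = [(7, 2), (2, 1), (1, 3), (1, 4), (4, 8), (4, 11)]
labels = classify_nodes(edges, roots={7})
print(labels)
```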

Difference Between Monocle 3 and Monocle 2

Major updates in Monocle 3 (compared to Monocle 2) include:

  • Monocle 3 learns the principal trajectory graph in the UMAP space;
  • the principal graph is smoothed and small branches are excluded;
  • support for principal graphs with loops and convergence points;
  • support for multiple root nodes.

References

  1. Cao J, Spielmann M, Qiu X, Huang X, Ibrahim DM, Hill AJ, Zhang F, Mundlos S, Christiansen L, Steemers FJ, Trapnell C, Shendure J. The single-cell transcriptional landscape of mammalian organogenesis. Nature. 2019 Feb;566(7745):496-502. doi: 10.1038/s41586-019-0969-x. Epub 2019 Feb 20. PMID: 30787437; PMCID: PMC6434952.


Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.
