Partek Flow Documentation


This tutorial presents an outline of the basic series of steps for analyzing a 10x Genomics Gene Expression with Feature Barcoding (antibody) data set in Partek Flow starting with the output of Cell Ranger.  

...

A rectangle, or task node, will be created for Split matrix along with two output circles, or data nodes, one for each data type (Figure 2). The labels for these data types are determined by the features.csv file used when processing the data with Cell Ranger. Here, our data is labeled Gene Expression, for the mRNA data, and Antibody Capture, for the protein data.

 


Figure 2. Split matrix produces two data nodes, one for each data type
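The idea behind splitting by data type can be sketched in a few lines of Python. This is only an illustration of grouping features by their type label, not how Partek Flow implements Split matrix; the feature table and the function name are hypothetical.

```python
from collections import defaultdict

# Hypothetical feature table: (feature ID, feature type label).
# The two type labels match the ones used in this tutorial.
FEATURES = [
    ("CD3_TotalSeqB", "Antibody Capture"),
    ("CD4_TotalSeqB", "Antibody Capture"),
    ("NKG7", "Gene Expression"),
    ("CXCL13", "Gene Expression"),
]

def split_by_feature_type(features):
    """Group feature IDs by their feature-type label, preserving input order."""
    groups = defaultdict(list)
    for feature_id, feature_type in features:
        groups[feature_type].append(feature_id)
    return dict(groups)

split = split_by_feature_type(FEATURES)
for feature_type, ids in split.items():
    print(feature_type, ids)
```

Each key in the result corresponds to one output data node in the pipeline.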

...

This produces a Single-cell QA/QC task node (Figure 4). 

 


Figure 4. Single cell QA/QC produces a task node

...

The output is a Filtered single cell counts data node (Figure 6).

 


Figure 6. Filtered cells output

...

This produces a Single-cell QA/QC task node (Figure 7). 

 


Figure 7. Single cell QA/QC produces a task node

...

The output is a Filtered single cell counts data node (Figure 9).

 


Figure 9. There are now two Filtered single cell counts data nodes

...

Normalization produces a Normalized counts data node on the Gene Expression branch of the pipeline (Figure 12). 

 


Figure 12. Both Antibody Capture and Gene Expression data have been normalized

 

...

Data nodes that can be merged with the Antibody Capture branch Normalized counts data node are shown in color (Figure 13).

 


Figure 13. Choosing a data node to merge

...

The output is a Merged counts data node (Figure 14). This data node will include the normalized counts of our protein and mRNA data. The intersection of cells from the two input data nodes is retained so only cells that passed the quality filter for both protein and mRNA data will be included in the Merged counts data node.  


Figure 14. Merging data types prior to downstream analysis
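The intersection behavior of the merge can be illustrated with a minimal sketch. It assumes each data node can be modeled as a dictionary keyed by cell barcode; the barcodes, values, and function name are hypothetical, and this is not Partek Flow's implementation.

```python
def merge_counts(protein, rna):
    """Keep only cells present in both inputs and combine their features.

    protein, rna: dict mapping cell barcode -> dict of feature -> normalized value.
    """
    shared = [cell for cell in protein if cell in rna]
    return {cell: {**protein[cell], **rna[cell]} for cell in shared}

# Hypothetical cells: cell_A passed only the protein filter,
# cell_C passed only the mRNA filter, cell_B passed both.
protein = {"cell_A": {"CD3_TotalSeqB": 3.1}, "cell_B": {"CD3_TotalSeqB": 0.4}}
rna = {"cell_B": {"NKG7": 1.2}, "cell_C": {"NKG7": 0.0}}

merged = merge_counts(protein, rna)
print(merged)
```

Only cell_B survives, and it carries both its protein and its mRNA values.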


Collapsing tasks to simplify the pipeline

...

Tasks that can form the beginning and end of the collapsed section of the pipeline are highlighted in purple (Figure 16). We have chosen the Split matrix task as the start and we can choose Merge matrices as the end of the collapsed section.

 


Figure 16. Tasks that can be the start or end of a collapsed task are shown in purple

...

  • Name the Collapsed task Data processing
  • Click Save (Figure 17)


Figure 17. Naming the collapsed task

The new collapsed task, Data processing, appears as a single rectangle on the task graph (Figure 18). 

 



Figure 18. Collapsed tasks are represented by a single task node

...

When expanded, the collapsed task is shown as a shaded section of the pipeline with a title bar (Figure 19).

 


Figure 19. Expanding a collapsed task to show its components

...

  • Click the Merged counts data node
  • Click Exploratory analysis in the toolbox
  • Click Scatter plot
  • Click Finish to run 
  • Double-click the Scatter plot task node to open it
  • Click 2D to switch to a 2D plot style (Figure 20)

Figure 20. Viewing the 2D scatter plot

...

  • Click the Features tab in the Selection / Filtering section of the control panel
  • Type CD3 in the ID search bar of the Features tab
  • Click CD3_TotalSeqB in the drop-down (Figure 21)

Figure 21. Filtering by values for a feature

  • Click  to add a filter for CD3 protein expression
  • Set the CD3_TotalSeqB filter to >= 2

This selects any cell with a normalized count of >= 2 for CD3 protein. Selected cells are shown in bold on the plot and, because we have CD3_TotalSeqB on one of our axes, the chosen cut-off point can be easily evaluated (Figure 22).

Figure 22. CD3+ cells are selected and shown in bold on the plot

...

The x-axis now shows CD8a protein expression (Figure 23).

Figure 23. Switching axes on the scatter plot

...

  • Type CD4 in the ID search bar of the Features tab
  • Click CD4_TotalSeqB in the drop-down
  • Click  to add a filter for CD4 protein expression
  • Set the CD4_TotalSeqB filter to >= 2
  • Type CD8a in the ID search bar of the Features tab
  • Click CD8a_TotalSeqB in the drop-down
  • Click  to add a filter for CD8a protein expression
  • Set the CD8a_TotalSeqB filter to < 2 

This will select the cells in the upper left-hand section of the plot (Figure 24).

Figure 24. Selecting CD3+ CD4+ CD8- cells
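Combining several filters amounts to requiring that a cell satisfy every rule at once. The sketch below illustrates that logic, assuming cut-offs of CD3 >= 2, CD4 >= 2, and CD8a < 2 for CD3+ CD4+ CD8- cells. The cell names, values, and function name are hypothetical.

```python
def select_cells(cells, rules):
    """Return the names of cells that satisfy every (feature, operator, cutoff) rule."""
    ops = {
        ">=": lambda value, cutoff: value >= cutoff,
        "<": lambda value, cutoff: value < cutoff,
    }
    return [
        name
        for name, values in cells.items()
        if all(ops[op](values[feature], cutoff) for feature, op, cutoff in rules)
    ]

# Hypothetical normalized protein values for three cells.
cells = {
    "cell_1": {"CD3_TotalSeqB": 3.2, "CD4_TotalSeqB": 2.8, "CD8a_TotalSeqB": 0.3},
    "cell_2": {"CD3_TotalSeqB": 3.0, "CD4_TotalSeqB": 0.5, "CD8a_TotalSeqB": 3.1},
    "cell_3": {"CD3_TotalSeqB": 0.4, "CD4_TotalSeqB": 2.9, "CD8a_TotalSeqB": 0.2},
}

# CD3+ CD4+ CD8- gate.
cd4_rules = [
    ("CD3_TotalSeqB", ">=", 2),
    ("CD4_TotalSeqB", ">=", 2),
    ("CD8a_TotalSeqB", "<", 2),
]

cd4_t_cells = select_cells(cells, cd4_rules)
print(cd4_t_cells)
```

Swapping the operators on the CD4 and CD8a rules gives the complementary CD3+ CD4- CD8+ gate.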

...

This selects the cells in the lower right-hand section of the plot (Figure 25).

Figure 25. Selecting CD3+ CD4- CD8+ cells

...

  • Click Clear selection 
  • Choose Classifications from the Color by drop-down menu (Figure 26).

Figure 26. Classified CD4 and CD8 T cells

...

Each point on the plot is a cell and the cells are colored by their cluster assignments (Figure 27).

Figure 27. UMAP from protein expression data

...

  • Choose Expression from the Color by drop-down menu
  • Type CD4 in the search box and choose CD4_TotalSeqB from the drop-down (Figure 28)

Figure 28. Coloring by expression

Cells that express high levels of CD4 are colored blue on the plot (Figure 29).

Figure 29. Coloring by CD4 protein expression

...

  • Click  to activate the lasso tool
  • Draw a lasso around the large blue group of cells at the bottom right of the plot to select them (Figure 30)

Figure 30. Selecting the CD4 cluster


  • Click  to filter to include only the selected cells
  • Click  to rescale the axes to the included cells 

...

  • Choose Graph-based from the Color by drop-down menu (Figure 31)

Figure 31. Protein-based clustering results

Again, the colors here indicate the cluster assignment for each cell. Because we ran clustering using only the protein expression data, the cluster assignments are based on each cell's protein expression data. To help identify which cell types the clusters correspond to, we generate a group biomarkers table with every clustering result. Biomarkers are genes or proteins that are expressed highly in a cluster when compared with the other clusters. While the clustering was calculated using only the protein expression data, the biomarkers are drawn from both gene and protein expression data.

The far-right cluster, cluster 8, has several interesting biomarkers. The top biomarker is CXCL13, a gene expressed by follicular B helper T cells (Tfh cells). Two of the other biomarkers are PD-1 protein, which is expressed in Tfh cells, promotes self-tolerance, and is a target for immunotherapy drugs; and TIGIT protein, another immunotherapy drug target that promotes self-tolerance.
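To make the notion of a biomarker concrete, the sketch below ranks features by how much higher their mean expression is in one cluster than in all other cells. This is a deliberately simple stand-in and should not be read as the statistical test Partek Flow uses for the group biomarkers table; the expression values and function name are hypothetical.

```python
from statistics import mean

def rank_biomarkers(expression, clusters, target):
    """Rank features by mean expression in the target cluster minus the mean elsewhere.

    expression: dict mapping feature -> list of per-cell values.
    clusters: list of per-cell cluster labels, in the same cell order.
    """
    scores = {}
    for feature, values in expression.items():
        inside = [v for v, c in zip(values, clusters) if c == target]
        outside = [v for v, c in zip(values, clusters) if c != target]
        scores[feature] = mean(inside) - mean(outside)
    # Highest difference first.
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical values for four cells, two of them in cluster 8.
clusters = [8, 8, 1, 1]
expression = {
    "CXCL13": [4.0, 3.5, 0.1, 0.0],
    "NKG7": [0.2, 0.1, 2.0, 2.5],
}

ranked = rank_biomarkers(expression, clusters, target=8)
print(ranked)
```

Because both gene and protein features can be passed in the same dictionary, the ranking can draw on both data types even when the clusters were computed from protein data alone.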

...

PD-1 expression is highest in cluster 8 with high expression throughout the cluster (Figure 32).

Figure 32. PD-1 expression in helper T cells

...

It is interesting to note that this pattern of PD-1 expression is not easily discernible at the PDCD1 gene expression level (Figure 33).

Figure 33. PDCD1 (PD-1) gene expression does not form a clear pattern

...

The Tfh cell marker, CXCL13, is highly and specifically expressed in cluster 8 (Figure 34), so we will classify the cells from cluster 8 as Tfh cells.

Figure 34. CXCL13 expression is strong in cluster 8

  • Choose Graph-based from the Color by drop-down menu
  • Choose Graph-based from the Select by drop-down in the Attributes tab of the Selection / Filtering section of the control panel (Figure 35)

Figure 35. Choosing to select by cluster

  • Click the check box to select cluster 8 (Figure 36)

Figure 36. Selecting a cluster


  • Click Classify selection 
  • Name the cells Tfh cells
  • Click Save 

...

  • Choose Classifications from the Color by drop-down menu (Figure 37)

Figure 37. Coloring by classification

To apply the classification so that it would be available in downstream tasks like differential analysis, we would click Apply classifications. Classifications that are not applied are not available in downstream analysis tasks, but are saved in a draft state on the task report where they were created. Here, we will not save the classification, but we will see how to do this later in the tutorial. 

...

  • Click Apply 
  • Click Finish to run (Figure 38)

Figure 38. Configuring PCA to run on the Gene Expression data

...

  • Mouse over the Scree plot to identify the point where additional PCs offer little additional information (Figure 39)

Figure 39. Identifying an optimal number of PCs
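Reading a Scree plot by eye can be approximated with a simple rule: keep the leading PCs until one of them explains less than some small share of the total variance. The sketch below shows that rule; the 2% threshold, the variance values, and the function name are assumptions chosen for illustration, not Partek Flow defaults.

```python
def choose_num_pcs(variances, min_share=0.02):
    """Count the leading PCs whose share of total variance is at least min_share.

    variances: per-PC variance (eigenvalues), ordered from PC1 onward.
    """
    total = sum(variances)
    count = 0
    for variance in variances:
        if variance / total < min_share:
            break  # this PC and all later ones add little information
        count += 1
    return count

# Hypothetical eigenvalues with a visible elbow after the fifth PC.
variances = [40, 25, 15, 10, 5, 1.5, 1.5, 1.0, 1.0]

num_pcs = choose_num_pcs(variances)
print(num_pcs)
```

The number chosen this way plays the same role as the value entered for Number of principal components in the clustering settings.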

...

  • Click the Merged counts data node
  • Click Exploratory analysis in the toolbox
  • Click Graph-based clustering 
  • Click Gene Expression for Include features where "Feature type" is
  • Click Configure to access the advanced settings
  • Set Number of principal components to 15
  • Click Apply 
  • Click Finish to run (Figure 40)

Figure 40. Running Graph-based clustering on the Gene Expression data

...

The UMAP task report includes a scatter plot with the clustering results coloring the points (Figure 41).

 


Figure 41. UMAP calculated on Gene Expression values, colored by Graph-based clustering results

...

  • Choose Expression from the Color by drop-down menu
  • Type NKG7 in the search box and choose NKG7 from the drop-down (Figure 45)


Figure 45. Coloring by NKG7 expression

This will color the plot by NKG7 gene expression, a marker for cytotoxic cells. We can color by two T cell protein markers to distinguish cytotoxic T cells from helper T cells. 

...

By default, any cell that expresses >= 1 normalized count of NKG7 is now selected (Figure 48).

 


Figure 48. Selecting by NKG7 expression

...

We have now selected only cells that express >= 1 normalized count for NKG7 gene and CD3 protein, but also have <= 2 normalized count for CD4 protein (Figure 49).

 


Figure 49. Filtering using multiple genes and proteins

...

We have now selected the CD4 positive, CD3 positive, NKG7 negative helper T cells (Figure 50).

 


Figure 50. Modifying the selection criteria lets us select helper T cells

...

The zoom level will also be reset (Figure 52).

 


Figure 52. Resetting filters also resets the zoom level

...

 There are several clusters that show high levels of CD19 protein expression (Figure 53). We can filter to these cells to examine them more closely.

 


Figure 53. Viewing CD19 protein expression on the UMAP plot

...

  • Choose Graph-based from the Color by drop-down menu (Figure 55)

 


Figure 55. Viewing B lymphocyte clusters

...

This will color the plot by IGHD and IGHA1 (Figure 57).

 


Figure 57. Coloring by two genes from the Group biomarkers table

...

This produces a Filtered groups data node (Figure 62).

 


Figure 62. Filter groups output

...

This will produce two data nodes, one for each data type (Figure 63).

 


Figure 63. Split matrix can also re-split the data

...

The report lists each feature tested, giving p-value, false discovery rate adjusted p-value (FDR step up), and fold change values for each comparison (Figure 65).

 


Figure 65. GSA report for the protein expression data
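For readers unfamiliar with FDR-adjusted p-values, the sketch below computes them with the Benjamini-Hochberg step-up procedure. It assumes that "FDR step up" refers to that procedure; the p-values and function name are hypothetical, and the code is an illustration rather than the GSA implementation.

```python
def fdr_step_up(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for four features.
raw = [0.01, 0.04, 0.03, 0.5]
adjusted = fdr_step_up(raw)
print(adjusted)
```

Each adjusted value is never smaller than its raw p-value, which is why fewer features pass a cut-off on the FDR column than on the p-value column.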

...

This opens a violin plot showing CD25 expression for cells in each of the classifications (Figure 66).

 


Figure 66. Violin plot showing CD25 protein expression

...

This generates a customized heat map to illustrate how the cell types differ in their protein expression (Figure 68). 


Figure 68. Customized heat map illustrating protein expression differences between cell types


Gene expression

We can use a similar approach to analyze the gene expression data.

...

Each gene is shown as a point on the plot, with cut-off lines for fold change and p-value or FDR step up set using the control panel on the left (Figure 70). The number of up- and down-regulated genes according to the cut-offs is listed at the bottom of the plot. Mousing over a point shows the gene name and other information.

 


Figure 70. Volcano plot for Activated vs. Mature B cells
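The up- and down-regulated counts under a volcano plot follow directly from the two cut-offs. The sketch below shows one way to compute them, assuming fold change is reported as a signed value (2 meaning doubled, -2 meaning halved). The cut-off values, the data, and the function name are hypothetical.

```python
def count_regulated(results, fold_cutoff=2.0, p_cutoff=0.05):
    """Count features passing both cut-offs, split by direction of change.

    results: iterable of (signed fold change, p-value) pairs.
    Returns (number up-regulated, number down-regulated).
    """
    up = down = 0
    for fold_change, p_value in results:
        if p_value > p_cutoff:
            continue  # not significant, regardless of fold change
        if fold_change >= fold_cutoff:
            up += 1
        elif fold_change <= -fold_cutoff:
            down += 1
    return up, down

# Hypothetical (fold change, p-value) pairs for five genes.
results = [(3.0, 0.001), (-2.5, 0.01), (1.2, 0.0001), (4.0, 0.2), (-5.0, 0.04)]

up, down = count_regulated(results)
print(up, down)
```

Tightening either cut-off can only shrink the two counts, which matches how the totals change when the lines are moved in the control panel.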

...

The number at the top of the filter will update to show the number of included genes (Figure 71).

 


Figure 71. Filtering GSA results to significant genes

 

...

The pathway enrichment results list KEGG pathways, giving an enrichment score and p-value for each (Figure 72). 


Figure 72. Pathway enrichment task report
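As a rough illustration of where an enrichment p-value and score can come from, the sketch below uses a one-sided hypergeometric test and takes the score to be the negative natural log of the p-value. Both choices are assumptions made for this example and are not confirmed by this page; the gene counts and function name are hypothetical.

```python
from math import comb, log

def enrichment(hits, list_size, pathway_size, background_size):
    """One-sided hypergeometric test for over-representation of a pathway.

    hits: genes from the input list that belong to the pathway.
    Returns (p-value of seeing at least `hits`, -ln(p-value)).
    """
    upper = min(list_size, pathway_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, list_size - i)
        for i in range(hits, upper + 1)
    )
    p_value = tail / total
    return p_value, -log(p_value)

# Hypothetical case: all 3 genes in the list fall in a 3-gene pathway,
# drawn from a background of 10 genes.
p_value, score = enrichment(hits=3, list_size=3, pathway_size=3, background_size=10)
print(p_value, score)
```

Under these assumptions a smaller p-value gives a larger score, so sorting by either column yields the same pathway order.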

To get a better idea about the changes in each enriched pathway, we can view an interactive KEGG pathway map.

...

The KEGG pathway map shows up-regulated genes from the input list in red and down-regulated genes from the input list in green (Figure 73). 

 


Figure 73. Interactive KEGG pathway map for FoxO signaling pathway

Final pipeline

 


View of the final pipeline

 


References

[1] Stoeckius, M., Hafemeister, C., Stephenson, W., Houck-Loomis, B., Chattopadhyay, P. K., Swerdlow, H., ... & Smibert, P. (2017). Simultaneous epitope and transcriptome measurement in single cells. Nature methods, 14(9), 865.

...

[3] Mimitou, E., Cheng, A., Montalbano, A., Hao, S., Stoeckius, M., Legut, M., ... & Satija, R. (2018). Expanding the CITE-seq tool-kit: Detection of proteins, transcriptomes, clonotypes and CRISPR perturbations with multiplexing, in a single assay. bioRxiv, 466466. 


Additional assistance

...

 

...