This tutorial presents an outline of the basic series of steps for analyzing a 10x Genomics Gene Expression with Feature Barcoding (antibody) data set in Partek Flow starting with the output of Cell Ranger.
...
A rectangle, or task node, will be created for Split matrix along with two output circles, or data nodes, one for each data type (Figure 2). The labels for these data types are determined by the features.csv file used when processing the data with Cell Ranger. Here, our data is labeled Gene Expression, for the mRNA data, and Antibody Capture, for the protein data.
...
This produces a Single-cell QA/QC task node (Figure 4).
...
The output is a Filtered single cell counts data node (Figure 6).
...
This produces a Single-cell QA/QC task node (Figure 7).
...
The output is a Filtered single cell counts data node (Figure 9).
...
Normalization produces a Normalized counts data node on the Gene Expression branch of the pipeline (Figure 12).
...
Data nodes that can be merged with the Antibody Capture branch Normalized counts data node are shown in color (Figure 13).
...
The output is a Merged counts data node (Figure 14). This data node will include the normalized counts of our protein and mRNA data. The intersection of cells from the two input data nodes is retained so only cells that passed the quality filter for both protein and mRNA data will be included in the Merged counts data node.
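The intersection rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Partek Flow implements the Merge matrices task; the function name `merge_counts`, the barcodes, and the count values are all hypothetical.

```python
def merge_counts(gene_counts, protein_counts):
    """Keep only cells present in both inputs and combine their features.

    Each input maps a cell barcode to {feature name: normalized count}.
    """
    shared = sorted(set(gene_counts) & set(protein_counts))
    return {cell: {**gene_counts[cell], **protein_counts[cell]} for cell in shared}


# Hypothetical toy data: barcodes and values are made up for illustration.
gene = {
    "AAAC": {"NKG7": 1.2},
    "AAAG": {"NKG7": 0.0},  # failed the protein filter, so absent below
    "AAAT": {"NKG7": 2.5},
}
protein = {
    "AAAC": {"CD3_TotalSeqB": 3.1},
    "AAAT": {"CD3_TotalSeqB": 0.4},
    "AACC": {"CD3_TotalSeqB": 2.2},  # failed the mRNA filter, so absent above
}

merged = merge_counts(gene, protein)
print(sorted(merged))  # ['AAAC', 'AAAT']
```

Only the two barcodes that survive filtering on both branches appear in the merged result, each carrying both its gene and protein values.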
Collapsing tasks to simplify the pipeline
...
Tasks that can form the beginning and end of the collapsed section of the pipeline are highlighted in purple (Figure 16). We have chosen the Split matrix task as the start and we can choose Merge matrices as the end of the collapsed section.
...
- Name the Collapsed task Data processing
- Click Save (Figure 17)
The new collapsed task, Data processing, appears as a single rectangle on the task graph (Figure 18).
...
When expanded, the collapsed task is shown as a shaded section of the pipeline with a title bar (Figure 19).
...
- Click the Merged counts data node
- Click Exploratory analysis in the toolbox
- Click Scatter plot
- Click Finish to run
- Double-click the Scatter plot task node to open it
- Click 2D to switch to a 2D plot style (Figure 20)
...
- Click the Features tab in the Selection / Filtering section of the control panel
- Type CD3 in the ID search bar of the Features tab
- Click CD3_TotalSeqB in the drop-down (Figure 21)
- Click to add a filter for CD3 protein expression
- Set the CD3_TotalSeqB filter to > 2
This will select any cell with > 2 normalized count for CD3 protein. Selected cells are shown in bold on the plot and, because we have CD3_TotalSeqB on one of our axes, the cut-off point chosen can be easily evaluated (Figure 22).
...
The x-axis now shows CD8a protein expression (Figure 23).
...
- Type CD4 in the ID search bar of the Features tab
- Click CD4_TotalSeqB in the drop-down
- Click to add a filter for CD4 protein expression
- Set the CD4_TotalSeqB filter to > 2
- Type CD8a in the ID search bar of the Features tab
- Click CD8a_TotalSeqB in the drop-down
- Click to add a filter for CD8a protein expression
- Set the CD8a_TotalSeqB filter to < 2
This will select the cells in the upper left-hand section of the plot (Figure 24).
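Combining two filters in this way amounts to keeping the cells that satisfy every rule at once. The sketch below illustrates that logic under the assumption that the cut-offs are CD4 protein above 2 and CD8a protein below 2; the `select_cells` helper, the barcodes, and the values are hypothetical and are not part of Partek Flow.

```python
import operator

# Comparison operators a filter rule may use.
OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def select_cells(cells, rules):
    """Return the barcodes of cells that satisfy every (feature, op, cutoff) rule."""
    return [
        barcode
        for barcode, counts in cells.items()
        if all(OPS[op](counts[feature], cutoff) for feature, op, cutoff in rules)
    ]


# Hypothetical normalized protein counts; barcodes and values are made up.
cells = {
    "cell_1": {"CD4_TotalSeqB": 4.0, "CD8a_TotalSeqB": 0.5},
    "cell_2": {"CD4_TotalSeqB": 0.3, "CD8a_TotalSeqB": 5.1},
    "cell_3": {"CD4_TotalSeqB": 3.2, "CD8a_TotalSeqB": 2.8},
}

# Assumed cut-offs: high CD4 and low CD8a.
cd4_rules = [("CD4_TotalSeqB", ">", 2), ("CD8a_TotalSeqB", "<", 2)]
cd4_positive = select_cells(cells, cd4_rules)
print(cd4_positive)  # ['cell_1']
```

With CD8a on the x-axis and CD4 on the y-axis, the cells passing both rules are the ones that fall in the upper left-hand quadrant.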
...
This selects the cells in the lower right-hand section of the plot (Figure 25).
...
- Click Clear selection
- Choose Classifications from the Color by drop-down menu (Figure 26).
...
Each point on the plot is a cell and the cells are colored by their cluster assignments (Figure 27).
...
- Choose Expression from the Color by drop-down menu
- Type CD4 in the search box and choose CD4_TotalSeqB from the drop-down (Figure 28)
Cells that express high levels of CD4 are colored blue on the plot (Figure 29).
...
- Click to activate the lasso tool
- Draw a lasso around the large blue group of cells at the bottom right of the plot to select them (Figure 30)
- Click to filter to include only the selected cells
- Click to rescale the axes to the included cells
...
- Choose Graph-based from the Color by drop-down menu (Figure 31)
Again, the colors here indicate the cluster assignment for each cell. Because we ran clustering using only the protein expression data, the cluster assignments are based on each cell's protein expression data. To help identify which cell types the clusters correspond to, we generate a group biomarkers table with every clustering result. Biomarkers are genes or proteins that are expressed highly in a cluster when compared with the other clusters. While the clustering was calculated using only the protein expression data, the biomarkers are drawn from both gene and protein expression data.
The far-right cluster, cluster 8, has several interesting biomarkers. The top biomarker is CXCL13, a gene expressed by follicular B helper T cells (Tfh cells). Two of the other biomarkers are PD-1 protein, which is expressed in Tfh cells, promotes self-tolerance, and is a target for immunotherapy drugs; and TIGIT protein, another immunotherapy drug target that promotes self-tolerance.
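To make the notion of a biomarker concrete, the sketch below ranks features by how much higher their mean expression is inside one cluster than in all other cells. This is a deliberately simple stand-in: the tutorial does not describe the statistic Partek Flow uses for its biomarkers table, and the `rank_biomarkers` helper, the cell names, and the values are hypothetical.

```python
def rank_biomarkers(expression, clusters, target):
    """Rank features by mean expression in `target` minus the mean in all other cells.

    `expression` maps cell -> {feature: normalized value};
    `clusters` maps cell -> cluster label.
    """
    features = list(next(iter(expression.values())))
    inside = [cell for cell in expression if clusters[cell] == target]
    outside = [cell for cell in expression if clusters[cell] != target]

    def mean(cells, feature):
        return sum(expression[cell][feature] for cell in cells) / len(cells)

    scores = {f: mean(inside, f) - mean(outside, f) for f in features}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical values for four cells in two clusters.
expression = {
    "a": {"CXCL13": 3.0, "NKG7": 0.2},
    "b": {"CXCL13": 2.6, "NKG7": 0.4},
    "c": {"CXCL13": 0.1, "NKG7": 2.0},
    "d": {"CXCL13": 0.3, "NKG7": 1.8},
}
clusters = {"a": 8, "b": 8, "c": 1, "d": 1}

ranking = rank_biomarkers(expression, clusters, target=8)
print(ranking)  # ['CXCL13', 'NKG7']
```

Because the features passed in can be genes or proteins, a ranking like this can draw on both data types even when the cluster labels came from protein data alone.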
...
PD-1 expression is highest in cluster 8 with high expression throughout the cluster (Figure 32).
...
It is interesting to note that this pattern of PD-1 expression is not easily discernible at the PDCD1 gene expression level (Figure 33).
...
The Tfh cell marker, CXCL13, is highly and specifically expressed in cluster 8 (Figure 34), so we will classify the cells from cluster 8 as Tfh cells.
- Choose Graph-based from the Color by drop-down menu
- Choose Graph-based from the Select by drop-down in the Attributes tab of the Selection / Filtering section of the control panel (Figure 35)
- Click the check box for 8 to select cluster 8 (Figure 36)
- Click Classify selection
- Name the cells Tfh cells
- Click Save
...
- Choose Classifications from the Color by drop-down menu (Figure 37)
To apply the classification so that it would be available in downstream tasks like differential analysis, we would click Apply classifications. Classifications that are not applied are not available in downstream analysis tasks, but are saved in a draft state on the task report where they were created. Here, we will not save the classification, but we will see how to do this later in the tutorial.
...
- Click Apply
- Click Finish to run (Figure 38)
...
- Mouse over the Scree plot to identify the point where additional PCs offer little additional information (Figure 39)
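Reading a Scree plot by eye can be mimicked with a simple rule: keep leading PCs until one explains less than some minimum share of the variance. The sketch below shows one such heuristic. The eigenvalues, the `min_gain` threshold, and the function names are hypothetical, and this is not the method Partek Flow uses.

```python
def explained_variance_ratio(eigenvalues):
    """Fraction of total variance explained by each principal component."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


def choose_pc_count(eigenvalues, min_gain=0.03):
    """Count the leading PCs that each explain at least `min_gain` of the variance."""
    count = 0
    for ratio in explained_variance_ratio(eigenvalues):
        if ratio < min_gain:
            break  # later PCs add little additional information
        count += 1
    return count


# Hypothetical eigenvalues, sorted from largest to smallest.
eigenvalues = [40.0, 25.0, 15.0, 10.0, 5.0, 2.0, 1.5, 1.0, 0.5]
n_pcs = choose_pc_count(eigenvalues)
print(n_pcs)  # 5
```

The threshold is a judgment call, just as it is when inspecting the plot; a stricter or looser `min_gain` shifts the chosen number of PCs.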
...
- Click the Merged counts data node
- Click Exploratory analysis in the toolbox
- Click Graph-based clustering
- Click Gene Expression for Include features where "Feature type" is
- Click Configure to access the advanced settings
- Set Number of principal components to 15
- Click Apply
- Click Finish to run (Figure 40)
...
The UMAP task report includes a scatter plot with the clustering results coloring the points (Figure 41).
...
- Choose Expression from the Color by drop-down menu
- Type NKG7 in the search box and choose NKG7 from the drop-down (Figure 45)
This will color the plot by NKG7 gene expression, a marker for cytotoxic cells. We can color by two T cell protein markers to distinguish cytotoxic T cells from helper T cells.
...
By default, any cell that expresses >= 1 normalized count of NKG7 is now selected (Figure 48).
...
We have now selected only cells that express >= 1 normalized count for NKG7 gene and CD3 protein, but also have <= 2 normalized count for CD4 protein (Figure 49).
...
We have now selected the CD4 positive, CD3 positive, NKG7 negative helper T cells (Figure 50).
...
The zoom level will also be reset (Figure 52).
...
There are several clusters that show high levels of CD19 protein expression (Figure 53). We can filter to these cells to examine them more closely.
...
- Choose Graph-based from the Color by drop-down menu (Figure 55)
...
This will color the plot by IGHD and IGHA1 (Figure 57).
...
This produces a Filtered groups data node (Figure 62).
...
This will produce two data nodes, one for each data type (Figure 63).
...
The report lists each feature tested, giving p-value, false discovery rate adjusted p-value (FDR step up), and fold change values for each comparison (Figure 65).
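For readers unfamiliar with the adjusted p-value column, the sketch below computes false discovery rate adjusted p-values, assuming that "FDR step up" corresponds to the Benjamini-Hochberg step-up procedure. The function name and the input p-values are hypothetical, and the code is an illustration rather than Partek Flow's own implementation.

```python
def fdr_step_up(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonic adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four features.
p_values = [0.01, 0.04, 0.03, 0.005]
adjusted = fdr_step_up(p_values)
print([round(value, 4) for value in adjusted])  # [0.02, 0.04, 0.04, 0.02]
```

Each adjusted value is at least as large as its raw p-value, which is why fewer features pass a cut-off on the adjusted column than on the raw one.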
...
This opens a violin plot showing CD25 expression for cells in each of the classifications (Figure 66).
...
This generates a customized heat map to illustrate how the cell types differ in their protein expression (Figure 68).
Gene expression
We can use a similar approach to analyze the gene expression data.
...
Each gene is shown as a point on the plot, with cut-off lines for fold change and p-value or FDR step up set using the control panel on the left (Figure 70). The number of genes up- and down-regulated according to the cut-offs is listed at the bottom of the plot. Mousing over a point shows the gene name and other information.
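The counts of up- and down-regulated genes follow directly from the two cut-offs. The sketch below shows that bookkeeping, assuming a signed fold change (negative for down-regulation) and cut-offs of 2 for fold change and 0.05 for the adjusted p-value. These cut-offs, the gene names, and the values are hypothetical.

```python
def classify_genes(results, fold_change_cutoff=2.0, fdr_cutoff=0.05):
    """Label each gene 'up', 'down', or 'not significant'.

    `results` maps gene -> (signed fold change, FDR-adjusted p-value).
    """
    labels = {}
    for gene, (fold_change, fdr) in results.items():
        if fdr <= fdr_cutoff and fold_change >= fold_change_cutoff:
            labels[gene] = "up"
        elif fdr <= fdr_cutoff and fold_change <= -fold_change_cutoff:
            labels[gene] = "down"
        else:
            labels[gene] = "not significant"
    return labels


# Hypothetical differential analysis results.
results = {
    "gene_a": (3.5, 0.001),   # large increase, significant
    "gene_b": (-4.2, 0.01),   # large decrease, significant
    "gene_c": (1.3, 0.0001),  # significant but small change
    "gene_d": (5.0, 0.2),     # large change but not significant
}

labels = classify_genes(results)
n_up = sum(label == "up" for label in labels.values())
n_down = sum(label == "down" for label in labels.values())
print(n_up, n_down)  # 1 1
```

A gene must pass both cut-offs to be counted, so points with a large fold change but weak significance, or the reverse, fall outside both tallies.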
...
The number at the top of the filter will update to show the number of included genes (Figure 71).
...
The pathway enrichment results list KEGG pathways, giving an enrichment score and p-value for each (Figure 72).
To get a better idea about the changes in each enriched pathway, we can view an interactive KEGG pathway map.
...
The KEGG pathway map shows up-regulated genes from the input list in red and down-regulated genes from the input list in green (Figure 73).
Final pipeline
References
[1] Stoeckius, M., Hafemeister, C., Stephenson, W., Houck-Loomis, B., Chattopadhyay, P. K., Swerdlow, H., ... & Smibert, P. (2017). Simultaneous epitope and transcriptome measurement in single cells. Nature methods, 14(9), 865.
...
[3] Mimitou, E., Cheng, A., Montalbano, A., Hao, S., Stoeckius, M., Legut, M., ... & Satija, R. (2018). Expanding the CITE-seq tool-kit: Detection of proteins, transcriptomes, clonotypes and CRISPR perturbations with multiplexing, in a single assay. bioRxiv, 466466.
...
...