Next, we will filter out certain cells and re-split the data. Re-splitting is useful if you want to perform differential analysis and downstream analysis separately for proteins and genes. For your own analyses, re-splitting is optional; you could just as well continue with differential analysis on the merged data.
Filter Groups
Because we have classified our cells, we can now filter based on those classifications. This can be used to focus on a single cell type for re-clustering and sub-classification or to exclude cells that are not of interest for downstream analysis.
- Click the Classified result data node
- Click Filtering
- Click Filter groups
- Set to exclude Cell type is Doublets using the drop-down menus
- Click AND
- Set the second filter to exclude Cell type is N/A using the drop-down menus
- Click Finish to apply the filter (Figure 1)
![](/download/attachments/19333489/CITE-Seq_filter_groups.png?version=1&modificationDate=1593698001592&api=v2)
Figure 1. Set up the Filter groups task to exclude Doublets and cells that are not classified
This produces a Filtered counts data node (Figure 2).
![](/download/attachments/19333489/Filtered_counts_output.png?version=1&modificationDate=1593698260778&api=v2)
Figure 2. Filter groups output
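The logic of the Filter groups step can be sketched in a few lines of Python. This is only an illustration of the two exclusion rules joined by AND, not how Partek Flow implements the task; the barcodes, the `cells` list, and the `filter_groups` helper are hypothetical.

```python
# Hypothetical per-cell annotations; in the tutorial these come from the
# Classified result data node.
cells = [
    {"barcode": "AAAC-1", "cell_type": "Activated B cells"},
    {"barcode": "AAAG-1", "cell_type": "Doublets"},
    {"barcode": "AACT-1", "cell_type": "Mature B cells"},
    {"barcode": "AAGT-1", "cell_type": "N/A"},
]

# The two groups excluded in the Filter groups dialog.
EXCLUDED_TYPES = {"Doublets", "N/A"}


def filter_groups(cells, excluded=EXCLUDED_TYPES):
    """Keep only cells whose cell type is in neither excluded group."""
    return [cell for cell in cells if cell["cell_type"] not in excluded]


filtered = filter_groups(cells)
print([cell["barcode"] for cell in filtered])
```

Excluding "Doublets" AND excluding "N/A" means a cell is kept only when it belongs to neither group, which is why a single set-membership test is enough.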
Re-split the Matrix
- Click the Filtered counts data node
- Click Pre-analysis tools
- Click Split by feature type
This will produce two data nodes, one for each data type (Figure 3). The split data nodes will both retain cell classification information.
![](/download/attachments/19333489/Re_split_data.png?version=1&modificationDate=1593781108231&api=v2)
Figure 3. It is possible to re-split the merged matrix once again
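Conceptually, splitting by feature type partitions the rows (features) of the merged matrix while leaving the per-cell annotations untouched. The sketch below illustrates that idea under stated assumptions: the `merged` structure, the example features and counts, and the `split_by_feature_type` helper are all hypothetical and do not reflect Partek Flow's internal data model.

```python
from collections import defaultdict

# Hypothetical merged matrix: one row of counts per feature, one column per cell.
merged = {
    "features": [
        {"name": "CD19", "type": "Gene Expression"},
        {"name": "MS4A1", "type": "Gene Expression"},
        {"name": "CD45RA_TotalSeqB", "type": "Antibody Capture"},
    ],
    "counts": [[3, 0], [5, 1], [120, 45]],
    "cell_types": ["Activated B cells", "Mature B cells"],
}


def split_by_feature_type(matrix):
    """Group feature rows by type; every part keeps the cell classifications."""
    parts = defaultdict(lambda: {"features": [], "counts": []})
    for feature, row in zip(matrix["features"], matrix["counts"]):
        parts[feature["type"]]["features"].append(feature["name"])
        parts[feature["type"]]["counts"].append(row)
    # Copy the per-cell annotation into each part so both retain it.
    return {
        feature_type: {**part, "cell_types": list(matrix["cell_types"])}
        for feature_type, part in parts.items()
    }


split = split_by_feature_type(merged)
print(sorted(split))
```

Because only rows are partitioned, each resulting part still has one column per cell and the same cell type labels, which mirrors the statement that both split data nodes retain classification information.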
Differential Analysis and Visualization - Protein Data
Once we have classified our cells, we can use this information to perform comparisons between cell types or between experimental groups for a cell type. In this project, we only have a single sample, so we will compare cell types.
- Click the Antibody Capture data node
- Click Differential analysis
- Click GSA
The first step is to choose which attributes we want to consider in the statistical test.
- Check Cell type to include it in the statistical test
- Click Next
Next, we will set up the comparison we want to make. Here, we will compare the Activated and Mature B cells.
- Check Activated B cells in the top panel
- Check Mature B cells in the bottom panel
- Click Add comparison
The comparison should appear in the table as Activated B cells vs. Mature B cells.
- Click Finish to run the statistical test (Figure 4)
![](/download/attachments/19333489/CITE-Seq_GSA_protein_comparison.png?version=1&modificationDate=1593781325790&api=v2)
Figure 4. Setting up a comparison for differentially expressed proteins
The GSA task produces a GSA data node.
- Double-click the GSA data node to open the task report
The report lists each feature tested, giving p-value, false discovery rate adjusted p-value (FDR step up), and fold change values for each comparison (Figure 5).
![](/download/attachments/19333489/GSA_protein_result.png?version=1&modificationDate=1594290769929&api=v2)
Figure 5. GSA report for protein expression data
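To make the FDR step up column in the report more concrete, here is a minimal sketch of the Benjamini-Hochberg step-up adjustment, assuming that this is the procedure the column name refers to. The `fdr_step_up` function and the example p-values are hypothetical; the sketch does not reproduce the GSA model itself, only the multiple-testing adjustment applied to a list of p-values.

```python
def fdr_step_up(p_values):
    """Benjamini-Hochberg step-up adjusted p-values, returned in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value can never
    # exceed the adjusted value of a larger p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four features.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = fdr_step_up(raw)
for p, q in zip(raw, adjusted):
    print(f"p={p:.3f}  adjusted={q:.3f}")
```

Each raw p-value is scaled by the number of tests divided by its rank, so features with small p-values in a long list can still end up with adjusted values above a chosen cutoff.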
In addition to the listed information, we can access dot and violin plots for each gene or protein from this table.
- Click the plot icon in the CD45RA_TotalSeqB row
This opens a dot plot in a new data viewer session, showing CD45RA expression for cells in each of the classifications (Figure 6).
![](/download/attachments/19333489/CD45RA_dot_plot.png?version=1&modificationDate=1594290953380&api=v2)
Figure 6. CD45RA dot plot for all cells
We can use the Configuration panel on the left to edit this plot.
- Expand the Summary card
- Switch on Violins
- Switch on Overlay
- Switch on Colored
- Expand the Data card
- Use the slider to increase the Jitter
- Expand the Color card
- Use the slider to decrease the Opacity (Figure 7)
![](/download/attachments/19333489/CD45RA_dot_plot_edited.png?version=1&modificationDate=1594291068495&api=v2)
Figure 7. Use the Configuration panel to configure the dot plot
- Click the project name to return to the Analyses tab
To visualize all of the proteins at the same time, we can make a hierarchical clustering heat map.
- Click the GSA data node
- Click Exploratory analysis in the toolbox
- Click Hierarchical clustering/heat map
- In the Ordering section, choose Cell type from the Sample order drop-down list
- Click Finish to run with the other default settings
- Double-click the Hierarchical clustering task node to open the heat map (Figure 8)
![](/download/attachments/19333489/Protein_heatmap_before_configuration.png?version=3&modificationDate=1615566027000&api=v2)
Figure 8. Heatmap showing expression of protein markers before configuration
The heat map can easily be customized using the Configuration card on the left.
- In the Layout section, expand the Axis titles card
- Disable the Row labels
- Activate the Transpose switch (Figure 9)
![](/download/attachments/19333489/Configure_layout_card.png?version=1&modificationDate=1615566539631&api=v2)
Figure 9. Configure the Layout section in the Configuration card
- In the Annotations section, expand the Data card
- Click the grey circle and choose Merged counts as the data source
- Choose Cell type from the drop-down menu (Figure 10)
![](/download/attachments/19333489/Configure_annotations_card.png?version=1&modificationDate=1615567242308&api=v2)
Figure 10. Configure the Annotations section in the Configuration card
- In the Heatmap section, expand the Range card
- Set the Min and Max to -1.2 and 1.2, respectively (Figure 11)
![](/download/attachments/19333489/Configure_heatmap_card.png?version=1&modificationDate=1615567633470&api=v2)
Figure 11. Configure the Heatmap section of the Configuration card
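Narrowing the color range to -1.2 and 1.2 makes moderate differences easier to see, because extreme values no longer stretch the color scale. The sketch below illustrates the idea, assuming the heat map values are standardized per feature (mean 0, standard deviation 1) before plotting; that scaling, the use of the population standard deviation, and the helper names are assumptions made for illustration rather than confirmed Partek Flow behavior.

```python
from statistics import mean, pstdev


def standardize(values):
    """Shift and scale values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # A constant feature carries no contrast; map it to the midpoint.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def clamp(values, lo=-1.2, hi=1.2):
    """Limit values to the color range; anything beyond gets the end color."""
    return [max(lo, min(hi, v)) for v in values]


# Hypothetical expression of one protein across four cells.
expression = [1.0, 2.0, 3.0, 10.0]
z_scores = standardize(expression)
colored = clamp(z_scores)
print([round(v, 3) for v in colored])
```

In this example the single high value is drawn with the maximum color, while the three lower values keep their relative differences instead of being compressed near one end of the scale.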
Feel free to explore the other options in the Configuration card on the left to customize the plot further (Figure 12).
![](/download/attachments/19333489/Protein_heatmap_after_configuration.png?version=2&modificationDate=1615567806791&api=v2)
Figure 12. Heatmap showing expression of protein markers after configuration. Use the configuration card on the left to customize the plot further
Differential Analysis, Visualization, and Pathway Analysis - Gene Expression Data
We can use a similar approach to analyze the gene expression data.
- Click the project name to return to the Analyses tab
- Click the Gene Expression data node
- Click Differential analysis
- Click GSA
- Check Cell type to include it in the statistical test
- Click Next
- Check Activated B cells in the top panel
- Check Mature B cells in the bottom panel
- Click Add comparison
- Click Finish to run the statistical test
As before, this will generate a GSA task node and a GSA data node.
- Double-click the GSA task node to open the task report (Figure 13)
![](/download/attachments/19333489/GSA_genes_list.png?version=1&modificationDate=1594291417090&api=v2)
Figure 13. GSA report for the gene expression data
Because more than 20,000 genes have been analyzed, it is useful to use a volcano plot to get an idea about the overall changes.
- Click the icon in the top right corner of the table to open a volcano plot
The Volcano plot opens in a new data viewer session, in a new tab in the web browser. It shows each gene as a point with cutoff lines set for P-value (y-axis) and fold-change (x-axis). By default, the P-value cutoff is set to 0.05 and the fold-change cutoff is set at |2| (Figure 14).
The plot can be configured using various options in the Configuration card on the left. For example, the Color, Size and Shape cards can be used to change the appearance of the points. The X and Y-axes can be changed in the Data card. The Significance card can be used to set different Fold-change and P-value thresholds for coloring up/down-regulated genes.
![](/download/attachments/19333489/Volcano_plot_CITE-Seq.png?version=1&modificationDate=1594291498777&api=v2)
Figure 14. The volcano plot can be configured using various options in the Configuration and Selection cards
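As a rough illustration of how a gene ends up at a given position and color on a volcano plot, the sketch below places each gene at the log2 of its fold change on the x-axis and the negative log10 of its p-value on the y-axis, then labels it using the default cutoffs of 0.05 and 2. The log scaling, the signed fold-change input (values of at least 1 in magnitude, negative for down-regulation), the treatment of values exactly at a cutoff, and the `volcano_point` helper are assumptions for illustration, not a description of the data viewer's implementation.

```python
import math


def volcano_point(p_value, fold_change, p_cutoff=0.05, fc_cutoff=2.0):
    """Return (x, y, status) for one gene.

    fold_change is assumed to be signed, e.g. 4.0 or -4.0, never between
    -1 and 1. p_value must be greater than 0.
    """
    if abs(fold_change) < 1:
        raise ValueError("expected a signed fold change with magnitude >= 1")
    # Symmetric x-axis: up-regulated genes to the right, down to the left.
    x = math.copysign(math.log2(abs(fold_change)), fold_change)
    # Smaller p-values plot higher on the y-axis.
    y = -math.log10(p_value)
    if p_value <= p_cutoff and fold_change > fc_cutoff:
        status = "up"
    elif p_value <= p_cutoff and fold_change < -fc_cutoff:
        status = "down"
    else:
        status = "not significant"
    return x, y, status


# Hypothetical genes: (p-value, signed fold change).
for p, fc in [(0.001, 4.0), (0.001, -4.0), (0.2, 8.0), (0.01, 1.5)]:
    print(volcano_point(p, fc))
```

A gene needs to pass both cutoffs to be colored as up- or down-regulated, which is why the third and fourth hypothetical genes are labeled as not significant despite a large fold change or a small p-value.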
- Click the GSA report tab in your web browser to return to the full report
We can filter the full set of genes to include only the significantly different genes using the filter panel on the left.
- Click FDR step up
- Type 0.05 for the cutoff and press Enter on your keyboard
- Click Fold change
- Set to From -2 to 2 and press Enter on your keyboard
The number at the top of the filter will update to show the number of included genes (Figure 15).
![](/download/attachments/19333489/GSA_genes_significant_list.png?version=1&modificationDate=1594291602936&api=v2)
Figure 15. Use the panel on the left to filter the list for significant genes
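The two filters combine as follows: a gene is kept when its FDR step up value is at or below 0.05 and its fold change lies outside the range from -2 to 2. The sketch below expresses that rule, assuming fold changes are reported on a signed scale where a ratio below 1 is shown as its negative reciprocal. The gene names, the `signed_fold_change` and `is_significant` helpers, and the handling of values exactly at a cutoff are hypothetical.

```python
def signed_fold_change(ratio):
    """Convert a positive ratio to a signed fold change (0.25 becomes -4.0)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1 else -1.0 / ratio


def is_significant(fdr, fold_change, fdr_cutoff=0.05, fc_cutoff=2.0):
    """True when the gene passes the FDR cutoff and is outside -cutoff..cutoff."""
    return fdr <= fdr_cutoff and abs(fold_change) > fc_cutoff


# Hypothetical rows: gene name, FDR step up, ratio between the two cell types.
results = [
    ("GENE_A", 0.001, 4.0),
    ("GENE_B", 0.030, 0.25),
    ("GENE_C", 0.200, 8.0),
    ("GENE_D", 0.010, 1.5),
]

significant = [
    name
    for name, fdr, ratio in results
    if is_significant(fdr, signed_fold_change(ratio))
]
print(significant)
```

On a signed scale, excluding the range from -2 to 2 keeps both strongly up-regulated and strongly down-regulated genes with one symmetric rule.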
- Click the button in the report that creates a new data node including only these significantly different genes
A task, Differential analysis filter, will run and generate a new Filtered feature list data node. We can get a better idea about the biology underlying these gene expression changes using gene set or pathway enrichment. Note that you need to have the Pathway toolkit enabled to perform the next steps.
- Click the Filtered feature list data node
- Click Biological interpretation in the toolbox
- Click Pathway enrichment
- Make sure that Homo sapiens is selected in the Species drop-down menu
- Click Finish to run
- Double-click the Pathway enrichment task node to open the task report
The pathway enrichment results list KEGG pathways, giving an enrichment score and p-value for each (Figure 16).
![](/download/attachments/19333489/Pathway_enrichment_results_CITE-Seq.png?version=1&modificationDate=1593786966461&api=v2)
Figure 16. Results of pathway enrichment test
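To give a sense of what an enrichment p-value and score measure, the sketch below computes a one-sided Fisher's exact (hypergeometric) p-value for the overlap between a gene list and a pathway, and reports the score as the negative natural log of that p-value. Both the choice of test and the score definition are assumptions made for illustration; the function name and the small example numbers are hypothetical and are not taken from the tutorial data.

```python
import math


def enrichment_p_value(overlap, list_size, pathway_size, background_size):
    """Probability of at least `overlap` pathway genes in a random gene list.

    overlap: genes that are in both the filtered list and the pathway
    list_size: genes in the filtered list
    pathway_size: pathway genes present in the background
    background_size: all genes that were tested
    """
    total = math.comb(background_size, list_size)
    upper = min(list_size, pathway_size)
    # Sum the hypergeometric probabilities for overlap, overlap + 1, ...
    tail = sum(
        math.comb(pathway_size, k)
        * math.comb(background_size - pathway_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 10 genes tested, 4 in the pathway, 3 in the list,
# 2 of which belong to the pathway.
p = enrichment_p_value(overlap=2, list_size=3, pathway_size=4, background_size=10)
score = -math.log(p)
print(f"p-value={p:.4f}  enrichment score={score:.4f}")
```

Under this definition, a higher enrichment score corresponds to a smaller p-value, meaning the pathway is more over-represented in the filtered gene list than expected by chance.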
To get a better idea about the changes in each enriched pathway, we can view an interactive KEGG pathway map.
- Click path:hsa05202 in the Transcriptional misregulation in cancer row
The KEGG pathway map shows up-regulated genes from the input list in red and down-regulated genes from the input list in green (Figure 17).
![](/download/attachments/19333489/Transcriptional_misregulation_in_cancer.png?version=1&modificationDate=1594291772676&api=v2)
Figure 17. Transcriptional misregulation in cancer pathway with significant genes highlighted in green and red
![](/download/attachments/19333489/CITE-Seq_fInal_pipeline.png?version=1&modificationDate=1593787802862&api=v2)
Figure 18. Final CITE-Seq pipeline