Next, we will filter out certain cells and re-split the data. Re-splitting is useful if you want to perform differential analysis and downstream analysis separately for proteins and genes. For your own analyses, re-splitting is optional; you can instead continue to differential analysis with the merged data.

Filter Groups

Because we have classified our cells, we can now filter based on those classifications. This can be used to focus on a single cell type for re-clustering and sub-classification or to exclude cells that are not of interest for downstream analysis.


This produces a Filtered counts data node (Figure 2).
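Outside the interface, the same idea can be sketched in a few lines of Python. This is only an illustration: the barcodes are made up, and the choice to drop Doublets and unclassified (N/A) cells is an assumption for the example, not the filter criteria used in this tutorial.

```python
# Hypothetical per-cell classifications; barcodes are invented for illustration.
cells = [
    {"barcode": "AAAC-1", "classification": "Activated B cells"},
    {"barcode": "AAAG-1", "classification": "Mature B cells"},
    {"barcode": "AACT-1", "classification": "Doublets"},
    {"barcode": "AAGT-1", "classification": None},  # unclassified (N/A)
]

# Assumption: these are the groups we want to exclude.
EXCLUDED = {"Doublets", None}


def filter_cells(cells, excluded=EXCLUDED):
    """Keep only cells whose classification is not in the excluded set."""
    return [c for c in cells if c["classification"] not in excluded]


filtered = filter_cells(cells)
print([c["barcode"] for c in filtered])
```

Inverting the test (keeping only cells whose classification is in a chosen set) gives the other use mentioned above: focusing on a single cell type for re-clustering.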


Re-split the Matrix

This will produce two data nodes, one for each data type (Figure 3). The split data nodes will both retain cell classification information.
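Conceptually, the split partitions the features by type while leaving the per-cell annotations untouched. A minimal sketch, assuming a merged matrix keyed by feature name and the feature-type labels "Antibody Capture" and "Gene Expression" (both the data layout and the labels are assumptions for illustration):

```python
# Hypothetical merged matrix: feature name -> (feature type, per-cell counts).
merged = {
    "CD19_protein": ("Antibody Capture", [12, 40, 3]),
    "CD3_protein": ("Antibody Capture", [0, 2, 55]),
    "MS4A1": ("Gene Expression", [5, 9, 0]),
    "CD3E": ("Gene Expression", [0, 0, 7]),
}
# One classification per cell, in the same order as the count columns.
cell_labels = ["Activated B cells", "Mature B cells", "T cells"]


def split_by_type(matrix):
    """Group features by their type; the cell (column) order is unchanged."""
    out = {}
    for feature, (feature_type, counts) in matrix.items():
        out.setdefault(feature_type, {})[feature] = counts
    return out


split = split_by_type(merged)
# Both outputs carry the same cell classifications.
protein = {"counts": split["Antibody Capture"], "labels": cell_labels}
gene = {"counts": split["Gene Expression"], "labels": cell_labels}
print(sorted(protein["counts"]), sorted(gene["counts"]))
```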


Differential Analysis and Visualization - Protein Data

Once we have classified our cells, we can use this information to perform comparisons between cell types or between experimental groups for a cell type. In this project, we only have a single sample, so we will compare cell types.

The first step is to choose which attributes we want to consider in the statistical test. 

Next, we will set up the comparison we want to make. Here, we will compare the Activated and Mature B cells.

The comparison should appear in the table as Activated B cells vs. Mature B cells.


The ANOVA task produces an ANOVA data node.

The report lists each feature tested, giving the p-value, the false discovery rate adjusted p-value (FDR step up), and the fold change for each comparison (Figure 5).
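To make the FDR step up column more concrete, here is a minimal sketch of how such adjusted p-values can be computed. It assumes that "FDR step up" refers to the Benjamini-Hochberg step-up procedure; the function name and the example p-values are hypothetical, and the software's own implementation may differ in detail.

```python
def fdr_step_up(p_values):
    """Benjamini-Hochberg step-up adjusted p-values, returned in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


p = [0.01, 0.04, 0.03, 0.20]  # made-up raw p-values for four features
adj = fdr_step_up(p)
print(adj)
```

Each raw p-value is scaled by the number of tests divided by its rank, so features tested alongside many others need a smaller raw p-value to pass the same cutoff.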


In addition to the listed information, we can access dot and violin plots for each gene or protein from this table.

This opens a dot plot in a new data viewer session, showing CD45A expression for cells in each of the classifications (Figure 6). First, we exclude Doublets and N/A cells from the plot:


We can use the Configuration panel on the left to edit this plot.


To visualize all of the proteins at the same time, we can make a hierarchical clustering heatmap.

The heatmap can easily be customized using the tools on the left.







Feel free to explore the other tool options on the left to customize the plot further.

Differential Analysis, Visualization, and Pathway Analysis - Gene Expression Data

We can use a similar approach to analyze the gene expression data.

The comparison should appear in the table as Activated B cells vs. Mature B cells.

As before, this will generate an ANOVA task node and an ANOVA data node.


Because more than 20,000 genes have been analyzed, it is useful to use a volcano plot to get an idea about the overall changes.

The volcano plot opens in a new data viewer session, in a new tab in the web browser. It shows each gene as a point, with cutoff lines set for P-value (y-axis) and fold change (x-axis). By default, the P-value cutoff is set to 0.05 and the fold-change cutoff is set to an absolute value of 2, that is, 2 in either direction (Figure 14).
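The logic behind the cutoff lines can be sketched as follows. The gene names and values are invented, the fold change is assumed to be signed (2 means two-fold up, -2 means two-fold down), and whether the cutoffs are inclusive is an assumption of this sketch rather than documented behavior.

```python
def classify(p_value, fold_change, p_cutoff=0.05, fc_cutoff=2.0):
    """Label a gene relative to the volcano plot cutoffs.

    Assumes a signed fold change and inclusive cutoffs.
    """
    if p_value > p_cutoff or abs(fold_change) < fc_cutoff:
        return "not significant"
    return "up" if fold_change > 0 else "down"


# Hypothetical results: gene -> (p-value, signed fold change).
results = {
    "GENE_A": (0.001, 3.5),
    "GENE_B": (0.20, 4.0),   # large change, but fails the P-value cutoff
    "GENE_C": (0.01, -2.6),
    "GENE_D": (0.03, 1.4),   # passes the P-value cutoff, but the change is small
}

labels = {gene: classify(p, fc) for gene, (p, fc) in results.items()}
print(labels)
```

A gene has to pass both cutoffs to be colored as up- or down-regulated, which is why points in the upper-left and upper-right corners of the plot are the ones highlighted.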

The plot can be configured using the tools on the left. For example, the Style icon changes the appearance of the points, the Axes icon changes the X and Y axes, and the Statistics icon sets different fold-change and P-value thresholds for coloring up- and down-regulated genes. The in-plot controls can be used to transpose the volcano plot (Figure 14).


We can filter the full set of genes to include only the significantly different genes using the filter panel on the left.

The number at the top of the filter will update to show the number of included genes (Figure 15).


A task, Differential analysis filter, will run and generate a new Filtered Feature list data node. We can get a better idea of the biology underlying these gene expression changes using gene set or pathway enrichment. Note that the Pathway toolkit must be enabled to perform the next steps.

The pathway enrichment results list KEGG pathways, giving an enrichment score and p-value for each (Figure 16).
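For intuition about what an enrichment p-value measures, here is a minimal sketch. It assumes a one-sided hypergeometric (Fisher-style) test and, purely for illustration, defines the enrichment score as the negative natural log of that p-value; the exact test and score definition used by the software are not stated in this tutorial, and all counts below are made up.

```python
from math import comb, log


def enrichment(n_background, n_pathway, n_list, n_overlap):
    """Upper-tail hypergeometric test.

    Returns (p_value, score), where p_value is the probability of seeing at
    least n_overlap pathway genes in a random list of n_list genes, and
    score = -ln(p_value) (an assumed definition for this sketch).
    """
    total = comb(n_background, n_list)
    upper = min(n_pathway, n_list)
    favorable = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_list - k)
        for k in range(n_overlap, upper + 1)
    )
    p_value = favorable / total
    return p_value, -log(p_value)


# Toy numbers: 20 background genes, 5 in the pathway,
# 4 genes in our filtered list, 2 of which are in the pathway.
p_value, score = enrichment(20, 5, 4, 2)
print(p_value, score)
```

The smaller the p-value, the higher the score, so sorting pathways by descending score puts the most over-represented pathways first.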


To get a better idea about the changes in each enriched pathway, we can view an interactive KEGG pathway map.

The KEGG pathway map shows up-regulated genes from the input list in red and down-regulated genes from the input list in green (Figure 17).