t-SNE (t-distributed stochastic neighbor embedding) is a visualization method commonly used to analyze single-cell RNA-Seq data. Each cell is shown as a point on the plot, positioned so that it is close to cells with similar overall gene expression. When working with multiple samples, a t-SNE plot can be drawn for each sample, or all samples can be combined into a single plot. Viewing samples individually is the default in Partek® Flow® because sample-to-sample variation and outlier samples can obscure cell type differences if all samples are plotted together. However, as you will see in this tutorial, in some data sets cell type differences can be visualized even when samples are combined.

Using the t-SNE plot, cells can be classified based on clustering results and differences in expression of key marker genes. 

Multiple single-sample t-SNE plots

Prior to performing t-SNE, it is a good idea to reduce the dimensionality of the data using principal components analysis (PCA).


Note that the Split by sample checkbox is selected by default. This means that the dimensionality reduction will be performed on each sample separately.
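To illustrate what performing the reduction on each sample separately means, here is a minimal sketch in plain Python. It is not Partek Flow's implementation: the function names `first_component` and `pca_split_by_sample`, the power-iteration method, and the toy data are all assumptions made for illustration. The sketch groups cells by sample and estimates the first principal component within each group.

```python
import math
from collections import defaultdict


def first_component(rows, iterations=50):
    """Estimate the first principal component of `rows` by power iteration."""
    n, d = len(rows), len(rows[0])
    # Center each feature using the mean of THIS group of cells only.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Asymmetric start vector so it is unlikely to be orthogonal to the answer.
    v = [float(j + 1) for j in range(d)]
    for _ in range(iterations):
        # Multiply v by X^T X, which is proportional to the covariance matrix.
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance along the current direction
        v = [x / norm for x in w]
    return v


def pca_split_by_sample(cells):
    """cells: list of (sample_name, feature_vector) pairs."""
    by_sample = defaultdict(list)
    for sample, features in cells:
        by_sample[sample].append(features)
    # Each sample is reduced independently of the others.
    return {sample: first_component(rows) for sample, rows in by_sample.items()}


# Hypothetical toy data: two features per cell, two made-up samples.
cells = [
    ("sample_A", [1.0, 0.0]), ("sample_A", [2.0, 0.0]), ("sample_A", [3.0, 0.0]),
    ("sample_B", [0.0, 1.0]), ("sample_B", [0.0, 2.0]), ("sample_B", [0.0, 3.0]),
]
components = pca_split_by_sample(cells)
```

In this toy case each sample gets its own axis (the first feature for sample_A, the second for sample_B), because the centering and the variance are computed within each sample rather than across all cells.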


PCA task and data nodes will be generated.



Because the upstream PCA task was performed separately for each sample, the t-SNE task will also be performed separately for each sample. t-SNE task and data nodes will be generated (Figure 5).


Once the t-SNE task has completed, we can view the t-SNE plots.

The t-SNE plot will open in a new data viewer session. The plot for the first sample in the data set, MGH36 (Figure 6), will open on the canvas. Please note that the appearance of the t-SNE plot may differ each time it is drawn, so your t-SNE plots may look different from those shown in this tutorial. However, the cell-to-cell relationships indicated will be the same.


The t-SNE plot is in 3D by default. To change the default, click your avatar in the top right > Settings > My Preferences, edit your graphics preferences, and change the default scatter plot format from 3D to 2D.

You can rotate the 3D plot by left-clicking and dragging your mouse. You can zoom in and out using your mouse wheel. The 2D t-SNE is also calculated and you can switch between the 2D and 3D plots on the canvas. We will do this later on in the tutorial.

Each sample has its own plot. We can switch between samples. 

The t-SNE plot has switched to show the next sample, MGH42 (Figure 7).


The goal of this analysis is to compare malignant cells from two different glioma subtypes, astrocytoma and oligodendroglioma. To do this, we need to identify the malignant cells we want to include and the normal cells we want to exclude.

The t-SNE plot in Partek Flow offers several options for identifying, selecting, and classifying cells. In this tutorial, we will use the expression of known marker genes to identify cell types. 

To visualize the expression of a marker gene, we can color cells on the t-SNE plot by their expression level. 


The cells will be colored from black to green based on their expression level of BCAN, with higher-expressing cells appearing greener (Figure 9). BCAN is highly expressed in glioma cells.


In Partek Flow, we can color cells by more than one gene. We will now add a second glioma marker gene, GPM6A. 

Cells expressing GPM6A are now colored red and cells expressing BCAN are colored green. Cells expressing both genes are colored yellow, while cells expressing neither are colored black (Figure 10).
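The two-gene coloring can be thought of as blending color channels. The following is a minimal sketch, not Partek Flow's actual rendering code: the function name `blend_color`, the linear scaling, and the per-gene maximum values are assumptions made for illustration.

```python
def blend_color(bcan, gpm6a, bcan_max, gpm6a_max):
    """Return an (R, G, B) tuple in the range 0-255 for one cell.

    GPM6A drives the red channel and BCAN drives the green channel, so a
    cell expressing both appears yellow and a cell expressing neither
    appears black.
    """
    def scale(value, maximum):
        # Map an expression value linearly onto 0-255 (assumed scaling).
        if maximum <= 0:
            return 0
        clipped = min(max(value, 0.0), maximum)
        return round(255 * clipped / maximum)

    return (scale(gpm6a, gpm6a_max), scale(bcan, bcan_max), 0)


# Hypothetical expression values, with 10.0 as the assumed maximum per gene.
neither = blend_color(0.0, 0.0, 10.0, 10.0)    # (0, 0, 0): black
bcan_only = blend_color(10.0, 0.0, 10.0, 10.0)  # (0, 255, 0): green
gpm6a_only = blend_color(0.0, 10.0, 10.0, 10.0)  # (255, 0, 0): red
both = blend_color(10.0, 10.0, 10.0, 10.0)      # (255, 255, 0): yellow
```

Intermediate expression levels give darker or mixed shades, which is why cells on the plot span a range of colors between the four extremes.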


Numerical expression levels for each gene can be viewed for individual cells. 

The expression level for that cell is displayed on the legend for each gene. Expression values can also be viewed by mousing over a cell (Figure 11).


Now that cells are colored by the expression of two glioma cell markers, we can classify any cell that expresses these genes as a glioma cell. Because t-SNE groups cells that are similar across the high-dimensional gene expression data, we will treat all cells in a group where the majority of cells express BCAN and/or GPM6A as the same cell type, even if an individual cell does not express either marker gene.
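The majority rule described above can be sketched in code. In the tutorial the groups are chosen visually on the plot, so the `group` label on each cell, the function name `classify_groups`, and the 50% cutoff are assumptions made for illustration only.

```python
from collections import defaultdict


def classify_groups(cells, markers=("BCAN", "GPM6A"), min_fraction=0.5):
    """Label every cell in a marker-positive group as 'Glioma'.

    cells: list of dicts with 'id', 'group', and 'expr' (gene -> value).
    A group is marker-positive when more than `min_fraction` of its cells
    express at least one of the marker genes.
    """
    groups = defaultdict(list)
    for cell in cells:
        groups[cell["group"]].append(cell)

    labels = {}
    for members in groups.values():
        positive = sum(
            1 for c in members
            if any(c["expr"].get(gene, 0.0) > 0 for gene in markers)
        )
        if positive / len(members) > min_fraction:
            # The whole group is labeled, including marker-negative cells.
            for c in members:
                labels[c["id"]] = "Glioma"
    return labels


# Hypothetical cells: group 1 is mostly marker-positive, group 2 is not.
example_cells = [
    {"id": "c1", "group": 1, "expr": {"BCAN": 4.2}},
    {"id": "c2", "group": 1, "expr": {"GPM6A": 1.7}},
    {"id": "c3", "group": 1, "expr": {}},
    {"id": "c4", "group": 2, "expr": {}},
    {"id": "c5", "group": 2, "expr": {"BCAN": 0.0}},
]
example_labels = classify_groups(example_cells)
```

Here cell c3 is labeled as glioma even though it expresses neither marker, because it sits in a group where most cells do.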


Selected cells are shown in bold and unselected cells are dimmed. The number of selected cells is indicated in the figure legend. The cells are plotted on the color scale according to their relative expression levels of the two marker genes (Figure 13).


A dialog to give the classification a name will appear.


Once cells have been classified, the classification is added to Classify. The number of cells belonging to the classification is listed. In MGH42, there are 460 glioma cells (Figure 15). 


Classifications made on the t-SNE plot are retained as a draft as part of the data viewer session. In this tutorial, we will classify malignant cells for each sample before we save and apply the classifications. If necessary, you can save the data viewer session by clicking the Save icon on the left, which retains all of the formatting and draft classifications. The data viewer session will be stored under the Data viewer tab and can be re-opened to continue making classifications at a later time.



There should be 5,322 glioma cells in total across all 8 samples.  


With the malignant cells in every sample classified, it is time to save the classifications.


The new attribute is stored in the Data tab and is available to any node in the project.


One multi-sample t-SNE plot

For some data sets, cell types can be distinguished when all samples are visualized together on one t-SNE plot. We will use a t-SNE plot of all samples to classify glioma, microglia, and oligodendrocyte cell types.


The PCA task will run as a new green layer.

The t-SNE task will be added to the green layer (Figure 23). Layers are created in Partek Flow when the same task is run on the same data node. 


Once the task has completed, we can view the plot.



In the 2D t-SNE plot, most cells cluster by sample, but a few clusters contain cells from multiple samples (Figure 26).


Using the marker genes BCAN (glioma), CD14 (microglia), and MAG (oligodendrocytes), we can assess whether these multi-sample clusters belong to our known cell types.

After coloring by these marker genes, three cell populations are clearly visible (Figure 27). 


The red cells are CD14 positive, indicating that they are the microglia from every sample. 


The blue cells are MAG positive, indicating that they are the oligodendrocytes from every sample. 

Finally, we will classify the BCAN-expressing cells on the plot as glioma cells from every sample.

The numbers of cells classified as microglia, oligodendrocytes, and glioma are shown in Classify (Figure 29).
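As a rough illustration of marker-based classification and counting, the sketch below assigns each cell to the cell type of its most highly expressed marker gene and tallies the results. This is a simplification: in the tutorial, cells are selected on the plot rather than by a rule, and the function names, the threshold, and the example values here are hypothetical.

```python
from collections import Counter

# Marker gene -> cell type, as used in this section of the tutorial.
MARKER_TO_TYPE = {
    "BCAN": "Glioma",
    "CD14": "Microglia",
    "MAG": "Oligodendrocytes",
}


def classify_cell(expr, threshold=0.0):
    """Return the cell type of the highest-expressed marker, or None.

    expr: dict mapping gene name -> expression value for one cell.
    A cell is left unclassified when no marker exceeds `threshold`
    (an assumed cutoff, not a Partek Flow setting).
    """
    best_gene = max(MARKER_TO_TYPE, key=lambda gene: expr.get(gene, 0.0))
    if expr.get(best_gene, 0.0) <= threshold:
        return None
    return MARKER_TO_TYPE[best_gene]


def count_types(cells):
    """Count classified cells per cell type, skipping unclassified cells."""
    labels = (classify_cell(expr) for expr in cells)
    return Counter(label for label in labels if label is not None)


# Hypothetical expression values for four cells.
example = [
    {"BCAN": 5.0, "CD14": 0.0, "MAG": 0.0},
    {"BCAN": 0.0, "CD14": 3.0},
    {"MAG": 2.5},
    {},  # expresses no marker, so it stays unclassified
]
type_counts = count_types(example)
```

The resulting tally plays the same role as the per-classification cell counts listed in Classify.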




The new attribute is now available for downstream analysis.