The Single-cell QA/QC task in Partek Flow enables you to visualize several measures of cell quality and filter to include only high-quality cells. To invoke Single-cell QA/QC:
If your Single cell counts data node has been annotated with a gene/transcript annotation, the task will run without a task configuration dialog. However, if you imported a single cell counts matrix without specifying a gene/transcript annotation file, you will be prompted to choose the genome assembly and annotation file by the Single cell QA/QC configuration dialog (Figure 1).
The Single cell QA/QC task report includes interactive violin plots showing the value of every cell in the project on two or three quality measures (Figure 2).
There are typically three plots: counts per cell, detected genes per cell, and the percentage of mitochondrial counts per cell. If your cells do not express any mitochondrial genes, the plot for the percentage of mitochondrial counts per cell will be absent.
Mitochondrial genes are defined as genes located on a mitochondrial chromosome in the gene annotation file. The mitochondrial chromosome is identified in the gene annotation file by having "M" or "MT" in its chromosome name. If the gene annotation file does not follow this naming convention for the mitochondrial chromosome, Partek Flow will not be able to identify any mitochondrial genes. If your single cell RNA-Seq data was processed in another program and the count matrix was imported into Partek Flow, be sure that the annotation field that matches your feature IDs was chosen during import; Partek Flow will be unable to identify any mitochondrial genes if the gene symbols in the imported single cell data and the chosen gene/feature annotation do not match.
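The naming convention described above can be illustrated with a short Python sketch. This is not Partek Flow's implementation. The exact matching rule (strip an optional "chr" prefix, then compare case-insensitively to "M" or "MT"), the function names, and the `(gene_id, chromosome)` row layout are all assumptions made for illustration.

```python
def is_mito_chromosome(chrom_name):
    """Return True if a chromosome name looks mitochondrial.

    Assumed rule (illustrative only): drop an optional 'chr' prefix,
    then accept 'M' or 'MT' regardless of case.
    """
    name = chrom_name.strip()
    if name.lower().startswith("chr"):
        name = name[3:]
    return name.upper() in {"M", "MT"}


def mito_gene_ids(annotation_rows):
    """Collect gene IDs located on a mitochondrial chromosome.

    annotation_rows: iterable of (gene_id, chromosome) pairs, a
    hypothetical simplification of a gene annotation file.
    """
    return {gene_id for gene_id, chrom in annotation_rows
            if is_mito_chromosome(chrom)}


# Hypothetical annotation rows for illustration
rows = [("GeneA", "chr1"), ("MT-CO1", "chrM"), ("MT-ND1", "MT"), ("GeneB", "X")]
mito_ids = mito_gene_ids(rows)
```

If the feature IDs in an imported count matrix do not appear in the set returned here (for example, Ensembl IDs in the matrix but gene symbols in the annotation), no feature would be flagged as mitochondrial, which mirrors the mismatch problem described above.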
Counts is calculated as the sum of the counts for all features in each cell from the input data node. Detected genes is calculated as the number of features in each cell with greater than zero counts. Percentage of mitochondrial counts is calculated as the sum of counts for known mitochondrial genes divided by the sum of counts for all features and multiplied by 100.
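As a worked example of the three definitions above, the following Python sketch computes the metrics for a single cell. It is a minimal illustration, not Partek Flow's code. The function name, the dictionary layout, and the choice to report 0% for a cell with no counts are assumptions.

```python
def qc_metrics(cell_counts, mito_features):
    """Compute the three QA/QC measures for one cell.

    cell_counts: dict mapping feature ID -> count for this cell
    mito_features: set of feature IDs known to be mitochondrial
    """
    # Counts: sum of the counts for all features in the cell
    total = sum(cell_counts.values())
    # Detected genes: number of features with greater than zero counts
    detected = sum(1 for c in cell_counts.values() if c > 0)
    # Mitochondrial counts: sum over known mitochondrial features only
    mito = sum(c for f, c in cell_counts.items() if f in mito_features)
    # Percentage of mitochondrial counts; 0.0 for an empty cell is an assumption
    pct_mito = 100.0 * mito / total if total > 0 else 0.0
    return total, detected, pct_mito


# Hypothetical cell with four features, one of them mitochondrial
cell = {"GeneA": 5, "GeneB": 0, "GeneC": 3, "MT-CO1": 2}
total, detected, pct_mito = qc_metrics(cell, {"MT-CO1"})
```

For this cell the total is 10 counts, 3 features are detected (GeneB has zero counts), and 2 of the 10 counts are mitochondrial, giving 20%.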
Each point on the plots is a cell. All cells from all samples are shown on the plots. The pink violins illustrate the distribution of cell values for the y-axis metric.
There are two methods for filtering cells. First, cells can be filtered by clicking and dragging to select a range on one of the plots (Figure 3).
Alternatively, the filters can be set using the text boxes below each plot (Figure 5). For the Counts and Detected genes filters, the minimum and maximum can be set using either Counts or Percentiles. For the Mitochondrial counts filter, you can set the minimum and maximum mitochondrial reads percentage. The number and percentage of cells included in the filter are listed at the bottom of the page and update as filters are added.
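The effect of minimum and maximum filters can be sketched in Python as follows. This only illustrates the idea and is not how Partek Flow applies filters internally. The inclusive bounds, the nearest-rank percentile method, and all names (`percentile_value`, `apply_filters`, the metric keys) are assumptions.

```python
import math


def percentile_value(values, pct):
    """Convert a percentile (0-100) to a threshold value.

    Uses the nearest-rank method, which is an assumption; other
    percentile definitions would give slightly different thresholds.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values to compute a percentile from")
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def apply_filters(cells, limits):
    """Keep cells whose metrics fall within every (min, max) range.

    cells: list of dicts holding metric values per cell
    limits: dict metric name -> (min, max); None means unbounded.
    Bounds are treated as inclusive (assumption).
    """
    kept = []
    for cell in cells:
        in_range = all(
            (lo is None or cell[metric] >= lo) and (hi is None or cell[metric] <= hi)
            for metric, (lo, hi) in limits.items()
        )
        if in_range:
            kept.append(cell)
    pct_kept = 100.0 * len(kept) / len(cells) if cells else 0.0
    return kept, pct_kept


# Hypothetical cells and thresholds for illustration
cells = [
    {"id": "c1", "counts": 500, "detected_genes": 200, "pct_mito": 3.0},
    {"id": "c2", "counts": 12000, "detected_genes": 3500, "pct_mito": 4.5},
    {"id": "c3", "counts": 8000, "detected_genes": 2500, "pct_mito": 35.0},
    {"id": "c4", "counts": 6000, "detected_genes": 2000, "pct_mito": 5.0},
]
limits = {"counts": (1000, 20000), "pct_mito": (None, 20.0)}
kept, pct_kept = apply_filters(cells, limits)
```

Here c1 fails the minimum counts filter and c3 fails the maximum mitochondrial percentage, so two of four cells (50%) are retained. A percentile-based limit would first be converted to a value, for example `percentile_value([c["counts"] for c in cells], 50)`, and then used in `limits` in the same way.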
It can be helpful to view the range of values for Counts and Detected genes on a log scale. To switch the y-axis of these plots to a log scale, click the checkbox at the top of the page (Figure 6).
For data sets with a very large number of cells, it may be helpful to decrease the dot opacity to better visualize the plot density. Dot opacity can be adjusted using the slider at the top of the page (Figure 7).
A new data node, Filtered single cell counts, will be generated (Figure 8).
The Filter cells task report includes the filter criteria, lists the feature distribution statistics for each sample, and gives a breakdown of how many and what percentage of cells were excluded from each sample by each filter (Figure 9).