What is Cell Ranger?

Cell Ranger is a set of analysis pipelines that process Chromium single cell data: they align reads, generate feature-barcode matrices, and perform clustering and gene expression analysis for 10x Genomics Chromium technology[1].

Cell Ranger - ATAC in Partek Flow

The 'cellranger-atac count' pipeline from Cell Ranger ATAC v2.0[2] has been wrapped in Partek® Flow® as the Cell Ranger - ATAC task. It takes FASTQ files from 'cellranger-atac mkfastq' and performs ATAC analysis, including read filtering and alignment, barcode counting, identification of transposase cut sites, peak and cell calling, and count matrix generation. Its outputs then become the starting point for downstream analysis of scATAC-seq data in Flow.
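For readers who want to relate the Flow task to the underlying command line, the following minimal sketch assembles a 'cellranger-atac count' invocation as a Python argument list. The run ID, reference directory, FASTQ directory, and sample name are hypothetical placeholders, and the helper function is illustrative only; it is not how Flow itself launches the pipeline.

```python
import shlex


def build_atac_count_command(run_id, reference_dir, fastq_dir, sample=None):
    """Assemble a cellranger-atac count command as an argument list.

    All argument values are supplied by the caller; nothing is executed here.
    """
    cmd = [
        "cellranger-atac", "count",
        f"--id={run_id}",                # name of the output folder
        f"--reference={reference_dir}",  # path to the reference package
        f"--fastqs={fastq_dir}",         # folder containing the FASTQ files
    ]
    if sample is not None:
        # Restrict the run to FASTQ files with this sample name prefix.
        cmd.append(f"--sample={sample}")
    return cmd


# Hypothetical paths and names, for illustration only.
command = build_atac_count_command(
    run_id="atac_run1",
    reference_dir="/refs/GRCh38-arc",
    fastq_dir="/data/fastqs",
    sample="pbmc_atac",
)
print(shlex.join(command))
```

Building the command as a list rather than a single string avoids quoting problems if a path contains spaces. The list could be passed to `subprocess.run` on a machine where Cell Ranger ATAC is installed.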

Running Cell Ranger - ATAC in Flow

To run the Cell Ranger - ATAC task for scATAC-seq data in Flow, select the Unaligned reads data node, then select Cell Ranger - ATAC in the 10x Genomics section (Figure 1).

Figure 1

As with the Cell Ranger - Gene Expression task, first-time users will be asked to create a Reference assembly. Partek® Flow® uses Cell Ranger ARC 2.0.0 to create the reference assembly for all 10x Genomics analysis pipelines. Please refer to our Cell Ranger - Gene Expression task manual for how to build or use a Reference assembly.

Figure 2. First-time user interface

Clicking the large grey Create Cell Ranger ARC 2.0.0 reference button opens a new window that lists the information users need to provide (Figure 3). To create the same reference genomes (2020-A) that Cell Ranger provides by default, use the GENCODE v32 transcriptome annotation for human and vM23 for mouse, which are equivalent to Ensembl 98[3]. If the dropdown lists contain no suitable option, users can click Add annotation model (GTF file) for Index, or New assembly... (FASTA file) for Assembly, and upload the files.

Figure 3

Once the right options have been chosen or provided, press the Create button to finish. The reference assembly ‘Homo sapiens (human) - hg38’ has been created as an example here (Figure 4).

Figure 4

Once references have been added, the main task menu is refreshed as shown above (Figure 4) for gene expression data. Users can then click the Finish button to run the task with the default settings.

For Feature Barcode data, more information is needed besides the reference assembly. An additional Protein section is added to the interface when Single cell gene expression + Cell surface protein is selected for Feature Barcode data (Figure 5). Users first click the Select data node button and, in the pop-up window, select the data node that holds the antibody capture (protein) feature data (top right, Figure 5). Users then upload the feature reference file (.csv) prepared for their dataset. A Feature Reference CSV file declares the molecule structure and unique Feature Barcode sequence of each feature present in the experiment. It should include at least six columns: id, name, read, pattern, sequence and feature_type. An example TotalSeq™-B Feature Reference CSV has been linked here; users can download it and use it as a template for their own data. For more details, please refer to the 10x Genomics webpage[4].
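Before uploading a feature reference file, it can help to check that the six required columns are present. The sketch below is one way to do that with the Python standard library. The function name, the example rows, and the placeholder barcode sequences are all made up for illustration, and the duplicate-id check assumes that each feature id is meant to be unique.

```python
import csv
import io

# The six columns named in the text above.
REQUIRED_COLUMNS = ("id", "name", "read", "pattern", "sequence", "feature_type")


def check_feature_reference(csv_text):
    """Return (missing_columns, duplicate_ids) for Feature Reference CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]

    seen, duplicates = set(), []
    if "id" in header:
        for row in reader:
            if row["id"] in seen:
                duplicates.append(row["id"])
            seen.add(row["id"])
    return missing, duplicates


# Illustrative content only: the sequences are placeholders, not real barcodes.
example_csv = """id,name,read,pattern,sequence,feature_type
CD3,CD3_example,R2,5PNNNNNNNNNN(BC)NNNNNNNNN,ACGTACGTACGTACG,Antibody Capture
CD19,CD19_example,R2,5PNNNNNNNNNN(BC)NNNNNNNNN,TGCATGCATGCATGC,Antibody Capture
"""

missing_columns, duplicate_ids = check_feature_reference(example_csv)
print("Missing columns:", missing_columns)
print("Duplicate ids:", duplicate_ids)
```

To check a real file, read its contents with `open(path, newline="").read()` and pass the text to the same function.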

Figure 5

If the task finishes successfully, a new data node named Single cell counts is displayed in Flow (Figure 6). For gene expression data, this data node contains a filtered feature-barcode count matrix; for Feature Barcode data, it contains a unified feature-barcode matrix that holds gene expression counts alongside Feature Barcode counts for each cell barcode. To open the task report when the task is finished, double-click the output data node, or single-click the data node and select Task report in the Task results section. The task report (Figure 7) is the same as the ‘Summary HTML’ from the Cell Ranger output.

Figure 6

Cell Ranger - Gene Expression task report in Flow

The task report is sample-based. Users can use the dropdown list at the top left to switch samples. Under the sample name, each report has two tabs: Summary report and Analysis report (Figure 7). Key metrics such as Estimated Number of Cells, Mean Reads per Cell, and Median Genes per Cell, as well as information on Sequencing, Mapping, and Sample, are summarized in separate panels. The Barcode Rank Plot is also included in the Cells panel of the Summary report (Figure 7).
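To make the headline metrics concrete, the sketch below computes them from a toy count table. It assumes the usual reading of these metrics: Mean Reads per Cell is the total number of sequenced reads divided by the number of cell-associated barcodes, and Median Genes per Cell is the median number of genes with at least one count per cell barcode. The barcodes, gene names, and read total are invented, and the function is not part of Cell Ranger or Flow.

```python
from statistics import median


def summarize_cells(counts_per_cell, total_reads):
    """Compute summary metrics from {barcode: {gene: count}} and a read total."""
    n_cells = len(counts_per_cell)
    if n_cells == 0:
        raise ValueError("at least one cell barcode is required")

    # A gene counts as detected in a cell if its count is greater than zero.
    genes_per_cell = [
        sum(1 for count in gene_counts.values() if count > 0)
        for gene_counts in counts_per_cell.values()
    ]
    return {
        "estimated_number_of_cells": n_cells,
        "mean_reads_per_cell": total_reads / n_cells,
        "median_genes_per_cell": median(genes_per_cell),
    }


# Toy data: three cell barcodes and three genes (all values invented).
toy_counts = {
    "AAAC": {"GENE1": 3, "GENE2": 0, "GENE3": 1},
    "AAAG": {"GENE1": 1, "GENE2": 1, "GENE3": 1},
    "AAAT": {"GENE1": 0, "GENE2": 5, "GENE3": 0},
}
summary = summarize_cells(toy_counts, total_reads=3000)
print(summary)
```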

Figure 7

Two more plots, Sequencing Saturation versus Mean Reads per Cell and Median Genes per Cell versus Mean Reads per Cell, are included in the Analysis report because they are important indicators of library complexity and sequencing depth (Figure 8).
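As a rough guide to reading the saturation plot, the sketch below computes sequencing saturation under the assumption that it is defined as one minus the ratio of deduplicated reads (unique barcode, UMI, and gene combinations) to all confidently mapped reads with valid barcodes and UMIs. The function and parameter names are illustrative, and the exact definition should be checked against the 10x Genomics documentation for the Cell Ranger version in use.

```python
def sequencing_saturation(n_reads, n_deduped_reads):
    """Return 1 - n_deduped_reads / n_reads (assumed definition).

    n_reads: reads with a valid barcode and UMI that map confidently.
    n_deduped_reads: unique (barcode, UMI, gene) combinations among them.
    """
    if n_reads <= 0:
        raise ValueError("n_reads must be positive")
    if not 0 <= n_deduped_reads <= n_reads:
        raise ValueError("n_deduped_reads must be between 0 and n_reads")
    return 1.0 - n_deduped_reads / n_reads


# Invented numbers: 1,000 reads collapsing to 250 unique molecules.
saturation = sequencing_saturation(n_reads=1000, n_deduped_reads=250)
print(f"Sequencing saturation: {saturation:.2f}")
```

Under this definition, a value close to 1 means that most additional reads would be duplicates of molecules already observed, so deeper sequencing would add few new molecules.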

Figure 8

Clicking the icon expands the corresponding panel and shows more details. In the example below, the Median Genes per Cell plot has been expanded while the Sequencing Saturation plot has not (Figure 9).

Figure 9

Apart from two additional panels that summarize information for Antibody Sequencing and Antibody Application, the task report for Feature Barcode data is the same as the report for scRNA-seq data.

Users can click Configure to change the default settings in Advanced options (Figure 4).

Include introns: Count reads mapping to intronic regions. This may improve sensitivity for samples with a significant amount of pre-mRNA molecules, such as nuclei.

Expected cells: Expected number of recovered cells. Default: 3,000 cells.

Force cells: Force pipeline to use this number of cells, bypassing the cell detection algorithm. Use this if the number of cells estimated by Cell Ranger is not consistent with the barcode rank plot.

Memory limit (GB): Restricts Cell Ranger - Gene Expression to the specified amount of memory (in GB) when executing pipeline stages.
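The advanced options above appear to correspond to command-line options of 'cellranger count'. The sketch below shows one plausible mapping; the function and its parameter names are hypothetical, and the correspondence between Flow's settings and these flags is an assumption rather than something the Flow documentation states. Flag syntax also differs between Cell Ranger versions (for example, newer versions expect an explicit true/false value for the intron option), so check `cellranger count --help` for the installed version.

```python
def advanced_options_to_args(include_introns=False, expected_cells=None,
                             force_cells=None, memory_limit_gb=None):
    """Translate the advanced options into cellranger count style arguments.

    The mapping is assumed, not taken from Flow itself.
    """
    args = []
    if include_introns:
        # Bare flag form; newer Cell Ranger versions take an explicit value.
        args.append("--include-introns")
    if expected_cells is not None:
        args.append(f"--expect-cells={expected_cells}")
    if force_cells is not None:
        args.append(f"--force-cells={force_cells}")
    if memory_limit_gb is not None:
        args.append(f"--localmem={memory_limit_gb}")
    return args


# Example: the documented default of 3,000 expected cells and a 64 GB limit.
extra_args = advanced_options_to_args(expected_cells=3000, memory_limit_gb=64)
print(" ".join(extra_args))
```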

References