Partek Flow Documentation


The Deduplicate UMIs task identifies and removes reads with duplicate unique molecular identifiers (UMIs). Details of the available UMI deduplication methods are outlined in our UMI Deduplication in Partek Flow white paper.
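
As a rough illustration of the general idea only, the sketch below keeps one alignment per combination of cell barcode, UMI, and mapped feature. It is not the method used by Partek Flow (see the white paper for that); the function name, the dictionary keys, and the example reads are all hypothetical.

```python
def deduplicate_by_umi(alignments):
    """Keep the first alignment seen for each (barcode, umi, feature) key.

    Assumption: two alignments are duplicates when they share the same
    cell barcode, UMI, and feature. Real tools may use other criteria.
    """
    kept = {}
    for aln in alignments:
        key = (aln["barcode"], aln["umi"], aln["feature"])
        # setdefault stores the alignment only if the key is new
        kept.setdefault(key, aln)
    return list(kept.values())


# Made-up example alignments
alignments = [
    {"read": "r1", "barcode": "AAAC", "umi": "TTGCA", "feature": "GeneA"},
    {"read": "r2", "barcode": "AAAC", "umi": "TTGCA", "feature": "GeneA"},  # duplicate of r1
    {"read": "r3", "barcode": "AAAC", "umi": "GGCAT", "feature": "GeneA"},  # different UMI
    {"read": "r4", "barcode": "CCGT", "umi": "TTGCA", "feature": "GeneA"},  # different barcode
]

deduplicated = deduplicate_by_umi(alignments)
print([aln["read"] for aln in deduplicated])
```

In this toy input, r2 is dropped because it shares its barcode, UMI, and feature with r1, while r4 is kept because the same UMI in a different barcode counts as a different molecule.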

...

The task configuration dialog content depends on whether you imported FASTQ files or BAM files into Partek Flow.

Configuring Deduplicate UMIs

Imported FASTQ

UMIs and barcodes are detected and recorded by the Trim tags task in Partek Flow. You can choose whether to retain only one alignment per UMI (Figure 1). The default depends on which prep kit was selected in the Trim tags task.

 

Figure 1. UMI deduplication dialog if UMIs and barcodes were processed by Trim tags

If you select Retain only one alignment per UMI, you will be asked to choose an assembly and gene/feature annotation file. The annotation file is used to check whether a read overlaps an exonic region. Only reads that have 50% overlap with an exon will be retained. 

If you do not select Retain only one alignment per UMI, UMI deduplication will proceed without filtering to exonic reads. Other differences between the two options are outlined in the UMI Deduplication in Partek Flow white paper. 
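
The exon-overlap check applied with Retain only one alignment per UMI can be pictured with a small sketch. It assumes that "50% overlap" means at least half of the read's aligned bases fall inside annotated exons, and that exons are given as non-overlapping half-open intervals on the same chromosome and strand. These are assumptions made for illustration, and the function names are hypothetical; the text above does not specify how Partek Flow measures overlap.

```python
def exon_overlap_fraction(read_start, read_end, exons):
    """Fraction of the read's bases covered by exons.

    Intervals are half-open [start, end). Exons are assumed not to
    overlap each other, so covered bases are never counted twice.
    """
    read_length = read_end - read_start
    if read_length <= 0:
        return 0.0
    covered = 0
    for exon_start, exon_end in exons:
        # Length of the intersection, or 0 if the intervals are disjoint
        covered += max(0, min(read_end, exon_end) - max(read_start, exon_start))
    return covered / read_length


def passes_exon_filter(read_start, read_end, exons, min_fraction=0.5):
    """True if the read meets the assumed overlap threshold (inclusive)."""
    return exon_overlap_fraction(read_start, read_end, exons) >= min_fraction


# Made-up exon coordinates
exons = [(100, 200), (300, 400)]

print(passes_exon_filter(150, 250, exons))  # 50 of 100 bases are exonic
print(passes_exon_filter(180, 280, exons))  # 20 of 100 bases are exonic
```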

Imported BAM

UMIs and barcodes are stored in the BAM header. The task configuration dialog includes additional text fields where you specify the BAM header tags that hold the UMI and barcode information. For example, when processing a BAM file produced by CellRanger 3.0.1, the BAM identifier tag for the UMI sequence is UR and the BAM identifier tag for the barcode sequence is CR (Figure 2).

 

Figure 2. Specify the location of the BAM UMI and barcode tags

The option to Retain only one alignment per UMI is also available. 
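
To check which tag names to enter, it can help to look at how the UR and CR tags appear in your data. The sketch below assumes the tags are present as optional TAG:TYPE:VALUE fields on an alignment record in SAM text format (for example, a BAM record converted to text). The record is made up and the function names are hypothetical.

```python
def parse_sam_tags(sam_line):
    """Return the optional fields of one SAM text record as {tag: value}."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    # The first 11 columns of a SAM record are mandatory; the rest are tags
    for field in fields[11:]:
        tag, _type, value = field.split(":", 2)
        tags[tag] = value
    return tags


def umi_and_barcode(sam_line, umi_tag="UR", barcode_tag="CR"):
    """Look up the UMI and barcode using the tag names you would enter."""
    tags = parse_sam_tags(sam_line)
    return tags.get(umi_tag), tags.get(barcode_tag)


# Made-up alignment record with CR (barcode) and UR (UMI) tags
record = "\t".join([
    "read1", "0", "chr1", "1000", "255", "10M", "*", "0", "0",
    "ACGTACGTAC", "FFFFFFFFFF",
    "CR:Z:AAACCTGAGAAACCAT", "UR:Z:TTGCATGCAA",
])

umi, barcode = umi_and_barcode(record)
print(umi, barcode)
```

If a tag name is not present on the record, the lookup returns None, which is a quick way to spot a wrong tag name before configuring the task.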

Deduplicate UMIs task report

The Deduplicate UMIs task report includes a knee plot showing the number of deduplicated reads per barcode. This plot is used to filter the barcodes to include only barcodes corresponding to cells. For more information about using the knee plot to filter barcodes, please see the Cell Barcode QA/QC page. One difference between the Deduplication report and the Cell Barcode QA/QC report is that the Deduplication report gives the number of initial alignments and the number of deduplicated alignments for each sample (Figure 3). This indicates how many of your aligned reads were PCR duplicates and how many were unique molecules. 

The initial number of cells is set by our automatic filter. You can set the filter manually by clicking on the plot or by typing a cutoff number in the Cells or Reads in cells text boxes. If there are multiple samples, each sample receives a plot and filters are set per sample. 

 

Figure 3. The deduplication report shows the number of UMIs per barcode

The number of cells, reads in cells, median reads per cell, number of initial alignments, and number of deduplicated alignments are listed for each sample in the summary table (Figure 4).

 

Figure 4. Deduplication report summary table
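
To make the summary values concrete, the sketch below derives them from per-barcode deduplicated read counts and a manual Cells cutoff. It assumes that the cutoff keeps the top-ranked barcodes by read count, that Reads in cells is the sum of reads in the retained barcodes, and that the number of deduplicated alignments is the sum over all barcodes. These definitions, the function name, and the numbers are assumptions for illustration, not a description of how Partek Flow computes the report.

```python
from statistics import median


def summarize_sample(reads_per_barcode, n_cells, initial_alignments):
    """Compute summary-table style values for one sample.

    reads_per_barcode: {barcode: deduplicated read count}
    n_cells: manual cutoff, the number of top-ranked barcodes kept as cells
    initial_alignments: alignment count before deduplication
    """
    ranked = sorted(reads_per_barcode.values(), reverse=True)
    cell_counts = ranked[:n_cells]
    deduplicated = sum(ranked)
    duplicate_fraction = (
        1 - deduplicated / initial_alignments if initial_alignments else 0.0
    )
    return {
        "cells": len(cell_counts),
        "reads_in_cells": sum(cell_counts),
        "median_reads_per_cell": median(cell_counts) if cell_counts else 0,
        "initial_alignments": initial_alignments,
        "deduplicated_alignments": deduplicated,
        # Share of initial alignments removed as duplicates
        "duplicate_fraction": duplicate_fraction,
    }


# Made-up counts: three barcodes look like cells, two look like background
counts = {"bc1": 900, "bc2": 700, "bc3": 500, "bc4": 12, "bc5": 3}
summary = summarize_sample(counts, n_cells=3, initial_alignments=4230)
print(summary)
```

Here half of the initial alignments are removed as duplicates (2115 of 4230 remain), and the three retained barcodes account for 2100 reads with a median of 700 reads per cell.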

Clicking Apply filter at either the knee plot or the summary table will run the Filter barcodes task and generate a Filtered reads data node.

To return to the knee plot, click Back to filter. 

To reset the filters for all samples to the automatic cutoff, click Reset all filters.

 
