
This task does not need an annotation model file because the annotation is retrieved from the BAM files themselves. The sequence names in the BAM files serve as the features against which the reads are quantified.

This task is generally performed on reads aligned to a transcriptome, e.g., when a species does not have a genome reference and the BAM files contain transcriptome information. In this case, the features for the quantification task are the reference sequence names in the input BAM files.

There are two parameters in Quantify to reference (Figure 1):

[Figure 1: quantify_to_ref_dialog.png]

Min coverage: filters out any feature (sequence name) that has fewer reads across all samples than the specified value.

Strict paired-end compatibility: affects paired-end data only. When it is checked, a read is counted only if both of its ends are aligned to the same feature. When it is unchecked, a read is still counted as an exonic compatible read even if its mate is not compatible with the feature.
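
The effect of the Strict paired-end compatibility option can be illustrated with a minimal Python sketch. This is not Partek Flow's implementation: the function name count_fragments, the (end 1, end 2) tuple layout, and the choice to count a fragment once per distinct feature in the unchecked case are all assumptions made for illustration.

```python
from collections import Counter


def count_fragments(pairs, strict=True):
    """Count paired-end fragments per feature (hypothetical helper).

    pairs: iterable of (ref_end1, ref_end2) sequence names;
           None marks an end that did not align.
    """
    counts = Counter()
    for ref1, ref2 in pairs:
        if strict:
            # Strict: both ends must align to the same feature.
            if ref1 is not None and ref1 == ref2:
                counts[ref1] += 1
        else:
            # Relaxed (assumption): count the fragment once for each
            # distinct feature that at least one end aligns to.
            for ref in {ref1, ref2} - {None}:
                counts[ref] += 1
    return counts


# Made-up example data: three fragments.
pairs = [
    ("tx1", "tx1"),  # both ends on tx1
    ("tx1", "tx2"),  # ends on different transcripts
    ("tx2", None),   # mate unaligned
]
strict_counts = count_fragments(pairs, strict=True)
relaxed_counts = count_fragments(pairs, strict=False)
```

With the option checked, only the first fragment is counted; with it unchecked, the second and third fragments also contribute to the features their ends align to.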

During quantification:

We scan through each of the BAM files and find all the transcripts that meet the minimum coverage threshold.
With those transcripts, we "create" an annotation file in which the sequence name, the Gene ID, and the Transcript ID are all set to the transcript name. The start position is 1 and the end position is the length of the transcript.
In effect, the annotation file filters out the low-coverage transcripts.
Since we don't know where the transcripts are located in the genome, chromosome view displays only one transcript at a time (i.e., the transcript names are treated like "chromosomes").
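
The steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the actual Partek Flow code: the function build_annotation, the record field names, and the input dictionaries are hypothetical, and the coverage threshold is assumed to apply to the read count summed over all samples.

```python
def build_annotation(ref_lengths, sample_counts, min_coverage):
    """Build annotation records from BAM reference sequences (sketch).

    ref_lengths:   {transcript name: length}, as found in a BAM header
    sample_counts: list of {transcript name: read count}, one per sample
    min_coverage:  minimum total read count needed to keep a transcript
    """
    # Sum read counts per transcript across all samples (assumption).
    totals = {}
    for counts in sample_counts:
        for name, n in counts.items():
            totals[name] = totals.get(name, 0) + n

    annotation = []
    for name, length in ref_lengths.items():
        # Transcripts with fewer reads than the threshold are dropped.
        if totals.get(name, 0) >= min_coverage:
            annotation.append({
                "seq_name": name,       # transcript acts as the "chromosome"
                "gene_id": name,        # Gene ID equals the transcript name
                "transcript_id": name,  # Transcript ID equals the transcript name
                "start": 1,
                "end": length,          # end position is the transcript length
            })
    return annotation


# Made-up example data: three transcripts, two samples.
ref_lengths = {"tx1": 1500, "tx2": 800, "tx3": 2300}
sample_counts = [{"tx1": 12, "tx2": 1}, {"tx1": 7, "tx3": 4}]
annotation = build_annotation(ref_lengths, sample_counts, min_coverage=5)
```

In this example only tx1 reaches the threshold of 5 reads, so the resulting annotation holds a single record spanning positions 1 to 1500.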

The output data node displays a Task report similar to that of the Quantify to annotation model task.