Partek Flow Documentation

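In RNA-seq data analysis, the most common step after alignment is to estimate gene and/or transcript expression abundance, where the expression level is represented by read counts. There are three options for this step: Quantify to annotation model (Partek E/M), Quantify to transcriptome (Cufflinks), and Quantify to reference (Partek E/M).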

The three options will be discussed below.

Quantify to annotation model (Partek E/M)

When the reads are aligned to a genome reference, e.g. hg38, quantification is performed on the transcriptome, so you need to provide the annotation model file of the transcriptome.

Quantification dialog

If the alignment was generated in Partek® Flow®, the genome assembly is displayed as text at the top of the page (Figure 1), and you do not have the option to change the reference.

...

The annotation file might contain multiple features at the same location, and one read might have multiple alignments, so the read count of a feature might not be an integer. Our white paper on the Partek E/M algorithm has more details on Partek’s implementation of the E/M algorithm initially described by Xing et al. [2].
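To illustrate why E/M quantification can yield non-integer read counts, here is a minimal sketch of a generic expectation-maximization loop. It is not Partek's implementation: the function name `em_quantify` and its input format are hypothetical, and the sketch ignores feature length, strand, and alignment quality. Each read is given as the list of features it is compatible with, and ambiguous reads are split among those features in proportion to the current abundance estimates.

```python
def em_quantify(read_compat, n_iter=200, tol=1e-9):
    """Distribute reads among compatible features with a simple E/M loop.

    read_compat: list of lists; each inner list names the features
                 (hypothetical IDs) that one read is compatible with.
    Returns a dict mapping feature -> expected (possibly fractional) count.
    """
    # Reads with no compatible feature cannot be assigned and are ignored.
    reads = [feats for feats in read_compat if feats]
    features = sorted({f for feats in reads for f in feats})
    if not features:
        return {}

    # Start from a uniform relative abundance for every feature.
    theta = {f: 1.0 / len(features) for f in features}
    counts = dict.fromkeys(features, 0.0)

    for _ in range(n_iter):
        # E-step: split each read among its features, weighted by theta.
        counts = dict.fromkeys(features, 0.0)
        for feats in reads:
            total = sum(theta[f] for f in feats)
            if total == 0.0:
                continue
            for f in feats:
                counts[f] += theta[f] / total

        # M-step: re-estimate relative abundance from the expected counts.
        assigned = sum(counts.values())
        new_theta = {f: counts[f] / assigned for f in features}

        # Stop once the abundance estimates no longer change.
        delta = max(abs(new_theta[f] - theta[f]) for f in features)
        theta = new_theta
        if delta < tol:
            break

    return counts


# Two reads map only to A, one only to B, and one is ambiguous between A and B.
example = em_quantify([["A"], ["A"], ["A", "B"], ["B"]])
print({f: round(c, 2) for f, c in example.items()})  # {'A': 2.67, 'B': 1.33}
```

In this toy input the ambiguous read is divided roughly two thirds to A and one third to B, so neither feature ends up with an integer count, while the counts still sum to the number of reads.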

Quantify to annotation model (Partek E/M) output

Depending on the annotation file, the output is one or two data nodes. If the annotation file contains only one level of information, e.g. a miRNA annotation file, you will get only one output data node. If the annotation file contains both gene-level and transcript-level information, such as annotation files from the Ensembl database, both gene-level and transcript-level data nodes are generated. If two nodes are generated, the Task report also contains two tabs, each reporting the quantification results for one node. Each report has two tables. The first is a summary table displaying the coverage information for each sample quantified against the specified transcriptome annotation (Figure 4).
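The difference between one and two levels of annotation can be sketched with a small parser. This example assumes a GTF-style annotation file, in which the ninth tab-separated column carries attributes such as `gene_id` and `transcript_id`; the function name `annotation_levels` is hypothetical, and the sketch does not describe how Partek Flow itself inspects the annotation file.

```python
import re

# Matches GTF-style attributes of the form: key "value";
ATTR_RE = re.compile(r'(\S+)\s+"([^"]*)"')


def annotation_levels(gtf_lines):
    """Collect the gene and transcript IDs found in GTF-style lines.

    Returns (gene_ids, transcript_ids) as two sets. An annotation with
    entries in both sets carries gene-level and transcript-level
    information; one with a single non-empty set carries one level.
    """
    gene_ids, transcript_ids = set(), set()
    for line in gtf_lines:
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attrs = dict(ATTR_RE.findall(fields[8]))
        if attrs.get("gene_id"):
            gene_ids.add(attrs["gene_id"])
        if attrs.get("transcript_id"):
            transcript_ids.add(attrs["transcript_id"])
    return gene_ids, transcript_ids


# Made-up records: one gene (G1) with two transcripts (T1, T2).
records = [
    "#!example annotation",
    'chr1\tdemo\texon\t100\t200\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\tdemo\texon\t150\t300\t.\t+\t.\tgene_id "G1"; transcript_id "T2";',
]
genes, transcripts = annotation_levels(records)
print(sorted(genes), sorted(transcripts))  # ['G1'] ['T1', 'T2']
```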

...

Figure: Download quantification output data dialog. Data can be downloaded in two formats: Partek Genomics Suite project format or text file format.

Quantify to transcriptome (Cufflinks)

Cufflinks assembles transcripts and estimates transcript abundances from aligned reads. Implementation details are explained in Trapnell et al. [1].

...

When the Use bias correction check box is selected, Cufflinks uses the genome sequence information to look for overrepresented sequences and improve the accuracy of the transcript abundance estimates.

Quantify to reference (Partek E/M)

This task does not need an annotation model file, since the annotation is retrieved from the BAM file itself. The sequence names in the BAM file constitute the features against which the reads are quantified.
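As a rough illustration of treating reference sequence names as features, the sketch below tallies alignment records per reference name. It works on SAM-format text lines (for example, the output of `samtools view -h` on a BAM file) rather than on the binary BAM itself, and the function name `count_reads_per_reference` is hypothetical. It is a naive tally, not the Partek E/M algorithm: a read with several alignment records is counted once per record instead of being divided among them.

```python
from collections import Counter

UNMAPPED_FLAG = 0x4  # SAM flag bit set when the read is unmapped


def count_reads_per_reference(sam_lines):
    """Count alignment records per reference sequence name (RNAME).

    sam_lines: iterable of SAM-format text lines, header lines allowed.
    Returns a Counter mapping reference name -> number of records.
    """
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # SAM records have 11 mandatory fields
        flag = int(fields[1])
        rname = fields[2]
        # Unmapped reads have no feature to be assigned to.
        if flag & UNMAPPED_FLAG or rname == "*":
            continue
        counts[rname] += 1
    return counts


# Made-up records: two reads on tx1, one on tx2, one unmapped.
sam = [
    "@SQ\tSN:tx1\tLN:1000",
    "\t".join(["r1", "0", "tx1", "10", "60", "4M", "*", "0", "0", "ACGT", "IIII"]),
    "\t".join(["r2", "16", "tx1", "50", "60", "4M", "*", "0", "0", "ACGT", "IIII"]),
    "\t".join(["r3", "0", "tx2", "5", "60", "4M", "*", "0", "0", "ACGT", "IIII"]),
    "\t".join(["r4", "4", "*", "0", "0", "*", "*", "0", "0", "ACGT", "IIII"]),
]
print(dict(count_reads_per_reference(sam)))  # {'tx1': 2, 'tx2': 1}
```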

...

The output data node displays a Task report similar to that of the Quantify to annotation model task.

References

  1. Trapnell C, Williams B, Pertea G, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010; 28(5):511-515.
  2. Xing Y, Yu T, Wu YN, Roy M, Kim J, Lee C. An expectation-maximization algorithm for probabilistic reconstructions of full-length isoforms from splice graphs. Nucleic Acids Res. 2006; 34(10):3150-3160.


...