Partek Flow Documentation

Page tree

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

The reads are first aligned to the genome.  The unaligned reads resulting from this initial alignment are split into multiple 25 bp sequences which are, in turn, aligned to the genome by Bowtie. The TopHat-Fusion algorithm identifies the cases where the first and the last 25 bp segments are aligned to either two different chromosomes or two locations on the same chromosome (spacing is defined by the user). The whole read is used to identify a fusion point. After the initial fusion candidates are defined, all the segments from the initially unaligned reads are realigned against the fusion points (as well as intron boundaries and indels).  The resulting alignments are combined with the full read alignments.

The most up-to-date TopHat-Fusion version implemented in Partek® Flow® when the manual was written (2.1.0) focuses on fusions due to chromosomal rearrangements, while fusions resulting from read-through transcription or trans-splicing were not supported. For details as well as discussion of TopHat-Fusion options, see TopHat-Fusion home page (4).

...

The resulting Fusion report task node (Figure 6) can be double-clicked to reveal the full table (Figure 7).

...

STAR Algorithm

General Overview

The STAR aligner also has the ability to detect fusion genes (referred to as “chimeric alignments”) (5,6). During the first phase of alignment, STAR searches for maximal mappable prefixes (seeds) of sequencing reads. In the second phase, all the seeds that align within user-defined genomic windows are stitched together. If an alignment within one genomic window does not cover the entire read sequence, STAR will try to find two or more windows that cover the entire read. This essentially results in the detection of fusion events, with different parts of reads aligning to distal genomic locations, or different chromosomes, or different strands.

...

STAR

...

STAR fusion detection algorithm is integrated with is performed in two steps: chimeric alignment of reads with the STAR aligner and fusion detection with STAR-Fusion. Performing fusion detection in two steps is equivalent to running the analysis in "Kickstart" mode, as described by the authors of STAR-Fusion. We recommend using STAR version 2.7.8a (see Task management to check which version you are running).

Running STAR Chimeric Alignment within Partek Flow

When performing an alignment with STAR, chimeric alignment can be activated by tick-marking the Chimeric alignment option in the Advanced options of the aligner (the Advanced options dialog is reached via the Configure link in the setup dialog). As soon as When the Chimeric alignment checkbox is selected, additional options , specific to the fusion search algorithm , are shown (Figure 11). For a discussion on the details of the options details, see STAR documentation.

...

Numbered figure captions
SubtitleTextControls of the STAR fusion gene detection algorithm (aligner defaults are shown)
AnchorNameSTAR controls

Image Modified

The output is associated with the Chimeric results data node (Figure 12), which is a part of the STAR results ( in addition to Aligned reads node and, optionally, Unaligned reads node).


Numbered figure captions
SubtitleTextChimeric results node as a result of STAR’s chimeric alignment algorithm
AnchorNameChimeric results node

Selecting the To obtain a .fusion file that summarizes the chimeric reads across samples, select the Chimeric results data node opens the task menu and click Download data in the toolbox (Figure 13) with twooptions: Data summary report or Download data.. The file is human-readable and can be opened in a text editor (example in Figure 14). For details refer to STAR's documentation.


Numbered figure captions
SubtitleTextChimeric results section of the toolbox, invokable on STAR’s chimeric alignment results (data size is an example)
AnchorNameVariant detection options

...


Numbered figure captions
SubtitleTextSTAR's .fusion file opened in a text editor (example)
AnchorNamefusion file

Image Added

Running STAR-Fusion on Chimeric results

STAR-Fusion v1.10 is wrapped into Partek Flow. STAR-Fusion will process the chimeric output generated by the STAR aligner to map junction reads and spanning reads to a reference annotation set. To run fusion detection, select the Chimeric results data node and choose STAR-Fusion from the Variant analysis menu in the toolbox (Figure

...

15).

...


Numbered figure captions
SubtitleTextChoose STAR-Fusion from the menu
AnchorNameSTAR-Fusion

Image Added

Choose the STAR-Fusion annotation from the drop-down list. We provide automatic downloads of the plug-n-play libraries distributed by Trinity Cancer Transcriptome Analysis Toolkit (CTAT)  for Human hg38 (Gencode v22 and v37) and hg19 (Gencode v19) assemblies (Figure 15). 


Image Removed
Numbered figure captions
SubtitleTextSTAR's .fusion file opened in a text editor (example)
AnchorNamefusion file
-Fusion task set up
AnchorNameSTAR-Fusion task set up

Image Added

To change any of the advanced options, click the Configure link (Figure 16). To run the task, click Finish.


Numbered figure captions
SubtitleTextSTAR-Fusion advanced options
AnchorNameSTAR-Fusion advanced options

Image Added

The resulting Fusion predictions task node (Figure 17) can be downloaded to your local machine by selecting the data node and clicking Download data from the toolbox. There will be one tab-separated (.tsv) file per sample. To view the full table, double-click the new data node to open the task report (Figure 18). Each row of the table is a fusion event and the columns contain information about each detected fusion.

  • FusionName: the name of the fusion event, given as LeftGene--RightGene. Multiple fusion events can be detected across the same pair of genes, so the FusionName of an event is not necessarily unique;
  • JunctionReadCountindicates the number of RNA-Seq fragments containing a read that aligns as a split read at the site of the putative fusion junction;
  • SpanningFragCountindicates the number of RNA-Seq fragments that encompass the fusion junction such that one read of the pair aligns to a different gene than the other paired-end read of that fragment;
  • est_J: estimated junction read counts corrected for multiple mappings;
  • est_Sestimated spanning fragment counts corrected for multiple mappings;
  • SpliceTypeindicates whether the proposed breakpoint occurs at reference exon junctions as provided by the reference transcript structure annotations (Gencode);
  • LeftGene: name of the first (left) gene;
  • LeftBreakpoint: genome coordinates for the breakpoint in left gene;
  • RightGene: name of the second (right) gene;
  • RightBreakpoint: genome coordinates for the breakpoint in right gene;
  • JunctionReads: sequence identifiers for all junction reads;
  • SpanningFrags: sequence identifiers for all spanning fragments;
  • LargeAnchorSupportindicates whether there are split reads that provide 'long' (set to 25bp) alignments on both sides of the putative breakpoint;
  • FFPM: fusion fragments per million reads
  • LeftBreakDinuc: dinucleotide base pairs at the left breakpoint
  • LeftBreakEntropy: the Shannon entropy of the 15 exonic bases flanking the left breakpoint
  • RightBreakDinuc: dinucleotide base pairs at the right breakpoint
  • RightBreakEntropythe Shannon entropy of the 15 exonic bases flanking the right breakpoint
  • annotsprovides a simplified annotation for fusion transcript


Numbered figure captions
SubtitleTextSTAR-Fusion fusion prediction table
AnchorNameSTAR-Fusion fusion prediction table

Image Added


References

  1. Annala MJ, Parker BC, Zhang W, Nykter M. Fusion genes and their discovery using high throughput sequencing. Cancer Lett. 2013;340:192-200.
  2. Costa V, Aprile M, Esposito R, Ciccodicola A. RNA-Seq and human complex diseases: recent accomplishments and future perspectives. Eur J Hum Genet. 2013;21:134-142.
  3. Kim D, Salzberg SL. TopHat-Fusion: an algorithm for discovery of novel fusion transcripts. Genome Biology. 2011;12:R72
  4. TopHat-Fusion. An algorithm for discovery of novel fusion transcripts. http:// http://tophat.cbcb.umd.edu/fusion_index.html Accessed on April 25, 2014
  5. Dobin A, Davies CA, Schlesinger F et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics. 2013;29:15-21.
  6. Haas B.J, Dobin A, Li B. et al. Accuracy assessment of fusion transcript detection via read-mapping and de novo fusion transcript assembly-based methods. Genome Biol. 2019;20:213 (2019) 


Additional assistance




Rate Macro
allowUsersfalse

...