This chapter illustrates how to detect fusion genes.
Partek Algorithm
General Overview
The Partek® Flow® fusion detection algorithm uses paired-end information to find pairs of genes that may be expressed as a hybrid. A paired-end read is considered for a fusion event if:
- an alignment from the first-in-pair maps to a different sequence (chromosome) than an alignment from the second-in-pair; or
- the distance between all alignments from the first-in-pair and the second-in-pair exceeds a user-defined threshold (default: 50 kb).
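The two criteria above can be sketched in Python as follows. This is only an illustration, not Partek's implementation: the `Alignment` class, the function name `is_fusion_candidate`, and the choice to measure distance between leftmost mapped positions are all assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Alignment:
    """Hypothetical minimal alignment record (not a Partek data structure)."""
    chrom: str  # reference sequence (chromosome) name
    pos: int    # leftmost mapped position, in bp


def is_fusion_candidate(first, second, min_distance=50_000):
    """Return True if a read pair meets either criterion described above.

    first, second: lists of Alignment objects for the first-in-pair and
    second-in-pair reads (more than one each if the read is multi-mapped).
    min_distance: distance threshold in bp (50 kb default, as in the text).
    """
    pairs = list(product(first, second))
    if not pairs:
        return False  # one of the mates has no alignment at all

    # Criterion 1: an alignment of one mate lies on a different sequence
    # than an alignment of the other mate.
    if any(a.chrom != b.chrom for a, b in pairs):
        return True

    # Criterion 2: every pairing of alignments is farther apart than the
    # threshold. Measuring from leftmost positions is an assumption here.
    return all(abs(a.pos - b.pos) > min_distance for a, b in pairs)


# Mates on different chromosomes: candidate.
print(is_fusion_candidate([Alignment("chr1", 100)], [Alignment("chr2", 100)]))
# Mates 300 bp apart on the same chromosome: not a candidate.
print(is_fusion_candidate([Alignment("chr1", 100)], [Alignment("chr1", 400)]))
```

Note that the second criterion uses `all`, so a multi-mapped read with even one nearby pairing on the same chromosome is not treated as a candidate in this sketch.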
The algorithm then reports peaks of reads that are potentially involved in a fusion event. Adjacent peaks are merged if the distance between them is less than 200 bp (default), and the probability that each peak is derived from the null distribution of peaks (determined by permutation) is reported. False-positive hits are reduced by ignoring alignments that overlap regions masked in the .2bit file. Finally, the peaks are annotated with a transcript model, and a report is generated for pairs of peaks that map to different transcripts.
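The peak-merging step can be illustrated with a short interval-merging sketch. It assumes that peaks are represented as `(chromosome, start, stop)` tuples and that the distance between two peaks is the gap between the end of one and the start of the next; both are assumptions for the example, and the permutation-based probability is not reproduced here.

```python
def merge_peaks(peaks, window_gap=200):
    """Merge same-chromosome peaks separated by less than window_gap bp.

    peaks: iterable of (chrom, start, stop) tuples (hypothetical layout).
    window_gap: merge threshold in bp (200 bp default, as in the text).
    """
    merged = []
    # Sorting by (chrom, start, stop) puts neighboring peaks next to each other.
    for chrom, start, stop in sorted(peaks):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] < window_gap:
            # Close enough to the previous peak: extend it.
            prev_chrom, prev_start, prev_stop = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_stop, stop))
        else:
            merged.append((chrom, start, stop))
    return merged


# Illustrative coordinates only (not real data).
peaks = [
    ("chr1", 100, 200),
    ("chr1", 350, 400),    # 150 bp after the previous peak: merged
    ("chr1", 1000, 1100),  # 600 bp after the merged peak: kept separate
    ("chr2", 150, 250),    # different chromosome: never merged with chr1
]
print(merge_peaks(peaks))
```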
Running Partek Fusion Gene Algorithm within Partek Flow
The Partek algorithm can be invoked on a data node containing aligned paired-end reads (i.e. an Aligned reads node) through the Detect fusion genes link in the Variant detection section of the toolbox (Figure 1).
First, the genome build that should be used for fusion gene detection needs to be specified (Figure 2).
The next dialog (Fusion options; Figure 3) allows several parameters to be optimized. Min distance between ends specifies the minimum distance (bp) between the first-in-pair and second-in-pair reads for the pair to be considered for a fusion event, while Window gap (bp) defines the minimum distance that must separate two neighboring fusion candidates for them to be labeled as independent fusion events. The Annotation model is required to annotate the components of the fusion gene in the output table (see below).
As a result, a new data node (Fusion) will be created (Figure 4). Selecting the Fusion node opens the toolbox and the list of fusion genes can then be reached via the Task report link.
An example of the output, i.e. the Fusion report, is shown in Figure 5. Each row of the table is a potential fusion event, with the columns providing the following information:
- Chromosome1: chromosome ID for the gene on the left side of the fusion;
- Start1: start position of the segment on the left;
- Stop1: stop position of the segment on the left;
- Chromosome2: chromosome ID for the gene on the right side of the fusion;
- Start2: start position of the segment on the right;
- Stop2: stop position of the segment on the right;
- Sample ID: sample in which the fusion event was identified;
- Counts: number of supporting reads;
- p-value: p-value for the chi-squared test comparing the observed number of counts against the expected number (background distribution);
- Gene1: gene on the left side of the fusion;
- Gene2: gene on the right side of the fusion.
All the columns can be sorted by using the arrow buttons in the column headers.
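For readers who want to work with the same columns outside the report view, the sketch below models one report row and ranks rows by p-value. Everything in it is hypothetical: the `FusionRow` class, the `rank_rows` helper, the cutoff values, and the example rows are made up for illustration and are neither Partek Flow output nor recommended thresholds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionRow:
    """One row of the Fusion report, using the columns listed above."""
    chromosome1: str
    start1: int
    stop1: int
    chromosome2: str
    start2: int
    stop2: int
    sample_id: str
    counts: int      # number of supporting reads
    p_value: float
    gene1: str
    gene2: str


def rank_rows(rows, max_p=0.05, min_counts=2):
    """Keep rows passing both (hypothetical) cutoffs, most significant first."""
    kept = [r for r in rows if r.p_value <= max_p and r.counts >= min_counts]
    # Sort by ascending p-value; break ties by descending read support.
    return sorted(kept, key=lambda r: (r.p_value, -r.counts))


# Made-up rows with placeholder gene and sample names.
rows = [
    FusionRow("chr1", 1000, 1200, "chr5", 500, 700, "sample_A", 12, 0.001, "GENE_A", "GENE_B"),
    FusionRow("chr2", 3000, 3100, "chr2", 900000, 900150, "sample_A", 1, 0.20, "GENE_C", "GENE_D"),
    FusionRow("chr3", 40, 260, "chr9", 8000, 8200, "sample_B", 30, 0.0001, "GENE_E", "GENE_F"),
]
for row in rank_rows(rows):
    print(row.gene1, row.gene2, row.counts, row.p_value)
```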