Partek Flow Documentation


Next generation sequencing can produce anywhere from hundreds of thousands to tens of millions of short nucleotide sequences (reads) for a single sample. Each base within a read can also carry a quality score that reflects the sequencer's confidence in that base call. Alignment maps these reads to a reference sequence, providing the start and stop positions of each read within the reference as well as a quality metric for the mapping. This document describes the aligners available within Partek® Flow® and illustrates how to perform alignment against a reference sequence. The result of alignment is an Aligned reads data node that contains the BAM files generated by the alignment.
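To make the start position, stop position, and mapping quality concrete, the following minimal Python sketch parses one SAM-format line (the text form of a BAM record) and derives the read's span on the reference from its CIGAR string. The function name alignment_span and the example record are illustrative assumptions, not part of Partek Flow, and the sketch assumes the read is mapped.

```python
import re

# CIGAR operations that consume reference bases (per the SAM specification).
REF_CONSUMING = set("MDN=X")

def alignment_span(sam_line):
    """Return (read name, reference, 1-based start, 1-based end, MAPQ) for one mapped SAM line."""
    fields = sam_line.rstrip("\n").split("\t")
    name, ref = fields[0], fields[2]
    pos, mapq, cigar = int(fields[3]), int(fields[4]), fields[5]
    # Sum the lengths of every CIGAR operation that advances along the reference.
    ref_len = sum(
        int(length)
        for length, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)
        if op in REF_CONSUMING
    )
    return name, ref, pos, pos + ref_len - 1, mapq

# Hypothetical record: a 10-base read with a 2-base deletion relative to the reference.
record = "read1\t0\tchr1\t100\t60\t5M2D5M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII"
print(alignment_span(record))  # ('read1', 'chr1', 100, 111, 60)
```

The deletion consumes two reference bases without consuming read bases, which is why the 10-base read spans 12 reference positions.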

...

Alignment tools appear in the context-sensitive menu on the right of the screen (Figure 1) when you click on any data node containing FASTQ files. Examples include Unaligned reads, Trimmed reads, and Subsampled reads data nodes.


Figure 1. Showing aligners from a Trimmed reads node

...

GSNAP [5] (Version 2015-12-31(v8)) - A short read aligner (>14bp) using a successive constrained search, capable of handling splicing using either a probabilistic model or a database. Built to handle SNPs in alignment. Good sensitivity but slower speed and higher memory usage. Popular for RNA-seq analysis. (http://research-pub.gene.com/gmap/)

HISAT2 [6] (Version 2.1.0) - A fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) to a population of genomes. HISAT2 is a successor to TopHat2. (https://github.com/DaehwanKimLab/hisat2)

Isaac 2 [7] (Version 15.07.16) - Gapped aligner that finds candidate mapping positions by matching 32-mers from the data to 32-mers from the reference, extending the candidate mappings to the whole read, and selecting the best mapping. Useful for mapping DNA-Seq data with good speed and accuracy but high memory usage. (https://github.com/Illumina/isaac2)

STAR [8] (Version 2.6.1d) - Splice-aware aligner that uses a sequential maximal mappable seed search capable of handling splice junctions. Seeds are subsequently stitched together by local alignment. Capable of handling long reads. Good speed and sensitivity for RNA-seq analysis but with high memory usage. (https://github.com/alexdobin/STAR)

TMAP [9] (Version 5.0.0) - Integrates a set of aligners (including a modified BWA) to identify candidate mapping locations and performs alignment using the Smith-Waterman algorithm. TMAP is optimized to handle the variable length reads and error profiles generated by Ion Torrent data. (https://github.com/iontorrent/TMAP)

TopHat [10] (Version 1.4.1 with Bowtie 1.0.0) - Two-stage aligner that first uses Bowtie to map reads to a reference; reads left unaligned are then mapped to a database of possible splice junctions. Popular for RNA-seq analysis with solid performance, speed, and memory usage. (https://ccb.jhu.edu/software/tophat/index.shtml)

TopHat 2 [11] (Version 2.1.0) - A newer version of TopHat that utilizes Bowtie2 and refined algorithms from TopHat to improve both speed and accuracy. Popular for RNA-seq analysis with solid performance, speed, and memory usage. (https://ccb.jhu.edu/software/tophat/index.shtml)

...

Selecting an aligner will open the task dialog (Figure 2). All aligners have an index selection section where the genome build for the species of interest must be entered for Assembly and the Aligner Index must be specified. Aligner indexes provide a means to break apart the reference sequence for fast sequence matching, and can be created for the whole genome or for regions of interest in a Gene/Feature annotation file. Adding Reference Aligner Indexes or Adding Aligner Indexes based on an Annotation Model can be performed via Library File Management, or the indexes can be built on the fly. If using STAR, TopHat, or TopHat2, a Gene/Feature annotation file will present the option to Align to either the Transcriptome or the Genome and transcriptome (Figure 3). Selecting Transcriptome aligns to the regions specified in the annotation file, and selecting Genome and transcriptome uses the annotation file as a guide for mapping to the genome.
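As a rough illustration of why an index speeds up sequence matching, the sketch below builds a toy k-mer lookup table for a reference sequence and uses it to find candidate start positions for a read. This is a simplified teaching example, not how any aligner in Partek Flow builds its index; production aligners use much more compact structures (for example, FM-index or suffix-array based indexes). The names build_kmer_index and candidate_positions, and the sequences used, are hypothetical.

```python
from collections import defaultdict

def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the 0-based positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index

def candidate_positions(read, index, k):
    """Return sorted candidate read start positions implied by exact k-mer hits."""
    candidates = set()
    for offset in range(len(read) - k + 1):
        for hit in index.get(read[offset:offset + k], []):
            start = hit - offset  # where the read would begin if this k-mer hit is real
            if start >= 0:
                candidates.add(start)
    return sorted(candidates)

reference = "ACGTTGCAAGGCTTACGATC"  # toy reference, built once and reused for every read
index = build_kmer_index(reference, k=4)
print(candidate_positions("GGCTTACG", index, k=4))  # [9]
```

Because the table is built once, each read needs only a handful of dictionary lookups instead of a scan of the whole reference, which is the reason an index must exist before alignment can run.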

 

  


Figure 2. Example of an aligner task dialog for STAR


The Alignment options section is available for all aligners and includes the option to Generate unaligned reads. Selecting this option will create a new FASTQ file for each sample in the project that contains the reads that did not map during the alignment process.
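Conceptually, collecting unaligned reads amounts to keeping the records that the aligner flagged as unmapped and writing them back out as FASTQ. The sketch below shows that idea on SAM-format text, using the unmapped bit (0x4) of the FLAG field defined in the SAM specification. It is an assumption-laden illustration, not Partek Flow's implementation; the function name unmapped_to_fastq and the example records are hypothetical.

```python
def unmapped_to_fastq(sam_lines):
    """Yield a FASTQ record for each SAM line whose FLAG has the 0x4 (unmapped) bit set."""
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
        if flag & 0x4:
            yield f"@{name}\n{seq}\n+\n{qual}\n"

# Hypothetical input: one header line, one mapped read (r1), one unmapped read (r2).
sam = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTGA\tFFFF",
]
print("".join(unmapped_to_fastq(sam)), end="")
```

Only r2 is emitted, since it is the only record with the unmapped flag set.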

In addition, some aligners have additional options specific to that tool. BWA allows for selection of the Alignment algorithm, including backtrack, MEM, and SW (see BWA documentation). GSNAP has multiple options for Alignment mode (see GSNAP documentation). Both TopHat and TopHat2 have the option to select Fusion search (see Gene Fusion Detection).

The Advanced options section allows for the customization of option sets (see Option Set Management), which specify the parameters specific to each aligner. Default parameters are those specified by the developer of each aligner, and parameter details can be found in the documentation for each aligner.

...

5. Wu TD, Nacu S. Fast and SNP-tolerant detection of complex variants and splicing in short reads. Bioinforma Oxf Engl. 2010;26(7):873-881.

6. Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nature Methods. 2015.

7. Raczy C, Petrovski R, Saunders CT, et al. Isaac: Ultra-fast whole genome secondary analysis on Illumina sequencing platforms. Bioinformatics. June 2013:btt314.

8. Dobin A, Davis CA, Schlesinger F, et al. STAR: ultrafast universal RNA-seq aligner. Bioinforma Oxf Engl. 2013;29(1):15-21.

9. Torrent Suite User Documentation: Technical Note - TMAP Alignment (https://ts-pgm.epigenetic.ru/ion-docs/Technical-Note---TMAP-Alignment_9012907.html).

10. Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinforma Oxf Engl. 2009;25(9):1105-1111.

11. Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013;14:R36.


...