The Chromosome view in Partek® Flow® is a visualization tool for next-generation sequencing (NGS) and microarray data. The viewer can display different types of information, including aligned reads, genomic databases (e.g. genes, transcripts, or variants), isoform proportions, and reference sequence.
This chapter will illustrate how to:
- Launch the Chromosome view
- Navigate through the view
- Select data tracks for visualization
- Visualize the results using data tracks
- Annotate the results
- Customize the view
Launching the Chromosome View
Figure 1: Accessing Chromosome view via the toolbox (the content of the Visualisation section depends on the selected data node)
The Chromosome view can be invoked from some data nodes on the Analysis tab, giving a global overview of the results; or from certain Task report or result pages, providing a focused view, i.e. pointing to a specific feature of interest.
On the Analysis tab, selecting a data node containing aligned reads, variants, gene or transcript counts, or feature lists, shows Chromosome view in the Visualization section of the toolbox (Figure 1).
If Partek Flow has no information on the genome build, you will need to provide the species and genome build in a subsequent dialog (not shown). Otherwise, the Chromosome view will come up directly.
A new Chromosome view task node will be added to the canvas (Figure 2). To invoke the viewer, <double-click> on the node (you can also select it and then go to Task report in the toolbox). When invoked in this way, the default visualization in the Chromosome view is the first 100,000 bases of the first chromosome.
Figure 2: Selecting Chromosome view from the toolbox adds a Chromosome view task node to the canvas. To open the view, <double click> on it
Another way to get the Chromosome view is through a Task report; you can launch the viewer by selecting the chromosome icon in the View column (Figure 3).
Figure 3: Accessing the Chromosome view from results table (mouse-over balloon is visible when hovering over the chromosome icon). The image is an example, based on gene expression pipeline
In that case, the Chromosome view will browse directly to the selected genomic location (i.e. a transcript or a variant, depending on the pipeline).
Navigating Through the View
A user can browse through the results by using one of the tools in the navigation bar (on the top of the view; Figure 4). Select tracks tool is the topic of a separate section, while the remaining tools are described below.
Figure 4: Navigation bar of the chromosome view (from left): Select tracks tool, Search box, Position box, mode selector (pointer, zoom, pan), zoom tool, bookmarks, save icon (the position in the figure is an example)
You can use the Search box to zoom to genomic features that are available in the annotation track. Start typing a search term and Partek Flow will show you the first 10 suggestions (Figure 5). To select one, use the arrow keys or mouse, or type the full feature name and hit enter.
Figure 5: Search box of the Chromosome view. To zoom in on a feature, start typing the feature name; Partek Flow will show suggestions available in the corresponding annotation file (the current annotation is visible in the column on the right) (an example is shown)
The Position box enables the user to visualize a region in the genome. Coordinates are accepted in the following format: chromosome:start – end (zero-based). To show an entire chromosome, it is sufficient to enter just the chromosome number. The U-turn icon on the right takes you back to the original view, i.e. resets the zoom level to the view that was shown when the viewer was first opened.
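The accepted coordinate format can be illustrated with a short Python sketch. This is not Partek Flow code: the function name parse_position, the Region type, and the tolerance for thousands separators and spaces are assumptions made for illustration. The sketch only splits the text into its parts and does not convert between coordinate systems.

```python
import re
from typing import NamedTuple, Optional


class Region(NamedTuple):
    chromosome: str
    start: Optional[int]  # None means "entire chromosome"
    end: Optional[int]


# Hypothetical pattern: "chr1:1000-2000", "chr1:1,000 – 2,000", or just "chr1".
_POSITION = re.compile(
    r"^\s*([^:\s]+)\s*(?::\s*([\d,]+)\s*[-–]\s*([\d,]+))?\s*$"
)


def parse_position(text: str) -> Region:
    """Split a 'chromosome:start-end' string into its parts."""
    match = _POSITION.match(text)
    if match is None:
        raise ValueError(f"unrecognised position: {text!r}")
    chromosome, start, end = match.groups()
    if start is None:
        # Only a chromosome was given, so the whole chromosome is meant.
        return Region(chromosome, None, None)
    start_i = int(start.replace(",", ""))
    end_i = int(end.replace(",", ""))
    if start_i > end_i:
        raise ValueError("start must not exceed end")
    return Region(chromosome, start_i, end_i)
```

For example, parse_position("chr1:1000-2000") yields the chromosome name with both coordinates, while parse_position("2") yields chromosome 2 with no coordinates, corresponding to the whole-chromosome case described above.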
Next, the mode selector (Figure 6) helps you to quickly navigate through the results.
Figure 6: Mode selector (from left): pointer mode, zoom mode, pan mode
When pointer mode is activated, the appearance of the cursor will change to an arrow (Figure 7). Pointer mode provides details on any item (e.g. short sequencing read, variant, microarray probe, annotation feature) selected on the canvas. The selected item is highlighted by a green box (Figure 7).
Figure 7: Highlighted item in chromosome view: the microarray probe highlighted by a green box was selected using the pointer mode (microarray probes are used just as an example)
When zoom mode is activated, the appearance of the cursor will change to a plus sign. With the zoom mode on, you can magnify a specific region by positioning the cursor to the left of the area of interest and then <left-click> & drag the mouse to the right of the area of interest (Figure 8). When the viewer refreshes, it will come "closer" to the region that was selected (by halving the number of bases displayed on the screen).
Figure 8: Using <left-click> & dragging mouse to zoom into a region of genome (start magnification shown on the left). After releasing the left mouse button, Partek Flow will zoom into the highlighted region (right panel; an example is shown)
Alternatively, <left-click> on the canvas and Partek Flow will zoom in one level, by halving the number of bases visible on the screen. To zoom out one level Ctrl & <left-click> should be used; as a result, the number of visible bases will be roughly doubled.
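The arithmetic behind one zoom level can be sketched in Python as follows. This is an illustration only: the function name zoom is hypothetical, and keeping the window centred on its midpoint and clamping it at position 0 are assumptions, not documented Partek Flow behaviour.

```python
from typing import Tuple


def zoom(start: int, end: int, factor: float) -> Tuple[int, int]:
    """Scale the visible window by `factor`, keeping its midpoint fixed.

    factor=0.5 halves the number of visible bases (zoom in one level);
    factor=2 doubles it (zoom out one level).
    """
    if end <= start or factor <= 0:
        raise ValueError("need start < end and a positive factor")
    centre = (start + end) / 2
    half_width = (end - start) * factor / 2
    # Assumption: the window cannot extend to the left of position 0.
    new_start = max(0, round(centre - half_width))
    new_end = round(centre + half_width)
    return new_start, new_end
```

Starting from the default window of the first 100,000 bases, zoom(0, 100000, 0.5) gives (25000, 75000), and zoom(25000, 75000, 2) returns to (0, 100000).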
When panning mode is activated, the appearance of the cursor will change to four arrows (Figure 9). <Left-click> and drag the canvas to the left or to the right to move upstream or downstream in the genome, respectively.
Figure 9: Appearance of the mouse cursor when the panning mode is on
Zooming out and in can also be achieved with the zoom tool (Figure 10) by moving the golden slider left or right, respectively, or by selecting the magnifying glass icons (– and +).
Figure 10: Zoom tool
The location of an interesting region can be bookmarked. Selecting the bookmark icon (i.e. the star) opens the dialog (Figure 11). To create a new bookmark, type the name of the region in the Create bookmark box and push Create.
Figure 11: Bookmark dialog. 'B2M' is shown as an example of an existing bookmark
The next time you want to go directly to the same location, select the name of the bookmark (the example in Figure 11 lists B2M - exon #4 as the bookmark name) and Partek Flow will plot the region as defined in the Location column. To remove a bookmark, select the delete icon.
Once the plot has been modified, you can save the current appearance of the canvas by using the save icon. The resulting dialog (shown in Figure 12) enables you to change the image Format (options include: .svg, .png, .pdf), Size, and Resolution. The image will be saved in your Downloads directory.
Figure 12: Save image dialog. Selecting Save saves the current visualisation on the canvas
Selecting Data Tracks for Visualization
Partek Flow plots genomic information on the canvas, which is organized into horizontal sections called tracks. The exact number, type, and presentation of tracks depend on several factors, such as the underlying pipeline, available annotation, and the level of zoom. The tracks are added, removed, or customized via the Select tracks dialog (Figure 13).
Figure 13: Select tracks button opens the Select tracks dialog
The content of the Select tracks dialog depends on the data nodes present on the Analysis tab of the current project (an example is shown in Figure 14). The current pipeline is depicted in the center of the window, while data nodes that can be visualised are highlighted by the colour of their layer. Tracks can be turned on or off by selecting the check boxes in the list of possible tracks (and data nodes) on the right. To uncheck all, use the Clear selection button.
Figure 14: Select tracks dialog (the pipeline is an example). Data nodes that can be visualised are highlighted by the colour of their layer (sky blue in this example). Tracks can be turned on or off by selecting the check boxes in the list of data nodes and respective tracks (right panel)
For ease of use, the pipeline and the list of tracks are linked: hovering over the track list highlights the matching data node in the pipeline and vice versa, i.e. selecting a data node in the pipeline panel highlights the respective node in the track list (Figure 15). Once you have decided on the tracks that should be plotted, push Display tracks to depict them on the canvas.
Figure 15: Selecting data tracks for the visualisation using the Select tracks dialog (an example). Hovering over the track list highlights the matching data node in the pipeline and vice versa, i.e. selecting a data node in the pipeline panel (e.g. Isoform proportion track) highlights the respective node in the track list (Normalised counts and Isoform proportion).
Visualizing the Results Using Data Tracks
The Data tracks section of the Select tracks dialog enables you to specify the tracks for visualization on the canvas. An overview of the available track types is provided in Figure 16. Note that not all tracks are visible at all times and that their presence depends on the zoom level. The tracks can be customised and their appearance changed by using the control panel on the right.
Alignments track
Isoform proportion track
Variants track
Amino acids track
Reads pileup track
Probe intensities track
Figure 16: Data tracks in Chromosome view (examples)
Alignments Track
Alignments track displays a view of alignments present in .bam files in a stacked histogram fashion (similar to Partek® Genomics Suite®). The y-axis shows number of (raw) base calls per position. By default, reads are coloured by sample; the exception is invocation of the chromosome view on a variant table, when the reads are coloured by base calls. The difference is shown in Figure 17.
Reads coloured by sample
Reads coloured by base calls
Figure 17: Alignments track: different colouring options. When colouring reads by sample, the reads are stacked (on top of each other), i.e. in the example above there are more reads in the red sample than in the blue sample
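The idea of a stacked coverage histogram can be illustrated with a minimal Python sketch. It assumes reads are already available as half-open (start, end) intervals per sample; the sample names and read positions are made up, and Partek Flow's own computation from .bam files is not shown here.

```python
from typing import Dict, List, Tuple

Read = Tuple[int, int]  # half-open interval [start, end)


def coverage(reads: List[Read], region_start: int, region_end: int) -> List[int]:
    """Count how many reads cover each position of the region."""
    depth = [0] * (region_end - region_start)
    for read_start, read_end in reads:
        # Clip each read to the displayed region before counting.
        lo = max(read_start, region_start)
        hi = min(read_end, region_end)
        for pos in range(lo, hi):
            depth[pos - region_start] += 1
    return depth


# Hypothetical example data: two samples, region 0..8.
samples: Dict[str, List[Read]] = {
    "red": [(0, 5), (2, 7), (3, 8)],
    "blue": [(1, 4)],
}
per_sample = {name: coverage(reads, 0, 8) for name, reads in samples.items()}

# Stacking the samples on top of each other gives the total column height.
stacked = [sum(column) for column in zip(*per_sample.values())]
```

In this made-up example the red sample contributes more reads than the blue one at almost every position, which is the situation described in the caption of Figure 17.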
Isoform Proportion Track
The Isoform proportion track displays the reads mapped to transcripts and helps to visualize differential expression and alternative splicing, using standard symbols for exons (boxes) and introns (lines connecting the boxes). The size of each transcript is proportional to the number of reads that map to that transcript. The color indicates the samples to which the reads belong. Figure 18 shows a gene with two transcripts in RefSeq database; the top transcript is more abundant than the bottom transcript and is preferentially expressed in the "blue" condition (labeled as 0 uM). The bottom transcript, on the other hand, seems to be expressed at the same level across all three conditions (i.e. 0 uM, 5 uM, 10 uM). The number and structure of transcripts on the plot depend on the transcript model that was used for mapping.
Figure 18: Isoform proportion track: the transcripts are shown as present in the transcript model that was used for mapping. Exons are depicted as boxes. The size of each transcript is proportional to the number of reads mapping to it. Colors indicate samples to which the reads belong
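As a rough illustration of "size proportional to the number of reads", the Python sketch below converts per-transcript read counts into proportions within a gene. The function name, the transcript names, and the choice to normalise within a single gene are assumptions for illustration; the exact scaling used by the viewer is not specified here.

```python
from typing import Dict


def isoform_proportions(read_counts: Dict[str, int]) -> Dict[str, float]:
    """Return each transcript's share of the reads mapped to the gene."""
    total = sum(read_counts.values())
    if total == 0:
        # No reads mapped to any transcript of this gene.
        return {name: 0.0 for name in read_counts}
    return {name: count / total for name, count in read_counts.items()}


# Hypothetical counts for a gene with two transcripts.
proportions = isoform_proportions({"transcript_1": 300, "transcript_2": 100})
```

With these made-up counts, the first transcript would be drawn three times as large as the second.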
Variants Track
The Variants track shows single nucleotide variants (SNVs) and indels, and appears in the Select tracks dialog if the Detect variants task has been performed. Presentation of variants depends on the level of zoom. With low power magnification, SNVs are seen as purple columns and indels are bars (insertions: green bars; deletions: red bars) (Figure 19).
Figure 19: Variants track at low power magnification: SNVs are symbolized by purple columns and an insertion is presented as a green bar (an example is shown). A deletion is presented as a red bar (none is visible on the figure)
Upon zoom-in, SNVs are drawn as pie charts, representing the proportion of each base call at that locus (Figure 20).
Figure 20: Variants track at high power magnification: each SNV is presented as a pie chart and each slice symbolises the relative frequency of each base call (an example is shown). Base call colour codes are given by the track name
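The slices of such a pie chart correspond to the relative frequency of each base call at the locus. A minimal Python sketch of that calculation is shown below; the function name and the input (a plain string of base calls observed at one position) are assumptions for illustration.

```python
from collections import Counter
from typing import Dict


def base_call_fractions(base_calls: str) -> Dict[str, float]:
    """Return the relative frequency of each base call at one locus."""
    counts = Counter(base.upper() for base in base_calls)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {base: n / total for base, n in sorted(counts.items())}


# Hypothetical locus covered by four reads: three call A, one calls G.
fractions = base_call_fractions("AAGA")
```

Here the pie chart would have a three-quarter slice for A and a one-quarter slice for G.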
At higher magnification, insertions are seen as green boxes, with individual inserted bases presented using a pie chart, while deletions look like red boxes and the affected bases are also presented by a pie (Figure 21).
Insertion
Deletion
Figure 21: Variants track at high power magnification: insertion is presented as a green box, deletion is presented as a red box. An example is shown.
Amino Acids
Amino acids track becomes available in the Select tracks dialog after completing the Annotate variants task. The actual appearance of the track depends on the zoom level. With low-power magnification, you will see a message View not available at this zoom level, Please zoom in to view amino acids.
When you zoom closer to the genome, all the amino acids become visible as colored boxes (Figure 22), labeled using the single-letter amino acid code. Alternative amino acids are depicted as additional boxes on top of the consensus sequence.
Figure 22: Amino acids track at high power magnification: consensus amino acid sequence is at the bottom of the track, while a variant is shown on the top (change from Threonine to Proline is shown)
If an amino acid spans two exons, its box will be truncated and the line connecting the exons will be dashed. An example is in Figure 23.
Figure 23: Amino acids track: exon-spanning amino acids indicated by truncated boxes (i.e. Alanine on the left) (an example is shown)
An empty gray box on the top of consensus sequence is used to indicate a STOP codon, which is a consequence of a mutation (Figure 24).
Figure 24: Amino acids track: A variant which is in fact a STOP codon is represented by an empty box, as seen on the top of the G (an example is shown)
Untranslated bases, such as those downstream of a STOP codon, are depicted by lighter shades. Figure 25 shows two transcripts in an amino acid track; the direction is from right to left, so amino acids downstream of a STOP codon (P > G > L) are lightly shaded.
Figure 25: Amino acids track: amino acids downstream of a STOP codon are depicted by lighter shades. STOP codon is represented by "." in the middle, direction is from right to left (an example is shown)
Reads Pileup Track
Reads pileup track plots the short sequencing reads, as present in the .bam file. The track is not on by default (go to Select tracks to turn it on) and its appearance depends on the magnification; if you are zoomed out a message - Zoom in to view individual reads - will be displayed.
Forward strand reads are in sky blue, while reverse strand reads are in parakeet green. If paired-end chemistry was used, the paired reads will be depicted as half reads within a gray rectangle encompassing the pair (Figure 26). Singletons will be depicted as thicker reads.
Figure 26: Reads pileup track: short sequencing reads are represented as bars. Paired-end reads are located within a gray box encompassing both pairs. Singletons, such as that on the top right, are depicted as thicker reads (an example is shown)
If you used a junction-aware aligner (such as TopHat or STAR), the junction reads will be depicted using dashed lines, which connect exon-spanning parts of the same read (Figure 27).
Figure 27: Reads pileup track: junction reads are depicted using dashed lines. A RefSeq track is added at the top, to visualise the exons (an example is shown)
Deleted bases can also be seen on a Reads pileup track, as fat black lines (Figure 28).
Figure 28: Reads pileup track: deleted bases depicted using fat black lines (an example is shown)
Probe Intensities Track
Microarray probes are visualised by the Probe intensities track. The probes are shown as bars and their colour depends on the probe intensity, ranging from white (low) to admiral blue (high) (Figure 29).
Figure 29: Probe intensities track: probes are depicted as bars and their colour reflects the intensity (an example is shown)
As with the Reads pileup track, probes may not be visible with low power magnification and you will see a message - Zoom in to view individual microarray probes.
Annotating the Results
Cytoband Track
By default, the Chromosome view shows a cytoband track at the top of the canvas. If a cytoband file for your genome has not been added to Partek Flow, a warning will appear (Figure 30). In that case, go to the Library File Management page and download or create a cytoband file.
Figure 30: Warning message indicating that Chromosome view cannot be launched because of a missing cytoband file
The red box (Figure 31) indicates the part of the chromosome that is currently depicted on the canvas.
Figure 31: Cytoband track: highlighted part is currently depicted on the canvas (an example is shown)
Reference Genome
The sequence of the reference genome is added to the Chromosome view by default, as long as it has been added to the respective genome on the Library File Management page. However, its presence (or absence) in the viewer depends on the current magnification. At low power, the track is hidden and you will see the message - Track hidden (zoom to view). At high power, on the other hand, the Reference genome track becomes visible (Figure 32) and is supplemented by the genomic coordinates (below the sequence). A vertical guide helps you to align the bases between Aligned reads and Reference genome tracks. Depending on the reference genome file, some bases may be shown in lowercase letters, symbolizing repetitive sequences, or other sequences masked by a tool such as RepeatMasker.
Figure 32: Reference genome track. Numbers beneath the sequence are coordinates
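Lowercase (soft-masked) stretches of a reference sequence can be located with a few lines of Python. The sketch below is an illustration only; the function name and the example sequence are made up, and positions are reported as half-open, zero-based offsets into the given string.

```python
from typing import List, Tuple


def soft_masked_intervals(sequence: str) -> List[Tuple[int, int]]:
    """Return [start, end) offsets of runs of lowercase bases."""
    intervals = []
    run_start = None
    for i, base in enumerate(sequence):
        if base.islower():
            if run_start is None:
                run_start = i  # a masked run begins here
        elif run_start is not None:
            intervals.append((run_start, i))  # the run ended before i
            run_start = None
    if run_start is not None:
        intervals.append((run_start, len(sequence)))
    return intervals


# Hypothetical sequence with two soft-masked stretches.
masked = soft_masked_intervals("ACGTacgtNNac")
```

For this made-up sequence the masked runs are at offsets 4 to 8 and 10 to 12.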
Variant Database
If a variant database file (such as dbSNP) for your genome is present on the Library File Management page, you will be able to include variant annotation track in your visualization (to add a variant database to the viewer, use the control panel on the right).
The variants will be shown adjacent to the Reference genome track (Figure 33). If the database contains no frequency information on alternative alleles, the alleles will be drawn as bars (an example is the SNP on the left in Figure 33). If the frequency information is available, the relative frequency of each variant will be represented by a column (the SNP on the right in Figure 33).
Figure 33: Reference genome track with added variant annotation: single nucleotide variants present in the chosen database are depicted as bars (if no frequency information is available) or columns (columns reflect relative frequency of each alternative allele as stored in the database)
Note that the frequency information for each allele will be parsed out from the chosen database. That information can be retrieved by selecting a variant using the pointer mode and will be shown in the Selection details section of the control panel. Using the example shown in Figure 33, the details of the right database variant can be seen in Figure 34. The most frequent allele at that locus is G (hence, a yellow column is plotted above the Reference genome track), which matches the base call of the reference genome.
Figure 34: Selection details section of the control panel showing details of a SNV, as present in the selected database
If your variant database stores indels, they will be depicted using green (insertion) or red (deletion) symbols (Figure 35) pointing to deleted bases.
Figure 35: Reference genome track with added variant annotation: insertions are shown in green, deletions in red. In this example, an insertion of a single base has been described in the database, between G and T. An adjacent deletion of T and C bases has also been seen before
Other Annotation Tracks
Additional annotation tracks can be added to the viewer with the help of the Select tracks dialog (Figure 14) as long as they have been associated with the genome you are working on in the Library File Management.
A common choice of an additional track is a transcript database, such as RefSeq (Figure 36). All the database entries are displayed, using common depiction of exons as boxes and introns as lines connecting them. Untranslated regions (UTRs) are seen as narrow boxes. The arrows indicate directionality.
Figure 36: Transcript database track: a gene with two transcripts is shown as an example. Exons are plotted as boxes and introns as lines connecting them. Untranslated regions (UTRs) are seen as narrow boxes. The arrows indicate directionality
Customizing the View
Controls
Chromosome view can be customized by using the control panel on the right (Figure 37). The Attribute and Order By controls show options depending on the current project, while the content of the Annotate amino acids control depends on the annotation files associated with the current genome build in the Library File Management. In order for any change to take place, push the Apply button.
Figure 37: Control panel (an example is shown)
The first option, Group data by, specifies the number of Alignments tracks (Figure 38). All will result in only one track, with all the samples on it. Sample creates one track per sample, while Attribute produces one Alignments track per level of the Attribute (i.e. one track per group).
All
Sample
Attribute
Figure 38: Group data by: All creates one Alignments track for the entire project, Sample creates one Alignments track for each sample, Attribute creates one Alignments track for each group (an example is shown)
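The three grouping options can be mimicked with a small Python sketch that decides which samples end up on which Alignments track. The function name tracks_for, the sample names, and the "Tissue" attribute values are hypothetical; this illustrates the logic described above rather than Partek Flow's implementation.

```python
from typing import Dict, List, Optional


def tracks_for(
    grouping: str,
    samples: Dict[str, Dict[str, str]],
    attribute: Optional[str] = None,
) -> Dict[str, List[str]]:
    """Map each track name to the samples drawn on it."""
    if grouping == "All":
        return {"All": sorted(samples)}
    if grouping == "Sample":
        return {name: [name] for name in sorted(samples)}
    if grouping == "Attribute":
        if attribute is None:
            raise ValueError("an attribute name is required")
        groups: Dict[str, List[str]] = {}
        for name in sorted(samples):
            # One track per level of the chosen attribute.
            groups.setdefault(samples[name][attribute], []).append(name)
        return groups
    raise ValueError(f"unknown grouping: {grouping!r}")


# Hypothetical project with three samples and one attribute.
project = {
    "s1": {"Tissue": "liver"},
    "s2": {"Tissue": "liver"},
    "s3": {"Tissue": "muscle"},
}
```

With this made-up project, All yields one track, Sample yields three, and Attribute (using "Tissue") yields two.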
Annotate amino acids by controls the appearance of the Amino acids track and allows you to pick the transcript database that will be used to plot codons (Figure 39). The drop down list shows the databases currently available for the selected genome (additional databases can be added via Library File Management).
Figure 39: Annotate amino acids by: transcript models currently associated with the chosen genome are displayed in the drop-down list and can be used to plot Amino acids track (an example is shown)
Color by option affects the colouring of the Alignments track and Isoform proportion track. When Sample is selected from the drop-down list, individual samples will be shown on the aforementioned tracks, each sample being given a different colour. If attributes were assigned to samples, they will also be visible in the Color by drop-down (Figure 40) and you will be able to highlight levels of the selected attribute (Figure 41).
Figure 40: Color by: the options control colouring of Alignments and Isoform proportion tracks. Sample, Base, and Match options are present by default. If attributes have been assigned to samples, they will appear in the drop-down list. In this example, that is the "Tissue" attribute
Color by Sample
Color by <Attribute>
Figure 41: Difference between Color by Sample and Color by <Attribute>. Color by Sample uses different colours to depict individual samples; Color by <Attribute> uses different colours to depict levels of the selected sample attribute (as present in the Data tab). Alignments and Isoform proportion tracks are shown (an example)
The effect of the option to Color by Base can be seen with high power magnification (Figure 42). Individual base calls are highlighted by different colours. When that option is chosen at low power magnification, all the bases are shown in grey.
Figure 42: Color by Base highlights the base calls by colours. Different colours are visible with high power magnification; otherwise all the bases are shown in gray (an example)
Finally, Color by Match can be used to quickly identify mismatches against the reference genome. A matching base is coloured in blue, while mismatch bases are shown in yellow.
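Conceptually, colouring by match amounts to comparing each base of a read with the reference base at the same position. The Python sketch below shows that comparison for the simplest case; it assumes an ungapped alignment of equal length (no indels or clipping), and the function name is hypothetical.

```python
from typing import List


def match_mask(read: str, reference: str) -> List[bool]:
    """Return True where the read agrees with the reference, False otherwise."""
    if len(read) != len(reference):
        raise ValueError("sketch assumes an ungapped, equal-length alignment")
    # Compare case-insensitively, since reference bases may be lowercase
    # when the sequence is soft-masked.
    return [r.upper() == g.upper() for r, g in zip(read, reference)]


# Hypothetical read with one mismatch at the third base.
mask = match_mask("ACGT", "acTT")
```

In this made-up example, three bases would be drawn in the match colour and the third base in the mismatch colour.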
The maximum of the y-axis of Alignments tracks is set by Read histogram Y axis scales option (Figure 43). When using Independent, the y-axis for each track is set individually, based on the maximum within that sample. On the other hand, Linked uses the maximum across all the samples and uses that value as the maximum for all.
Independent
Linked
Figure 43: Read histogram Y axis scales. When set to Linked, all the tracks have the same Y axis maximum, which depends on the sample with the highest coverage. Using Independent sets Y axis maximum independently for each sample.
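The difference between the two settings can be expressed in a few lines of Python. This is a sketch under stated assumptions: the function name is hypothetical and coverage is given as one list of per-position depths per track.

```python
from typing import Dict, List


def y_axis_maxima(coverage_by_track: Dict[str, List[int]], mode: str) -> Dict[str, int]:
    """Return the y-axis maximum for each Alignments track."""
    peaks = {name: max(depth, default=0) for name, depth in coverage_by_track.items()}
    if mode == "Independent":
        # Each track is scaled to its own highest coverage.
        return peaks
    if mode == "Linked":
        # All tracks share the highest coverage seen in any of them.
        shared = max(peaks.values(), default=0)
        return {name: shared for name in peaks}
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical coverage for two samples.
example = {"sample_A": [2, 10, 4], "sample_B": [1, 3, 2]}
```

With these made-up values, Independent gives maxima of 10 and 3, while Linked gives 10 for both tracks.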
Read histogram type changes the presentation of the Alignments track and should be used in conjunction with the Group data by and Color by controls to get the desired visualisation.
When set to Sum, the Read histogram type shows the sum of base calls at each position, i.e. total coverage per position. Figure 44 shows an Alignments track with three samples. With the Sum option, the number of reads at each base in each sample is added up and displayed. The contribution of individual samples is not visible, since the track is coloured by group (although colouring by sample would also make sense in this example).
Figure 44: Alignments track: total coverage per locus is shown by using "Read histogram type" set to "Sum" and "Group data by" set to <Attribute>
To show average coverage per locus, switch Read histogram type to Average and leave Color by as is (i.e. by group) (Figure 45). With this setting, Chromosome view will calculate the average by dividing the total coverage per locus by the number of samples. Note that using Color by Sample would not make sense here. Although Figure 45 looks quite like Figure 44, the y-axis range is different.
Figure 45: Alignments track: average coverage per locus is shown by using "Read histogram type" set to "Average", "Group data by" set to "Attribute", and "Color by" set to <Attribute>.
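The relationship between the Sum and Average settings can be shown with a minimal Python sketch. The function name and the coverage values are made up; each inner list holds the per-position coverage of one sample in the group.

```python
from typing import List


def combine(depths: List[List[int]], how: str) -> List[float]:
    """Combine per-sample coverage into one histogram per group."""
    totals = [sum(column) for column in zip(*depths)]
    if how == "Sum":
        return totals
    if how == "Average":
        # Total coverage per locus divided by the number of samples.
        return [total / len(depths) for total in totals]
    raise ValueError(f"unknown histogram type: {how!r}")


# Hypothetical group of three samples over three positions.
group = [[2, 4, 6], [4, 2, 0], [0, 0, 3]]
```

For this made-up group, Sum gives 6, 6, 9 and Average gives 2.0, 2.0, 3.0: the shape of the histogram is the same, but the y-axis range differs by a factor equal to the number of samples.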
Finally, the option Overlay is useful if you want to directly compare base counts over several samples (or groups), as each will be represented by a line (i.e. no stacking). The example in Figure 46 is based on microarray data, showing three groups on the same Alignments track. The red group has the highest base counts, while the counts in the blue group are much lower.
Figure 46: Alignments track: coverage per locus is shown by using "Read histogram type" set to "Overlay". Each plot is a single experimental condition ("Group data by" set to "Attribute", "Color by" set to <Attribute>). Lines are rectangular since microarray data is used (an example)
Next, you can use the Transcript label selector to specify labels on the reference transcript track and Isoform proportion track (Figure 47).
Transcript label: Gene
Transcript label: Transcript
Figure 47: Transcript label: setting the control to Gene shows only gene label, while Transcript shows transcript labels. Both transcript database and Isoform proportion tracks are affected
Short sequencing reads can be coloured by strand (Reads pileup color: Strand) or by base (Reads pileup color: Base). Both options are illustrated in Figure 48.
Reads pileup color: Strand
Reads pileup color: Base
Figure 48: Reads pileup color: colouring of the short sequencing reads by Strand or by Base
The Probe color control customizes the appearance of the Probe intensities track (Figure 49). When set to Intensity, the colour of a probe reflects its intensity, using a colour gradient from white (low) to admiral blue (high). Alternatively, when Strand is turned on, probes on the reverse strand are in parakeet green, while probes on the forward strand are in sky blue.
Probe color: Intensity
Probe color: Strand
Figure 49: Probe color: "intensities" colors probes proportionally to their intensity, "strand" uses colors to indicate probe positioning (an example is shown)
If a variant database is available for the current genome, the variants can be added to the Reference genome track (Figure 33). To show the variants, point the Variant database control to the database of your choice.
To change any of the colours on the canvas, use the Customize track colors tool. The resulting dialog will help you to pick another color (the drop-down button opens the colour-picker) (Figure 50).
Figure 50: Customize colors dialog: selecting a drop-down arrow opens the color-picker tool
Track Order
The position of the tracks on the canvas can be controlled by using the Track order tool. If you want a track to be visible all the time, i.e. while scrolling up or down, pin it to the top or to the bottom. Figure 51 shows the Cytoband track pinned to the top of the canvas and the Reference genome track pinned to the bottom of the canvas. To unpin a track, click on the pin icon. The track will be unpinned and a message No tracks are pinned to the top / bottom will appear. To pin a track, drag the track name to the No tracks… message. Alternatively, you can use the green arrows to pin a track. When you mouse over an arrow, the new position of the track will be highlighted on the canvas; click on the arrow to accept.
A track can be hidden (meaning it will not be visible) by selecting the red minus, or unhidden by selecting the green plus icon.
The tracks can be reordered by drag and drop.
Figure 51: Track order tool: To change the position of a track, drag and drop it to the new position. To pin a track to the top / bottom of the canvas, use the up and down arrows. To unpin a track, select the pin icon. A track can be hidden by clicking on the red minus symbol and unhidden by selecting the green plus. A coloured dot by a track name indicates the layer to which the track belongs (an example is shown)
Selection Details
At the bottom of the control panel you will find the Selection details section (Figure 52). It is used to display information on the element selected on the canvas (using the Pointer mode).
Figure 52: Selection details showing information on the element selected on the canvas. The example shows details of a microarray probe. Note the two link-outs ("Browse on UCSC" and "BLAST this sequence")
Additional Assistance
If you need additional assistance, please visit partek.com/PartekSupport to submit a help ticket or find regional phone numbers to call Partek support.
---
Last revision: March 29, 2016
Copyright © 2016 by Partek Incorporated. All Rights Reserved. Reproduction of this material without express written consent from Partek Incorporated is strictly prohibited.