Partek Flow Documentation

Page tree

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

To invoke hierarchical clustering, select a Quantification data node (to cluster samples) containing count data (e.g. Gene counts, Normalized counts, Single cell counts), or a Feature list data node (to cluster significant genes/transcripts) and then click on the Hierarchical clustering / heat map option in the context sensitive menu (Figure 1).

...

Numbered figure captions
SubtitleTextHierarchical clustering as a part of visualisation tools
AnchorNameHierarchical clustering

Image RemovedImage Added

The hierarchical clustering setup dialog (Figure 2) enables you to control the clustering algorithm. Starting from the top, you can choose to Cluster samples, Cluster features (genes/transcripts) or both. By default, if there are less than 3000 samples, the Cluster samples check button is selected. Otherwise the check button is de-selected. If Cluster samples is unchecked, the Ordering option becomes active (see below).


Numbered figure captions
SubtitleTextSetup dialog of hierarchical clustering (default settings)
AnchorNameHierarchical clustering dialog

Image Modified

If you do not want to cluster all the samples, but select a subset based on a specific sample or cell attribute (i.e. group membership), check the Filtering option and set a filtering rule using the drop down lists (Figure 3). The default value of the Filtering option is All samples.


Numbered figure captions
SubtitleTextSpecifying a subset of data for clustering, based on sample attributes. In the example on the figure, only Control samples will be clustered, while the Treatment samples will be omitted
AnchorNameSubset specification

Image Added

Cluster distance metric for samples and features is used to determine how the distance between two clusters will be calculated .(Figure 4):

  • Single Linkage: the distance between two clusters is determined by the distance of the closest objects in the two clusters

...

  • Complete Linkage: the distance between two clusters is equal to the distance between the two furthest members of those clusters

...

  • Average Linkage: the average distance between all the pairs of objects in the two different clusters is used as the measure of distance between the two clusters

...

  • Centroid method: the distance between two clusters is equal to the distance between the centroids of those clusters

...

  • Ward's method: the distance between two clusters is designed to minimize the size of an error measure based on the sum of squares

...


Numbered figure captions
SubtitleTextThe cluster distance metric can be chosen for samples and features
AnchorNameCluster distance metric

Image Added

Point distance metric is used to determine the distance between two rows or columns. For more detailed information about the equations, we refer you to the distance metrics chapter.

If you do not want to cluster all the Cluster samples, but select a subset based on specific sample attributes (i.e. group membership), use the Filtering drop down list (Figure 3). The default value of the Filtering option is All samples.
  box is unchecked, the Ordering option becomes active (Figure 5). Choose an attribute from the drop down list. Click and drag to rearrange the order of groups.


Image Removed
Numbered figure captions
SubtitleTextSpecifying a subset of data for clustering, based on sample attributes. In the example on the figure, only Control samples will be clustered, while the Treatment samples will be omitted
AnchorNameSubset specification
The order of samples in the heatmap can be customized
AnchorNameOrdering

Image Added

You can choose how the data is normalized. Under the Normalization mode dropdown, Standardize (default) will make each column mean as zero and standard deviation as 1 in all features. This is the default normalization and it makes

...

all the features (e.g., genes) have equal weight. Standardized values are also known as Z-scores. The normalization mode Shift will make each column mean as zero. Choose None to perform clustering on the values in the quantified data node.

...

The data can also be Log2 transformed on the fly.

Another way to invoke heatmap without performing clustering is in the data viewer. When you select the Heatmap Image Added icon in the available plots, data nodes that contains two dimension matrix can be use to draw this type of plot.

Heat Map

The output of a Hierarchical clustering task is a heat map (Figure 46) with or without dendrogram depends on whether you perform cluster on samples/cells or features. By default, samples are on rows (sample labels are displayed as seen in the Data tab) and features (genes or transcripts, depending on the input data) on columns. Colors are based on standardized expression values (default selection; performed on the fly), . Dendrograms show clustering of rows (samples) and columns (variables).

...

Numbered figure captions
SubtitleTextHeat map. Samples are on columns, variables (in this example: genes) on columns, and the heat map is based on standardised gene expression values
AnchorNameHeat map

Another way to invoke heatmap without performing clustering is in data viewer. When select Heatmap (Image Removed) icon in the available plots, data nodes that contains two dimension matrix can be use to draw this type of plot.

Depending on the resolution of your screen and the number of samples and variables that need to be displayed, some binning may be involved. If there are more than samples/genes than pixels, values of neighboring components will be averaged together. When you zoom in to certain level, you will see each cell represent one sample/gene. To zoom, use the mouse wheel to zoom in / out. To move the map around when zoomed in, press down the right button of the mouse and drag the map.  

There are 5 sections in the Configuration panel (Figure 57): Content, Heatmap, Dendrograms, Annotations and layout Layout. Click on the triangle () to expand the section to configure.

...

Numbered figure captions
SubtitleTextHeat map controls
AnchorNameHeat map controls

Content

...

Content contains the value of which matrix data is This section controls the data source used to draw heatmap the values in the plotheatmap. Heatmap The heatmap is a color presentation representation of the values in the matrix selected. Most of In addition to color, you can also use the Size drop down list to size by a set of values. Most of the data nodes contains contain only one matrix, which you might not need to use Size by configuration to use the same value represent the same information as color by (Figure 6so the only options available in the Size drop down are None or Matrix (Figure 8).


Numbered figure captions
SubtitleTextWhen data node contains only one matrix, not need to use size by configuration
AnchorNamecontent of one matrix

However, if a data node contains multiple matrix information, e.g. if you perform descriptive statistics on cluster groups for every gene like mean, std. dev, percent of detected cells, etc, each stats result will be in a separate matrix in the output data node. You might want to use color of the component in the heatmap to represent one type of stats (like mean of the groups) and size of the component to represent the information from a different statistic information (like std. dev) (Figure 79). 


Numbered figure captions
SubtitleTextWhen a data node containing more than one matrices, different matrices can be used to color and size the components
AnchorNameContent of two matrices

Heatmap

...

Heatmap This section is used to configure the color and shape of the components in the heatmap (Figure 810) .


Numbered figure captions
SubtitleTextConfigure heatmap color and shape
AnchorNameheatmap

In the color palette horizontal bar, the left side color represent low value, represents the lowest value and the right side color represent high represents the highest value in the matrix data represented. By default, there are 3 tabs () present the min: minimum, middle, and max maximum color value of the default range calculated on the matrix. Left-click on the middle tab and drag left/right can change the middle the value this tab represents. When If you left-click on the middle tab and release the mouseonce, you can change the color and value this tab represents ((Figure 911).  Click on () to remove this tab.

...

Click on the little triangle next to the color square () to choose a color to represent the value by clicking on a color or type in the RGB color of the color, click OK (Figure 1012).


Numbered figure captions
SubtitleTextSelect a color from the color palette
AnchorNamechoose color

Image Modified

The min and max tabs cannot be dragged or removed, however, when left . If you left-click on itthem, you can choose a different color. To change the min and max color value to be represented, use the Range section. When When you click on the Palette bar, you can add a new color tab between min and max (Figure 1113). Adding a tab can be useful when there is an outlier value in the data, you . You can use a different color to represent different value range.

...

To change the min and max value threshold values represented , in the Range section, click by the color palette, click on the toggle switch () to make it bluein the Range card, and specify the value values in the text boxboxes.   

The shape of the heatmap cell (component) can be configured either as a rectangle or circle by selecting the radio button in the shape section.

Dendrograms

...

Only if If cluster analysis is performed on samples and/or features, the result will be displayed in dendrograms. By default, the dendrograms are all colored in black (Figure 4 6).

The color of the dendrograms can be configured (Figure 1214.)

Numbered figure captions
SubtitleTextConfigure the color of dendrograms
AnchorNamedendrogram

...

When the By cluster  in the Row/Column color drop-down list,  the number of clusters needs to be specified (Figure 1315).  The top N number of clusters will be in N different colors. 

Numbered figure captions
SubtitleTextConfigure the dendrogram color based on number of cluster specified
AnchorNamedendrogram color by cluster

Annotations

...

This section allows you to add sample or cell level annotation to the viewer. First, make sure to choose the correct data node which contains the annotation information you would like to use by clicking the circle (). All project level annotation will be available on all data node nodes in the pipeline (Figure 14).16). Choose an attribute from the Row annot drop-down list. Multiple attributes can be chosen from the drop-down list and can be reordered by clicking and dragging the groups below the drop-down list.


Numbered figure captions
SubtitleTextConfigure sample or cell annotation
AnchorNamerow annotation

Each attribute is represented as an annotation bar next to the heatmap, different color represent different group . Different colors represent the different groups in the attribute. The  The width of the bar can be adjusted by Block size slider , when the show Show labels toggle switch is on , the . The text of the label font size can be changed by specifying pixel size. The Fill blocks toggle switch adds or removes color from the annotation labels.

Layout

To change the orientation of the plot, click on the toggle switch of Data Transpose  () in the Layout section (Figure 1517).


Numbered figure captions
SubtitleTextConfigure the layout of the plot
AnchorNamelayout

Axes labelsAxis titles, dendrograms, and the legend can be turned on or off by clicking the relevant toggle switches. The axes label font size can be changed by specify specifying the number of pixels.

In-plot controls

In the mode bar of the viewer , besides selection buttons, (in the top right corner of the heatmap), there is a special button just for hierarchical clustering–Flip Flip mode button (), in this . In flip mode, when you can click on a line in the dendrogram which represents a cluster branch in dendrogram, it will swap and the location of the two legs of the branch will be swapped.

In move mode (Image Added), you can left-click and drag to move around the heatmap if you are zoomed in. Left-clicking once on the heatmap or on a dendrogram branch will select that rows/column. 

In selection mode (Image Added), you can click and drag to select a range of rows, columns, or components.








Additional assistance



Rate Macro
allowUsersfalse

...