Partek Flow Documentation


...


General 

How to create a project?

To create a project, first transfer files to the Partek Flow server, then import the files into your project using the import data wizard. Here is the video and more information.

...

Click your avatar in the top right corner of the Partek Flow interface, choose Settings in the menu, and select Lists from the left panel of the Components section. Lists can also be generated from result tables using the "Save as managed list" button. For more information, please click here.

Can I repeat a task and everything downstream of it, while changing only one/a few parameters?

Yes, click on the rectangular task whose parameters you want to change. In the context-specific menu on the right, under Task actions, select ‘Rerun with downstream tasks’. This will bring you to the task setup page, where you can edit the parameters for the task. Then click Finish to run the task with the new parameters. The tasks downstream of it will be initiated automatically.

What can I use to identify cells that are actively expressing genes within a gene list?

...

We recommend cleaning up projects and removing library files that you do not need, then removing the orphaned files. You can also export analyzed projects and save them on an external machine; when you need them again, you can import them to the server. Please see this information for more details related to: Project management, Removing library files, and Orphaned files. Right click on a data node to delete files that are no longer needed from a project (e.g. fastqs from project pipelines that have been analyzed); you will not be able to perform tasks from this node once the files are deleted.

How do I add library files if I am not studying human or mouse?

To add a new assembly, click on Settings -> Library files. From the Assembly drop-down list, select Add assembly and specify the species. If the species name is not in the list, choose Other and type in the species name and the assembly version (multiple assembly versions can exist for one species, e.g. hg19 and hg38 for Homo sapiens). You need to add the reference file, which is a .fasta file containing sequence information. Once the reference file is added, you can build any aligner index to perform the alignment task.

The annotation model is a file containing feature locations. This file can be used to quantify to an annotation model in RNA-Seq analysis, or to annotate variants or peaks in a DNA-Seq or ATAC-Seq/ChIP-Seq data analysis pipeline. The file format should be .gtf, .gff, or .bed.

We recommend looking for the species files on the Ensembl website. There is no need to unzip or save these files to your local machine; instead, right click and copy the link address of the specific file (not a link to a folder). For more details, here is the documentation chapter: Library File Management - Partek® Documentation

Are genome coordinates 1-based or 0-based?

Genome coordinates for annotation models stored in Partek Flow are 1-based, start-inclusive, and stop-exclusive. This means that the first base position is numbered one, the start coordinate of a feature is included in the feature, and the stop/end coordinate is not included in the feature. These are the genome coordinates that are printed in various task reports and output files when an annotation model is involved in the task. When custom annotation files are added to Partek Flow, the genome coordinates are converted into this format. The coordinates are converted back if necessary for a specific task. The image below shows how the genome coordinates vary between different annotation formats.

[Image: genome coordinates in different annotation formats]
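
As a rough illustration of the convention described above, the sketch below converts coordinates from two common formats into a 1-based, start-inclusive, stop-exclusive form. It assumes the standard definitions of BED (0-based, start-inclusive, end-exclusive) and GTF/GFF (1-based, inclusive at both ends). The function names are hypothetical and this is not Partek Flow's own conversion code.

```python
def from_bed(start, end):
    """BED is 0-based, start-inclusive, end-exclusive: shift both values by one."""
    return start + 1, end + 1


def from_gtf(start, end):
    """GTF/GFF is 1-based and inclusive at both ends: only the stop moves by one."""
    return start, end + 1


def feature_length(start, stop):
    """With an exclusive stop, the length is simply stop minus start."""
    return stop - start


# The first 100 bases of a chromosome, written in each source format.
bed_feature = from_bed(0, 100)
gtf_feature = from_gtf(1, 100)
print(bed_feature, gtf_feature, feature_length(*bed_feature))
```

Both inputs describe the same 100 bases, so both conversions yield the same start and stop.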

Can I add transgenes to my reference files? 

...

Left click to select the data node you want to export. At the bottom of the task menu there will be an option to "Download data".

Visualization

How do I order my heatmap by the cell types?

If you would like specific groups (e.g. cell types) in a certain order, do not perform Hierarchical clustering on these cells; instead, choose to assign order, then click and drag to reorder the groups. If you want to remove a group, you can exclude it in the filtering section. You can still perform Hierarchical clustering on the features if you would like to. Hierarchical clustering will force the heatmap to cluster, and you would need to click the dendrogram nodes to switch the order. Click here for more information.

...

For a multi-sample project, all of the downstream tasks will be run separately if 'Split by sample' was checked when performing the PCA task. Different samples can be displayed by 'Sample' using the 'Misc' section in the Axes card. To show different samples side by side, click 'Duplicate plot' first, then use the 'Sample' option to switch the samples.

...

Yes, the default settings can be modified by clicking "Configure" in the Advanced settings during task setup, then changing the "feature scaling" option to "none" to plot the values without scaling. For more information related to the heatmap, click here.

Why don't I see Flip mode on the heatmap? Why can't I download all of the data after zooming?

The Flip mode and download all data options are disabled if there are more than 2.5 million values (rows x columns) in the heatmap.
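
To check ahead of time whether a heatmap is likely to exceed this limit, the row and column counts can be multiplied. The sketch below is only an illustration of the stated threshold; the function name is hypothetical.

```python
MAX_HEATMAP_VALUES = 2_500_000  # limit stated above (rows x columns)


def flip_and_download_available(n_rows, n_cols):
    """Return True when the heatmap holds no more than 2.5 million values."""
    return n_rows * n_cols <= MAX_HEATMAP_VALUES


print(flip_and_download_available(2000, 1000))  # 2,000,000 values
print(flip_and_download_available(5000, 1000))  # 5,000,000 values
```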

Statistics

How do I label gene names on a volcano plot?

By default, genes are selected if the p-value is <= 0.05 and |fold change| >= 2, and they are labeled when fewer than 2000 genes are selected. To change the label, click the Style button in the Configure section and choose a gene annotation field from the Label by drop-down list. If the number of selected genes is less than or equal to 100, Partek Flow will try to spread out the labels as much as possible to display them clearly. If the number of selected genes is more than 100, labels will be placed next to the selected genes, and they will overlap where genes are close together. If more than 2000 genes are selected, no labels will be displayed.

To select dots manually, click any blank space to clear the current selection, then use the selection mode buttons on the vertical bar in the upper-right corner of the plot.
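
The labeling rules above can be summarized as a small decision function. This is a sketch of the described behavior only; the function name and the returned strings are hypothetical, and the behavior at exactly 2000 selected genes is not specified in the text, so treating it as labeled is an assumption.

```python
def label_mode(n_selected):
    """Describe how labels are drawn for a given number of selected genes."""
    if n_selected > 2000:
        return "none"      # too many genes: no labels are displayed
    if n_selected > 100:
        return "adjacent"  # labels sit next to the genes and may overlap
    return "spread"        # labels are spread out for readability


for n in (50, 500, 5000):
    print(n, label_mode(n))
```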

Why do I get "?" for FDR p-values in my DESeq2 result?

...

Click here for troubleshooting other differential analysis models and "?" results

What is fold change?

Fold change indicates the extent of increase or decrease in feature expression in a comparison. In Partek Flow, fold change is in linear scale (even if the input data is in log scale). It is converted from the ratio, which is the LSmean of group one divided by the LSmean of group two in your comparison. When the ratio is greater than 1, fold change is identical to the ratio; when the ratio is less than 1, fold change is -1/ratio. There is no fold change value between -1 and 1. When the ratio/fold change is 1, there is no change between the two groups.

The Log ratio option in Partek Flow is converted from the ratio; this value is comparable to log fold change in some other tools.
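
The ratio-to-fold-change conversion described above can be written out as a short sketch. The function name is hypothetical, and the example LSmean values are made up purely for illustration.

```python
def fold_change(ratio):
    """Convert a linear-scale ratio (LSmean group 1 / LSmean group 2) to fold change."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    # Ratios of 1 or more are reported unchanged; ratios below 1 become -1/ratio.
    return ratio if ratio >= 1 else -1.0 / ratio


lsmean_group1, lsmean_group2 = 50.0, 100.0  # illustrative values only
ratio = lsmean_group1 / lsmean_group2       # 0.5
print(fold_change(ratio))                   # -2.0, i.e. a two-fold decrease
print(fold_change(4.0))                     # 4.0, i.e. a four-fold increase
```

Because every ratio below 1 maps to a value of -1 or lower, no result can fall strictly between -1 and 1.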

Can I label a Volcano plot with gene names?

Yes, go to Style in the Data Viewer and make sure Gene name is selected under "Labeling". Next, go to the in-plot selection tools (right side of the graphic) and use any of the selection tools to select the genes that you would like to label. You can use ctrl or shift to select multiple populations at once. For more information on the Volcano plot, click here.

In the Volcano plot, what does the inconclusive group mean?

By default, Partek Flow uses p-value <= 0.05 and |fold change| >= 2 as the significance cutoffs. Genes that meet both the p-value and the fold change cutoff are significantly up- or down-regulated. Genes that meet only one criterion are called inconclusive. Genes that meet neither criterion are not significant. To change the cutoffs, click the Statistics button in the Configure section of the left control panel. Click the Style button to change the colors of the significance categories.
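
The three categories can be expressed as a small classifier. This is a sketch of the rules stated above using the default cutoffs; the function name and category strings are hypothetical and are not taken from Partek Flow.

```python
def significance_category(p_value, fold_change, p_cutoff=0.05, fc_cutoff=2.0):
    """Classify a gene from its p-value and linear-scale fold change."""
    passes_p = p_value <= p_cutoff
    passes_fc = abs(fold_change) >= fc_cutoff
    if passes_p and passes_fc:
        # Both criteria met: direction comes from the sign of the fold change.
        return "up-regulated" if fold_change > 0 else "down-regulated"
    if passes_p or passes_fc:
        return "inconclusive"  # exactly one criterion met
    return "not significant"   # neither criterion met


print(significance_category(0.01, 3.5))   # up-regulated
print(significance_category(0.01, -1.2))  # inconclusive
print(significance_category(0.40, 1.1))   # not significant
```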

What is the difference between FDR and FDR step up?

...

Yes, you can use the Compute biomarkers task to compare one subgroup at a time to all of the others combined. An alternative option is to set up the differential analysis model in this way; for more information please see the information here for each model. 

...

I downloaded gene counts from the output data node generated by the Quantify to annotation model task. Why can't I find my genes of interest?

In the Quantify to annotation model dialog, by default, Partek Flow filters features based on the total count across all of the samples, and only features with a total count greater than 10 are reported. If you want to report all of the genes in the annotation file, change the Filter features value to 0.
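
To see why a gene can be missing from the output, the default filter can be mimicked on a small count table. This is a minimal sketch, not Partek Flow's implementation: the function name and the gene names are hypothetical, and treating a value of 0 as "filter disabled" is an assumption based on the statement that 0 reports all genes.

```python
def filter_features(counts, min_total=10):
    """Keep features whose total count across all samples exceeds min_total.

    counts maps a feature name to its list of per-sample counts.
    """
    if min_total == 0:
        # Assumption: a value of 0 turns the filter off, so every feature is kept.
        return dict(counts)
    return {gene: c for gene, c in counts.items() if sum(c) > min_total}


counts = {
    "geneA": [120, 98, 143],  # total 361: kept
    "geneB": [3, 4, 3],       # total 10: dropped, not greater than 10
    "geneC": [0, 0, 0],       # total 0: dropped
}
print(sorted(filter_features(counts)))               # ['geneA']
print(sorted(filter_features(counts, min_total=0)))  # ['geneA', 'geneB', 'geneC']
```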

Biological Interpretation

...

What is the difference between GSEA and Gene Set Enrichment?

In Partek Flow, GSEA should be performed on a sample/cell and feature matrix data node (e.g. normalized count data). GSEA is used to detect a gene set or pathway that is significantly different between two groups. Gene set enrichment should be performed on a filtered gene list; it is used to identify overrepresented gene sets or pathways based on the filtered gene list using Fisher's exact test. The input data is a filtered list using gene names.

What is the enrichment score shown in the Gene Set Enrichment report?

The enrichment score shown in the enrichment report is the negative natural log of the enrichment p-value derived from Fisher's exact test. The higher the enrichment score, the more overrepresented the list of genes is in the gene set of a GO/pathway category.
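
To relate an enrichment score back to a p-value, the transformation can be sketched as follows. Only the negative natural log step is shown; the p-value is taken as given rather than computed from a Fisher's exact test, and the function name is hypothetical.

```python
import math


def enrichment_score(p_value):
    """Negative natural log of an enrichment p-value."""
    if not 0 < p_value <= 1:
        raise ValueError("p-value must be in the interval (0, 1]")
    return -math.log(p_value)


for p in (0.05, 0.001):
    print(p, round(enrichment_score(p), 2))
```

A p-value of 0.05 corresponds to a score of about 3, so scores above 3 indicate p-values below 0.05.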

In a KEGG pathway, genes can be colored by fold change, p-value, etc. How are the gene statistics calculated?

For Gene set enrichment analysis, only genes from the input data node (filtered gene list) will be colored in the KEGG pathway gene network, using the statistics in the data node.

During GSEA (or Gene set ANOVA) computation, we also perform ANOVA on each gene based on the selected attribute, independent of the GSEA computation (at the gene set level). The results of this ANOVA are only used to color the genes in the KEGG gene network. If GSEA is computed using another database, e.g. GO, we don't compute ANOVA on each gene, since the GO database doesn't have gene network information.

When should I use GSEA or Gene set ANOVA?

...