Partek Flow Documentation

...

Reference sequences are the chromosome/scaffold/contig DNA sequences for a species. A reference sequence file is typically in FASTA or 2bit format. The reference sequence of a species is used for aligner index creation, for variant detection against the reference, and for visualization in the Chromosome view.
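
As a rough illustration of what a FASTA reference file contains, the sketch below reads FASTA text into a dictionary and reports the length of each sequence. The function name read_fasta and the toy sequences are assumptions made for this example; this is not Partek Flow code, and 2bit files (a binary format) are not handled.

```python
import io


def read_fasta(handle):
    """Return {sequence_name: sequence} from FASTA-formatted text.

    Hypothetical helper for illustration only. The sequence name is taken
    as the first whitespace-delimited token after '>'.
    """
    sequences = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                sequences[name] = "".join(chunks)
            name = (line[1:].split() or [""])[0]
            chunks = []
        elif name is not None:
            chunks.append(line)  # sequence lines may wrap across rows
    if name is not None:
        sequences[name] = "".join(chunks)
    return sequences


# Toy data, not a real reference sequence.
example = ">chr1 toy contig\nACGTAC\nGTTA\n>chr2\nGGCC\n"
lengths = {name: len(seq) for name, seq in read_fasta(io.StringIO(example)).items()}
print(lengths)
```

A check like this can be useful for confirming that chromosome names in a custom reference match those used in the annotation files before adding it to a library.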

...

Gene set files are required for biological interpretation analyses (e.g. GO enrichment). They group genes according to their biological function. Gene set files must be in GMT format, a tab-delimited text format in which each row represents one gene set. The first column of a GMT file is the GO ID or gene set name. The second column is an optional text description. Subsequent columns are the gene symbols that belong to the gene set. Gene ontologies for various model organisms are available for automatic download from the Partek repository (source: geneontology.org). Because gene ontologies are frequently updated, geneontology.org is checked for updates quarterly. You can check for recent updates to the Partek repository on the Partek website.
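
The GMT layout described above can be sketched in a few lines of Python. This is a minimal example assuming a tab-delimited file; the function name parse_gmt and the two example rows are illustrative and are not taken from the Partek repository.

```python
import io


def parse_gmt(handle):
    """Return {gene_set_name: (description, [gene symbols])} from GMT text.

    Hypothetical helper for illustration only. Each row is one gene set:
    column 1 is the GO ID or gene set name, column 2 is an optional
    description, and the remaining columns are gene symbols.
    """
    gene_sets = {}
    for line in handle:
        line = line.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) < 2:
            raise ValueError("GMT row needs at least a name and a description column")
        name, description = fields[0], fields[1]
        genes = [gene for gene in fields[2:] if gene]  # drop empty trailing cells
        gene_sets[name] = (description, genes)
    return gene_sets


# Illustrative rows; the second row leaves the optional description empty.
example = (
    "GO:0006915\tapoptotic process\tTP53\tBAX\tCASP3\n"
    "GO:0007049\t\tCDK1\tCCNB1\n"
)
gene_sets = parse_gmt(io.StringIO(example))
for name, (description, genes) in gene_sets.items():
    print(name, description or "(no description)", len(genes))
```

Note that the description column is still present when it is empty, so the gene symbols always start in the third column.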

Variant annotations

Variant annotation databases are collections of known genomic variants (e.g. single nucleotide polymorphisms). If you have performed a variant detection study, detected variants can be searched against variant annotation library files to see if the detected variants are known from previous studies. Furthermore, you can validate detected variants against 'gold-standard' variant annotation library files. Variant annotation files are typically in VCF format.
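
To make the idea of searching detected variants against a variant annotation file concrete, here is a minimal sketch that indexes the known variants in a VCF by chromosome, position, reference allele, and alternate allele, then looks up detected variants by exact match. The function name load_variant_keys, the variant IDs, and the toy records are assumptions for this example. It does not describe how Partek Flow performs the search, and it ignores details such as allele normalization.

```python
import io


def load_variant_keys(handle):
    """Return {(chrom, pos, ref, alt): variant_id} from VCF text.

    Hypothetical helper for illustration only. Uses the first five fixed
    VCF columns (CHROM, POS, ID, REF, ALT); multi-allelic records are
    split into one key per alternate allele.
    """
    known = {}
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip meta-information, header, and blank lines
        chrom, pos, variant_id, ref, alt = line.rstrip("\r\n").split("\t")[:5]
        for allele in alt.split(","):
            known[(chrom, int(pos), ref, allele)] = variant_id
    return known


# Toy annotation file; the IDs are made up for this example.
known_vcf = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chr1\t100\trs_example1\tA\tG,T\t.\t.\t.\n"
    "chr2\t200\trs_example2\tC\tT\t.\t.\t.\n"
)
known = load_variant_keys(io.StringIO(known_vcf))

# Detected variants as (chrom, pos, ref, alt); None means "not previously known".
detected = [("chr1", 100, "A", "T"), ("chr2", 200, "C", "G")]
annotated = {variant: known.get(variant) for variant in detected}
for variant, variant_id in annotated.items():
    print(variant, variant_id)
```

In this toy data the first detected variant matches a known record, while the second shares a position with a known variant but carries a different alternate allele, so it is reported as not found.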

...