Partek Flow Documentation

...

How to import a study from GEO/ENA

If a project is publicly available in both the GEO and ENA databases, you can automatically import the associated FASTQ files, sample attributes, and project details into Partek Flow.

...

The format of a BioProject ID is PRJNA followed by one to six digits (e.g., PRJNA291540). The format of a GEO ID is GSE followed by one to five digits (e.g., GSE71578).
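The ID formats described above can be checked before starting an import. The following is a minimal sketch, not part of Partek Flow: the function name `classify_study_id` and the patterns are assumptions derived only from the description in this section.

```python
import re
from typing import Optional

# Patterns mirror the formats stated in this section (illustrative only):
#   BioProject ID: "PRJNA" followed by one to six digits
#   GEO ID:        "GSE" followed by one to five digits
BIOPROJECT_RE = re.compile(r"PRJNA[0-9]{1,6}")
GEO_SERIES_RE = re.compile(r"GSE[0-9]{1,5}")


def classify_study_id(study_id: str) -> Optional[str]:
    """Return "bioproject", "geo", or None if the ID matches neither format."""
    text = study_id.strip()
    if BIOPROJECT_RE.fullmatch(text):
        return "bioproject"
    if GEO_SERIES_RE.fullmatch(text):
        return "geo"
    return None
```

A check like this only confirms that the ID is well formed; it does not confirm that the study exists or is publicly available.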

  • Click Import project 

Imported projects from GEO/ENA

The data tab will be populated with sample information. Each sample is named by its GSM ID. Attributes and attribute levels are drawn from the GEO sample characteristics information (Figure 2).

 

Figure 2. GEO import populates the data tab

Project details are added to the Project settings tab (Figure 3). The project name is the first 54 characters of the BioProject title. The project description is the BioProject description with the GEO and BioProject IDs appended.

 

Figure 3. Project details from ENA
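As a rough illustration of how the project name and description are derived, the sketch below truncates a title to 54 characters and appends the two IDs to the description. The function name `build_project_fields` and the exact layout of the appended IDs are assumptions made for this example; Partek Flow's actual formatting may differ.

```python
from typing import Tuple


def build_project_fields(
    title: str,
    description: str,
    geo_id: str,
    bioproject_id: str,
    name_limit: int = 54,
) -> Tuple[str, str]:
    """Return (project_name, project_description) for an imported study."""
    # The project name is the first `name_limit` characters of the title.
    name = title[:name_limit]
    # Assumption: the IDs are appended on a new line in this layout.
    full_description = (
        f"{description}\nGEO ID: {geo_id}; BioProject ID: {bioproject_id}"
    )
    return name, full_description
```

A title shorter than 54 characters is returned unchanged, since slicing never pads the string.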

The Analyses tab will include an Unaligned reads data node once the data download has started (Figure 4). FASTQ files are downloaded from the ENA BioProject page, and the download time depends on the size of the data.

 

Figure 4. FASTQ files will be added as an Unaligned reads data node in the Analyses tab

Common Issues

 

 

 

Error message - Data not found, please check the project ID and try again

The study must be publicly available in both GEO and ENA. If it is missing from either database, the project import will not succeed.

The project was imported, but there is no data

If an ENA project exists but its FASTQ files are not available through ENA, the project will be created, but no data will be imported.
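One way to check whether ENA exposes FASTQ files for a project is to query ENA's public Portal API file report directly. The sketch below is independent of Partek Flow and does not describe how Partek Flow performs its download; the helper names (`filereport_url`, `runs_with_fastq`, `fetch_filereport`) are hypothetical.

```python
import csv
import io
import urllib.parse
import urllib.request
from typing import Dict, List

ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession: str) -> str:
    """Build a file report URL listing the runs and FASTQ links of a study."""
    query = urllib.parse.urlencode({
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp",
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def runs_with_fastq(tsv_text: str) -> Dict[str, List[str]]:
    """Map each run accession to its FASTQ links (empty list if none)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    runs: Dict[str, List[str]] = {}
    for row in reader:
        # Multiple files for one run are separated by semicolons.
        links = [u for u in (row.get("fastq_ftp") or "").split(";") if u]
        runs[row["run_accession"]] = links
    return runs


def fetch_filereport(accession: str, timeout: float = 30.0) -> str:
    """Download the file report as TSV text (requires network access)."""
    with urllib.request.urlopen(filereport_url(accession), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, `runs_with_fastq(fetch_filereport("PRJNA291540"))` returns a dictionary keyed by run accession. Runs that map to an empty list have no FASTQ files listed in ENA, which would be consistent with a project that is created without data.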

Something is missing or the import failed

A variety of other issues and irregularities can cause an import to fail or only partially succeed. These include, but are not limited to, a BioProject with multiple associated GSE IDs, incomplete information on the GEO or ENA page, and a GEO or ENA project that is not publicly available.

...