MicroRNAs are short, noncoding RNAs that regulate gene expression. In their mature form, they bind to messenger RNA molecules and ultimately mark them for degradation.
MicroRNAs are transcribed in the nucleus and exported to the cytoplasm as precursor microRNAs of ~120-160 nucleotides in length. The precursor is then processed in the cytoplasm into mature microRNAs of ~18-22 nucleotides in length.
When microRNAs are sampled using NGS, the reads are generally shorter than 50 nucleotides in length. In cases where mature microRNAs are sampled, they can be as short as 18 nucleotides.
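Because these reads are unusually short, Partek® Flow® provides a MicroRNA pipeline that users can download and use to analyze microRNA-Seq data. This guide describes how to analyze microRNA sequencing data using the MicroRNA pipeline. It covers the tasks associated with the pipeline, importing microRNA annotation libraries, preparing a new project, importing and running the pipeline, and options for downstream processing.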
The MicroRNA pipeline is composed of a series of tasks that assess and trim microRNA-Seq reads, align them to a reference genome, and quantify them against a microRNA annotation database (Figure 1). The pipeline is transferable to any species.
Tasks associated with the pipeline
The Pre-alignment QA/QC task will assess the quality of the reads imported into the project. The results of this task can be used to inform additional processing that you can do to improve the results of your microRNA-Seq analysis.
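As a rough illustration of the kind of per-read summary a pre-alignment QA/QC step reports, the sketch below computes a read-length histogram and the mean Phred score of each read from FASTQ text. It is not Partek Flow's implementation: the function names are hypothetical, and the Phred+33 quality encoding is an assumption.

```python
import io
from collections import Counter


def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # skip stray blank lines
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line
        qual = handle.readline().strip()
        yield header.strip()[1:], seq, qual


def summarize(handle, offset=33):
    """Return (read-length histogram, list of mean Phred scores per read).

    offset=33 assumes Phred+33 encoding (an assumption, not stated in this guide).
    """
    lengths = Counter()
    mean_quals = []
    for _name, seq, qual in read_fastq(handle):
        lengths[len(seq)] += 1
        scores = [ord(c) - offset for c in qual]
        mean_quals.append(sum(scores) / len(scores) if scores else 0.0)
    return lengths, mean_quals


# Two made-up reads: a 22-nucleotide read and a very short 4-nucleotide read.
seq = "TGAGGTAGTAGGTTGTATAGTT"
example = io.StringIO(f"@r1\n{seq}\n+\n{'I' * len(seq)}\n@r2\nACGT\n+\nII5I\n")
lengths, mean_quals = summarize(example)
```

A length histogram dominated by reads of roughly 18 to 22 nucleotides is what you would expect when mature microRNAs were sampled.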
The Trim bases task trims reads on both ends with a Phred quality score cutoff of 30. The task is also set up to discard reads shorter than 15 nucleotides in length. This ensures that only high-quality reads are analyzed, while making sure that short mature miRNA reads are not discarded.
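The trimming rule described above can be sketched as follows. This is a minimal illustration, assuming Phred+33 encoding and a simple strategy that removes bases from each end until a base at or above the cutoff is reached; the exact algorithm used by the Trim bases task may differ, and `trim_read` is a hypothetical name.

```python
def trim_read(seq, qual, min_quality=30, min_length=15, offset=33):
    """Trim low-quality bases from both ends of a read.

    Returns (trimmed_seq, trimmed_qual), or None if the trimmed read is
    shorter than min_length. offset=33 assumes Phred+33 encoding.
    """
    scores = [ord(c) - offset for c in qual]
    start, end = 0, len(scores)
    # Advance from the 5' end past bases below the cutoff.
    while start < end and scores[start] < min_quality:
        start += 1
    # Retreat from the 3' end past bases below the cutoff.
    while end > start and scores[end - 1] < min_quality:
        end -= 1
    if end - start < min_length:
        return None
    return seq[start:end], qual[start:end]


# A made-up read with two low-quality bases at the 5' end and one at the 3' end.
core = "TGAGGTAGTAGGTTGTATAGTT"
trimmed = trim_read("AA" + core + "C", "##" + "I" * len(core) + "#")
too_short = trim_read("ACGTACGTAC", "I" * 10)
```

With a minimum length of 15, the 22-nucleotide core survives trimming, while the 10-nucleotide read is discarded even though its quality is high.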
Reads are then aligned using the Bowtie algorithm, which works best with short, high-quality reads. Its advanced settings have been modified to decrease the seed length and mismatch allowances of the alignment, which further accommodates short reads. You can align to any reference genome of your choice, but keep in mind that the reference genome must have a compatible microRNA annotation. The Post-alignment QA/QC task then assesses how well the reads aligned to the reference genome.
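For readers who want to see what a shorter seed and a lower mismatch allowance look like outside Partek Flow, the sketch below builds a Bowtie 1 command line. The values used (seed length 10, at most 1 seed mismatch) are illustrative assumptions only; this guide does not state the values the pipeline actually uses. The file and index names are hypothetical, and the positional index argument may need adjusting for your Bowtie version.

```python
import shlex


def bowtie_command(index_prefix, reads_fastq, output_sam,
                   seed_length=10, seed_mismatches=1, threads=4):
    """Build (but do not run) a Bowtie 1 command for short reads.

    seed_length and seed_mismatches are assumed example values,
    not the settings used by the Partek Flow pipeline.
    """
    if not 0 <= seed_mismatches <= 3:
        raise ValueError("Bowtie 1 accepts 0 to 3 seed mismatches (-n)")
    return [
        "bowtie",
        "-q",                         # input reads are FASTQ
        "-l", str(seed_length),       # seed length
        "-n", str(seed_mismatches),   # maximum mismatches allowed in the seed
        "-p", str(threads),           # number of alignment threads
        "-S",                         # write SAM output
        index_prefix, reads_fastq, output_sam,
    ]


cmd = bowtie_command("hg38", "trimmed.fastq", "aligned.sam")
command_line = shlex.join(cmd)  # shell-safe string form of the command
```

Building the argument list separately from running it makes the settings easy to review before any alignment is started.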
Finally, the Quantify to transcriptome task will quantify the aligned reads against a microRNA annotation database. After completion of the pipeline, downstream analyses such as detecting differentially expressed genes can be performed.
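Conceptually, quantification assigns aligned reads to annotated microRNA features and counts them. The sketch below shows a deliberately simple version of that idea: a read is counted for every feature it overlaps on the same chromosome and strand. It is an assumption-laden illustration, not the algorithm used by the Quantify to transcriptome task, and the feature names and coordinates are made up.

```python
from collections import Counter, namedtuple

# Coordinates are 1-based and inclusive, as in GFF-style annotations.
Feature = namedtuple("Feature", "chrom start end strand name")
Read = namedtuple("Read", "chrom start end strand")


def quantify(reads, features):
    """Count reads overlapping each feature on the same chromosome and strand."""
    counts = Counter({f.name: 0 for f in features})
    for read in reads:
        for f in features:
            same_locus = read.chrom == f.chrom and read.strand == f.strand
            overlaps = read.start <= f.end and read.end >= f.start
            if same_locus and overlaps:
                counts[f.name] += 1
    return counts


# Hypothetical annotation and alignments.
features = [
    Feature("chr1", 100, 121, "+", "miR-A"),
    Feature("chr1", 500, 521, "-", "miR-B"),
]
reads = [
    Read("chr1", 100, 121, "+"),  # exact match to miR-A
    Read("chr1", 105, 122, "+"),  # partial overlap with miR-A
    Read("chr1", 500, 520, "+"),  # wrong strand for miR-B
    Read("chr2", 100, 121, "+"),  # wrong chromosome
]
counts = quantify(reads, features)
```

The nested loop is fine for a handful of features; a real implementation would index features by chromosome and position.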
Importing MicroRNA Annotation Libraries
The pipeline requires a suitable microRNA annotation database for quantification. We recommend quantifying against the miRBase annotation database (www.mirbase.org). This database incorporates deep sequencing datasets and community input to produce high-confidence microRNA annotations for many different organisms. Prior to running the MicroRNA pipeline, make sure that the appropriate microRNA annotation database is uploaded into Partek Flow.
Downloading annotations available from Partek
For the model organisms human, mouse and rat, Partek provides miRBase gene/feature annotations for download. You can include these annotations using Partek Flow’s Library file management settings.
To download the annotation, go to Settings>Library file management. Select the assembly for your organism using the Assembly drop down menu and click the Add library file button (Figure 2). Note that miRBase annotations are only supplied for one assembly per organism. For instance, the human miRBase 21 annotation is only supplied for assembly hg38.
For additional information, refer to the Library File Management section of the Partek Flow user manual.
Downloading annotations from miRBase and other sources
In addition, miRBase supplies annotations for ~200 additional species including plants, some viruses and other animals. You can download these annotations in .gff3 form directly from the miRBase ftp site: ftp://mirbase.org/pub/mirbase/CURRENT/genomes/
To upload these into Partek Flow, add the corresponding genome assembly using the Library file management section. A list of miRBase-compatible genome assemblies can be found on the miRBase website: http://www.mirbase.org/help/genome_summary.shtml
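Before uploading a downloaded .gff3 file, it can be useful to check which features it contains. The sketch below parses tab-delimited GFF3 lines and extracts mature microRNA records. The example records are made up for illustration, and the assumption that mature entries use the feature type `miRNA` with `ID`, `Name` and `Derives_from` attributes should be checked against the file you actually download.

```python
def parse_gff3(lines, feature_type="miRNA"):
    """Return a list of dicts for GFF3 records of the given feature type."""
    records = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue  # malformed line or a different feature type
        # Column 9 holds semicolon-separated key=value attributes.
        attrs = dict(item.split("=", 1) for item in cols[8].split(";") if "=" in item)
        records.append({
            "chrom": cols[0],
            "start": int(cols[3]),
            "end": int(cols[4]),
            "strand": cols[6],
            "id": attrs.get("ID"),
            "name": attrs.get("Name"),
            "derives_from": attrs.get("Derives_from"),
        })
    return records


# Made-up records in GFF3 layout (not real miRBase entries).
example_lines = [
    "##gff-version 3\n",
    "chr1\t.\tmiRNA_primary_transcript\t1000\t1080\t.\t+\t.\tID=PRE_EXAMPLE;Name=example-mir-1\n",
    "chr1\t.\tmiRNA\t1010\t1031\t.\t+\t.\tID=MAT_EXAMPLE;Name=example-miR-1-5p;Derives_from=PRE_EXAMPLE\n",
]
mature = parse_gff3(example_lines)
```

Passing `feature_type="miRNA_primary_transcript"` to the same function would list the precursor records instead.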
If available, you can also upload your own custom microRNA annotations. Make sure these are in .gtf/.gff/.bed format.
To upload your annotation, go to Library file management and select the Assembly corresponding to the annotation. Then select the Add library file button. Select Gene/feature annotation in the Library type drop down menu. Under the Annotation model drop down menu, select Add annotation model and type a name for the annotation model. Select the correct annotation model and pick microRNA under the Annotation data type drop down menu (Figure 3).
Preparing a New Project for the Pipeline
To run the MicroRNA pipeline, you must have unaligned reads uploaded to your project. Refer to the Partek Flow documentation on “Creating a new project” for more information on how to create a new project and import samples for microRNA-Seq analysis. This can be downloaded at: http://www.partek.com/resources-partek-flow
Importing and Running the Pipeline
Importing the microRNA pipeline
To import the Partek MicroRNA pipeline, open your microRNA-Seq project and click Import a pipeline on the bottom left of the Analyses tab. Select the Partek website radio button to list all pipelines available on our website. Find the MicroRNA pipeline and click the Import pipeline button (Figure 4).
Running the pipeline
To run the MicroRNA pipeline, open your microRNA-Seq project with the unaligned data already imported. Select the Unaligned reads data node. This will open a context sensitive menu on the right (Figure 5). Under the Pipelines section, select MicroRNA-Bowtie. It will then ask you to specify the assembly and annotation database you want to use.
Options for Downstream Processing
After the pipeline is completed, you can perform differential expression analysis using either our Gene Specific Analysis (GSA) algorithm or ANOVA. This will provide you with a list of differentially expressed genes between different groups.
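To give a sense of what an ANOVA-based comparison computes for a single microRNA, the sketch below calculates a one-way ANOVA F statistic from expression values grouped by condition. This is a textbook formula shown for illustration only; it is not Partek's GSA algorithm or its ANOVA implementation, and the input values are made up.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of values."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    if k < 2 or n <= k:
        raise ValueError("need at least 2 groups and more observations than groups")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation explained by group membership.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Residual variation inside each group.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical normalized expression of one microRNA in two groups.
f_stat = one_way_anova_f([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
```

A larger F statistic means the difference between group means is large relative to the variation within groups; converting it to a p-value requires the F distribution with (k - 1, n - k) degrees of freedom.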
You can also export your quantification (gene lists) and import them into Partek® Genomics Suite®. To export your data, select either the Quantification or Feature list data nodes and click Download data located at the bottom of the context sensitive menu (Figure 6). The zipped file you download can then be opened in Partek Genomics Suite by going to File>Import>Zipped project.
Note that you do not need to unzip the downloaded file. Some web browsers, such as Safari, automatically unzip downloaded zip files and place the original zip file in the Trash. We recommend disabling this behavior. In Safari, this is done by going to Preferences and unchecking the Open “safe” files after downloading box under the General tab.
In Partek Genomics Suite, you can then:
- Combine microRNAs with their mRNA targets
- Find overrepresented microRNA target sets
- Correlate microRNA and mRNA data
- Obtain biological interpretation of targets
Refer to our microRNA tutorials for instruction on how to perform these analyses: http://partek.com/Tutorials/microarray/microRNA/miRNA_tutorial.pdf
- Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10:R25.
- Kozomara A, Griffiths-Jones S. 2014. miRBase: annotating high confidence microRNAs using deep sequencing data. Nucleic Acids Res. 42(D1):D68-D73.
If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.