In a previous section of this tutorial, a spreadsheet named unexplained_regions was generated. This spreadsheet contains locations where reads map to the genome but are not annotated by the transcript database, in this case, RefSeqGene. The unexplained_regions spreadsheet is of particular interest because it may contain novel findings, such as transcribed regions that are not yet annotated.

It is recommended that you annotate with the same database that you used for mRNA quantification.

The closest overlapping feature and the distance to it are now included as columns 7. Overlapping Features and 8. Nearest Features in the unexplained_regions spreadsheet.
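To make the meaning of the Overlapping Features and Nearest Features columns concrete, the following Python sketch shows one way such an annotation could be computed for a single region. It is only an illustration under stated assumptions, not the software's actual implementation: the Feature type, the annotate_region function, the gene names, and the coordinates are all hypothetical, and coordinates are assumed to be inclusive.

```python
from typing import List, NamedTuple, Optional, Tuple


class Feature(NamedTuple):
    """Hypothetical annotated feature (e.g. one gene from a transcript database)."""
    name: str
    chrom: str
    start: int  # inclusive start coordinate (assumption)
    end: int    # inclusive end coordinate (assumption)


def annotate_region(
    chrom: str, start: int, end: int, features: List[Feature]
) -> Tuple[List[str], Optional[str], Optional[int]]:
    """Return (overlapping feature names, nearest feature name, distance).

    Distance is 0 when the nearest feature overlaps the region; otherwise it
    is the gap in bases between the region and the feature. Features on other
    chromosomes are ignored.
    """
    overlapping: List[str] = []
    nearest: Optional[str] = None
    nearest_dist: Optional[int] = None

    for f in features:
        if f.chrom != chrom:
            continue
        if f.start <= end and start <= f.end:
            # The two intervals share at least one base.
            overlapping.append(f.name)
            dist = 0
        elif f.end < start:
            # Feature lies entirely before the region.
            dist = start - f.end
        else:
            # Feature lies entirely after the region.
            dist = f.start - end
        if nearest_dist is None or dist < nearest_dist:
            nearest, nearest_dist = f.name, dist

    return overlapping, nearest, nearest_dist


# Made-up example data: two genes on chr1.
example_features = [
    Feature("GENE_A", "chr1", 1000, 2000),
    Feature("GENE_B", "chr1", 5000, 6000),
]

# A region just past the end of GENE_A: no overlap, GENE_A is nearest.
print(annotate_region("chr1", 2100, 2300, example_features))
```

A region that reports no overlapping feature but a small distance to the nearest feature is a candidate for an extension of that feature, while a region that falls inside a gene's span but outside its annotated exons would be intronic. A linear scan is used here for clarity; for genome-scale data an interval tree or sorted-coordinate search would be the more usual choice.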

Right-click a row header and select Browse to Location to view the reads mapped to that region of the chromosome. For this tutorial, a couple of genes have been selected to show regions that are located after a known gene or in the intron of a gene.

This peak may represent an extended exon (Figure 5). 

While RefSeq was used to identify overlapping features, the choice of which database to use will depend on the biological context of your experiment. For example, you may wish to utilize promoter or miRNA databases if you are interested in regulation of expression.