With a list of amplified or deleted regions in our cohort in hand, one of the more interesting questions to ask is which genes have recurrent amplifications or deletions in the data set. To address this question, we can use the Find overlapping genes function to either add a column to our region list with the genes present in each region or create a new list of genes that overlap the regions.

Here, we will create a new spreadsheet with genes that overlap the regions in the amplified_or_deleted spreadsheet.
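Conceptually, building a list of genes that overlap a set of regions is an interval-overlap join. The following is a minimal Python sketch of that idea, not Partek Genomics Suite's actual implementation. The `Interval` type, the `find_overlapping_genes` helper, the example coordinates, and the half-open interval convention are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical record type: a named interval on a chromosome.
# Coordinates are assumed to be half-open, i.e. [start, end).
Interval = namedtuple("Interval", ["chrom", "start", "end", "name"])


def overlaps(a, b):
    """Return True if two intervals share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def find_overlapping_genes(regions, transcripts):
    """Return (transcript name, region name) pairs for every overlap.

    A simple nested loop is used for clarity; a real tool would
    typically use a sorted sweep or an interval tree instead.
    """
    pairs = []
    for region in regions:
        for tx in transcripts:
            if overlaps(region, tx):
                pairs.append((tx.name, region.name))
    return pairs


# Made-up example data, for illustration only.
regions = [
    Interval("chr1", 100, 500, "region_1"),
    Interval("chr2", 0, 50, "region_2"),
]
transcripts = [
    Interval("chr1", 400, 900, "TX_A"),
    Interval("chr1", 600, 700, "TX_B"),
    Interval("chr2", 10, 20, "TX_C"),
]

pairs = find_overlapping_genes(regions, transcripts)
```

In this sketch each output pair plays the role of one row in the resulting gene list: a transcript together with the region it overlaps.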

To determine what regions in the genome correspond to genes, we need to select an annotation database (Figure 2).

 

Partek Genomics Suite offers a variety of annotation sources, including RefSeq, Ensembl, and GENCODE; custom annotations can also be used. If the database file has not been downloaded, the message "Download required. Click OK to download the file" will be listed in red beneath the annotation. Selecting OK will automatically download the file and then run the task.

A new spreadsheet, gene-list, is created as a child spreadsheet of amplified_or_deleted (Figure 3).

 

Each row corresponds to a transcript and the columns are as follows:

1-3. Genomic coordinates of the transcript

4. Coding strand

5. Transcript ID

6. Gene Symbol

7. Minimum distance of the region to the transcription start site, with positive values indicating downstream and negative values indicating upstream

8. Percent overlap with gene indicates how much of the transcript sequence overlaps the region

9. Percent overlap with region indicates how much of the region is overlapped by the transcript 

10+. Correspond to columns 1+ of the segment-analysis spreadsheet
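To make columns 7 to 9 more concrete, the sketch below shows one way such values could be computed for a single region and transcript. It is an illustration under stated assumptions, not the software's documented algorithm: the function names are hypothetical, the transcription start site is assumed to be the transcript start on the + strand and the transcript end on the - strand, the distance is assumed to be 0 when the region contains the transcription start site, and "downstream" is assumed to mean in the direction of transcription.

```python
def percent_overlaps(region_start, region_end, tx_start, tx_end):
    """Return (percent overlap with gene, percent overlap with region)."""
    # Number of shared bases; zero if the intervals do not overlap.
    shared = max(0, min(region_end, tx_end) - max(region_start, tx_start))
    pct_gene = 100.0 * shared / (tx_end - tx_start)
    pct_region = 100.0 * shared / (region_end - region_start)
    return pct_gene, pct_region


def distance_to_tss(region_start, region_end, tx_start, tx_end, strand):
    """Signed minimum distance from the region to the TSS.

    Positive means the region is downstream of the TSS, negative
    means upstream, relative to the direction of transcription
    (an assumed convention).
    """
    tss = tx_start if strand == "+" else tx_end
    if region_start <= tss <= region_end:
        return 0  # assumed: region containing the TSS has distance 0
    if region_start > tss:
        # Region lies at higher coordinates than the TSS.
        gap = region_start - tss
        return gap if strand == "+" else -gap
    # Region lies at lower coordinates than the TSS.
    gap = tss - region_end
    return -gap if strand == "+" else gap


# Made-up example: region 100-500, transcript 400-900.
pct_gene, pct_region = percent_overlaps(100, 500, 400, 900)
# 100 shared bases: 20% of the 500-base transcript, 25% of the 400-base region.
```

The two percentages answer different questions: a small region inside a long gene gives a low percent overlap with gene but a 100% overlap with region, while a large region that spans a whole gene gives the reverse.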

This gene-list spreadsheet is gene-centric and enables genomic integration. For example, GO and Pathway enrichment can be invoked directly on the gene-list spreadsheet to detect functional groups affected by copy number changes. While these options are not detailed in this tutorial, please feel free to explore them on your own. For more information on enrichment analysis, you can consult the Gene Ontology Enrichment tutorial.