Detect regions with copy number variation

Starting with copy number estimates for each marker (either taken directly from the vendor’s input file or calculated previously), the goal is to derive a list of regions where adjacent markers share the same copy number.

Two algorithms are available for copy number region detection: Genomic Segmentation and Hidden Markov Model (HMM). Both look for trends across multiple adjacent markers. The genomic segmentation algorithm identifies breakpoints in the data, i.e., changes in copy number between two neighboring regions. The HMM algorithm looks for discrete changes between whole-number copy number states (e.g., 0, 1, 2 … with no upper limit) and finds regions with those numbers of copies. The HMM therefore performs better on homogeneous samples where copy numbers can be anticipated, such as clinical syndromes with underlying copy number aberrations. Genomic segmentation is preferable for heterogeneous samples with unpredictable copy numbers, such as cancer, because tumor biopsies often contain “contaminating” healthy tissue and cancer cells can carry heterogeneous copy number aberrations.

The per-marker copy numbers created in the previous step will be used to detect the genomic regions with copy number variation, i.e., to identify amplifications and deletions across the genome.

Select the IC_IntensitiesSNP6pairedcopynumber spreadsheet in the Analysis tab
Select Detect Amplifications and Deletions from the Copy Number Analysis section of the workflow (Figure 1)

Figure 1. Invoking Detect Amplifications and Deletions

The Detect Amplifications and Deletions dialog will give you the option to choose Genomic Segmentation or HMM Region Detection (Figure 2).

Figure 2. Select a method for detecting amplifications and deletions

Select Genomic Segmentation
Select OK

The Genomic Copy Number Segmentation dialog gives options for setting segmentation parameters and configuring the region report (Figure 3).

Figure 3. Configuring the Genomic Copy Number Segmentation dialog

Set Minimum genomic markers to 50
Leave the rest of the parameters set to default values as shown (Figure 3)
Select OK

The Genomic Segmentation task is divided into two steps. In the first step, each region is compared to an adjacent region to determine whether both have the same average copy number and whether a breakpoint can be inserted. The task does this by using a two-sided t-test to compare the average intensities of the adjacent regions and then checking whether the resulting p-value is below the specified P-value threshold. The genomic size of a region is defined by the number of genomic markers in the region (Minimum genomic markers), while the magnitude of the significant difference between two regions is controlled by Signal to noise, which can be thought of, in simplified terms, as the difference in copy number between the regions. If the t-test is significant, the copy number of the region differs significantly from that of its nearest neighbors. However, a second step is needed to determine whether the difference is due to amplification or deletion. In this second step, two one-sided t-tests are used to compare the mean copy number in the region with the expected (normal) diploid copy number. For a detailed explanation of the genomic segmentation procedure, please consult our Genomic Segmentation white paper. For more detailed information about fine-tuning the parameters of your copy number analysis, please consult our guide, Optimizing Copy Number Segmentation.
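The two-step logic described above can be illustrated with a minimal Python sketch. This is not the software's implementation: the function names, the default threshold of 0.001, the diploid value of 2.0, the use of a Welch-style t statistic, and the normal approximation to the t distribution (used here only to stay within the standard library) are all assumptions made for illustration. The Minimum genomic markers and Signal to noise parameters are not modeled.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """t statistic for the difference in means of two marker sets."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def one_sample_t(x, mu):
    """t statistic for the mean of x against a fixed expected value mu."""
    return (mean(x) - mu) / math.sqrt(variance(x) / len(x))


def classify_region(region, neighbor, p_threshold=0.001, diploid=2.0):
    """Hypothetical helper illustrating the two steps.

    p_threshold and diploid are placeholder values, not product defaults.
    P-values use a normal approximation, which is an assumption that is
    only reasonable for regions with many markers.
    """
    z = NormalDist()

    # Step 1: two-sided test between the region and its neighbor.
    t_adj = welch_t(region, neighbor)
    p_two_sided = 2 * (1 - z.cdf(abs(t_adj)))
    if p_two_sided >= p_threshold:
        return "no breakpoint"

    # Step 2: two one-sided tests against the expected diploid copy number.
    t_dip = one_sample_t(region, diploid)
    p_amplification = 1 - z.cdf(t_dip)  # mean above diploid
    p_deletion = z.cdf(t_dip)           # mean below diploid
    if p_amplification < p_threshold:
        return "amplification"
    if p_deletion < p_threshold:
        return "deletion"
    return "unchanged"


# Synthetic copy number values for 50 markers around a given mean.
def synthetic(center):
    return [center + 0.1 * ((i % 5) - 2) for i in range(50)]


print(classify_region(synthetic(3.0), synthetic(2.0)))
print(classify_region(synthetic(1.0), synthetic(2.0)))
```

The synthetic data are deterministic so the example is reproducible. Real marker values are noisier, and the result depends strongly on the threshold chosen.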

The resulting spreadsheet, segmentation, shows one row per genomic region per sample (Figure 4). The columns provide the following information:

1-4. Genomic location of the region

5. Sample ID

6. Description of the copy number change

7. The length of the region (in base pairs)

8. The number of markers in the region

9. Marker density in the region (region length in base pairs divided by the number of markers)

10. Geometric mean of the copy number of all the markers in the region

11. Minimum p-value of the one-sided t-tests comparing the copy number in column 10 with the diploid range

Figure 4. Viewing the segmentation spreadsheet
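To make the derived columns concrete, the following minimal sketch computes the values described for columns 7 through 10 from a region's coordinates and its per-marker copy numbers. The function name, the dictionary keys, and the convention that region length equals stop minus start are assumptions for illustration and are not taken from the software.

```python
from statistics import geometric_mean


def summarize_region(start, stop, copy_numbers):
    """Hypothetical helper mirroring columns 7-10 of the region report.

    Assumes length = stop - start; the software's exact coordinate
    convention is not stated in this tutorial.
    """
    length_bp = stop - start
    n_markers = len(copy_numbers)
    return {
        "length_bp": length_bp,                      # column 7
        "n_markers": n_markers,                      # column 8
        "bp_per_marker": length_bp / n_markers,      # column 9
        "geometric_mean_cn": geometric_mean(copy_numbers),  # column 10
    }


summary = summarize_region(1000, 2000, [1.0, 4.0])
print(summary)
```

Note that the geometric mean requires strictly positive copy number values, so markers with an estimate of zero would need separate handling in a sketch like this.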

If desired, you can use the Merge Adjacent Regions tool, found under Tools in the main toolbar, to combine similar regions.


Additional Assistance

If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.
