Each row of the spreadsheet (Figure 1) corresponds to a single sample. The first column contains the names of the .idat files, and the remaining columns are the array probes. The table values are β-values, which represent the proportion of methylation at each site on a scale from 0 (unmethylated) to 1 (fully methylated). A β-value is calculated as the ratio of the methylated probe intensity to the overall intensity at each site, where the overall intensity is the sum of the methylated and unmethylated probe intensities.
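As an illustration of the ratio described above, the following Python sketch computes a β-value from a pair of probe intensities. The function name and the example intensities are hypothetical, and the sketch follows the definition given here literally; it is not meant to reproduce how the software computes its values internally.

```python
def beta_value(methylated: float, unmethylated: float) -> float:
    """Return the beta-value: methylated intensity over total intensity."""
    total = methylated + unmethylated
    if total <= 0:
        # The ratio is undefined when no signal was measured at the site.
        raise ValueError("total probe intensity must be positive")
    return methylated / total


# Hypothetical intensities for one CpG site in one sample.
print(beta_value(3000.0, 1000.0))  # 0.75, i.e. about 75% methylated
```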
Before we can perform any analysis, the study samples need to be organized into their experimental groups.
The Create categorical attribute dialog allows us to create groups for a categorical attribute. By default, two groups are created, but additional groups can be added.
Sample ID | Cell Type |
---|---|
GSM2452106_200483200025_R04C01 | B cells |
GSM2452107_200483200021_R01C01 | B cells |
GSM2452108_200483200021_R02C01 | B cells |
GSM2452109_200483200025_R06C01 | B cells |
GSM2452110_200483200025_R07C01 | B cells |
GSM2452111_200483200021_R08C01 | B cells |
GSM2452112_200483200021_R06C01 | B cells |
GSM2452113_200483200021_R04C01 | B cells |
GSM2452114_200483200025_R01C01 | LCLs |
GSM2452115_200483200025_R03C01 | LCLs |
GSM2452116_200483200021_R03C01 | LCLs |
GSM2452117_200483200025_R05C01 | LCLs |
GSM2452118_200483200025_R02C01 | LCLs |
GSM2452119_200483200021_R07C01 | LCLs |
GSM2452120_200483200021_R05C01 | LCLs |
GSM2452121_200483200025_R08C01 | LCLs |
There should now be two groups with eight samples in each group (Figure 3).
Sample ID | Gender |
---|---|
GSM2452106_200483200025_R04C01 | Female |
GSM2452107_200483200021_R01C01 | Female |
GSM2452108_200483200021_R02C01 | Male |
GSM2452109_200483200025_R06C01 | Female |
GSM2452110_200483200025_R07C01 | Female |
GSM2452111_200483200021_R08C01 | Female |
GSM2452112_200483200021_R06C01 | Female |
GSM2452113_200483200021_R04C01 | Male |
GSM2452114_200483200025_R01C01 | Female |
GSM2452115_200483200025_R03C01 | Female |
GSM2452116_200483200021_R03C01 | Male |
GSM2452117_200483200025_R05C01 | Female |
GSM2452118_200483200025_R02C01 | Female |
GSM2452119_200483200021_R07C01 | Female |
GSM2452120_200483200021_R05C01 | Female |
GSM2452121_200483200025_R08C01 | Male |
There should now be two groups with four samples in Male and twelve samples in Female (Figure 4).
Two new columns have been added to spreadsheet 1 (Methylation) with the cell type and gender of each sample (Figure 5).
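If the sample annotations are also kept outside the application, the expected group sizes can be checked with a quick tally. The Python sketch below transcribes the two attribute columns from the tables above in row order; the variable names are illustrative only.

```python
from collections import Counter

# Attribute values transcribed from the two tables above, in row order
# (samples GSM2452106 through GSM2452121).
cell_type = ["B cells"] * 8 + ["LCLs"] * 8
gender = [
    "Female", "Female", "Male", "Female", "Female", "Female", "Female", "Male",
    "Female", "Female", "Male", "Female", "Female", "Female", "Female", "Male",
]

# Tally each attribute on its own, then the combination of both.
cell_type_counts = Counter(cell_type)
gender_counts = Counter(gender)
cross_counts = Counter(zip(cell_type, gender))

print(dict(cell_type_counts))  # {'B cells': 8, 'LCLs': 8}
print(dict(gender_counts))     # {'Female': 12, 'Male': 4}
for (cells, sex), n in sorted(cross_counts.items()):
    print(f"{cells:8s} {sex:6s} {n}")
```

The combined tally shows that each cell type contains six female and two male samples, so the two attributes are balanced with respect to each other.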
An alternative metric for measuring methylation levels is the M-value. β-values can be converted to M-values using the following equation:
M-value = log2( β / (1 - β))
An M-value close to 0 for a CpG site indicates a similar intensity between the methylated and unmethylated probes, which means the CpG site is about half-methylated. Positive M-values mean that more molecules are methylated than unmethylated, while negative M-values mean that more molecules are unmethylated than methylated. As discussed by Du and colleagues, the β-value has a more intuitive biological interpretation, but the M-value is more statistically valid for the differential analysis of methylation levels.
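The relationship between the two scales can be checked with a few values. The following Python sketch implements the equation above along with its algebraic inverse; the function names are hypothetical. A β-value of exactly 0 or 1 has no finite M-value, so the sketch simply rejects those inputs. How a given application handles such values is not covered here.

```python
import math


def beta_to_m(beta: float) -> float:
    """Convert a beta-value to an M-value: log2(beta / (1 - beta))."""
    if not 0.0 < beta < 1.0:
        # log2 of 0 or of a division by zero is undefined.
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))


def m_to_beta(m: float) -> float:
    """Invert the conversion: beta = 2^M / (2^M + 1)."""
    return 2.0 ** m / (2.0 ** m + 1.0)


print(beta_to_m(0.5))  # 0.0: methylated and unmethylated signals are equal
print(beta_to_m(0.8))  # approximately 2: mostly methylated
print(beta_to_m(0.2))  # approximately -2: mostly unmethylated
```

Because β = 0.8 and β = 0.2 map to M-values of equal magnitude and opposite sign, the M scale is symmetric around the half-methylated point.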
Because we are performing differential methylation analysis, we need to convert our data from β-values to M-values.
The original data (β-values) will be overwritten.