Principal component analysis (PCA) can be used to visualize clusters in the methylation data, and it also serves as a quality control procedure: outliers within a group could suggest poor data quality, batch effects, mislabeled samples, or uninformative groupings.

Each dot on the plot is a single sample and represents the average methylation status across all CpG loci. Two of the LCL samples do not cluster with the others, but we will not exclude them for this tutorial.
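
As a rough illustration of the computation behind such a plot, the sketch below runs PCA through a singular value decomposition in NumPy. Everything in it is an assumption made for illustration: the beta values are simulated rather than taken from the tutorial data set, `pca_scores` is a hypothetical helper name, and the two perturbed samples are chosen arbitrarily. It is not necessarily how the plot in this tutorial was produced.

```python
import numpy as np

def pca_scores(beta, n_components=2):
    """Project samples onto the leading principal components.

    beta: array of shape (n_samples, n_cpgs) holding beta values in [0, 1].
    Returns (scores, explained_variance_ratio) for the first n_components.
    """
    beta = np.asarray(beta, dtype=float)
    centered = beta - beta.mean(axis=0)          # center each CpG locus
    # SVD of the centered matrix; the rows of vt are the principal axes
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]

# Simulated data (NOT the tutorial data set): 8 samples x 500 CpG loci
rng = np.random.default_rng(0)
base = rng.uniform(0.05, 0.95, size=500)
beta = np.clip(base + rng.normal(0, 0.02, size=(8, 500)), 0, 1)
# Perturb samples 6 and 7 at the first 100 loci to mimic outlying samples
beta[6:, :100] = np.clip(beta[6:, :100] + 0.3, 0, 1)

scores, explained = pca_scores(beta)

# Rank samples by their distance from the median position along PC1
pc1 = scores[:, 0]
dist = np.abs(pc1 - np.median(pc1))
ranked = np.argsort(dist)[::-1]

print("explained variance ratio:", np.round(explained, 3))
print("samples ranked by distance from the PC1 median:", ranked.tolist())
```

Plotting the first column of `scores` against the second gives one dot per sample. In this simulation the two perturbed samples sit apart from the rest along the first component, which is the kind of pattern that would prompt a closer look at sample quality or labeling.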

Next, the distribution of beta values across the samples can be inspected with a box-and-whisker plot.

Each box-and-whisker represents one sample, and the y-axis shows the range of beta values. The samples in this data set appear reasonably uniform (Figure 2).
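
To make concrete what each box-and-whisker encodes, the sketch below computes the quartiles and whisker ends for a few samples using only the Python standard library. The samples are simulated (a two-component mixture standing in for beta values), the names `box_stats` and `simulate_sample` are hypothetical, and the 1.5 × IQR whisker rule is an assumed convention that may differ from the one used for Figure 2.

```python
import random
import statistics

def box_stats(values):
    """Return the numbers a box-and-whisker plot draws for one sample."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed whisker rule: most extreme observations within 1.5 * IQR of the box
    low = min(v for v in values if v >= q1 - 1.5 * iqr)
    high = max(v for v in values if v <= q3 + 1.5 * iqr)
    return {"whisker_low": low, "q1": q1, "median": median,
            "q3": q3, "whisker_high": high}

def simulate_sample(rng, n=2000, shift=0.0):
    """Simulated beta values: a mix of low and high methylation loci."""
    values = []
    for _ in range(n):
        if rng.random() < 0.4:
            v = rng.betavariate(2, 12)   # mostly unmethylated loci
        else:
            v = rng.betavariate(12, 2)   # mostly methylated loci
        values.append(min(1.0, max(0.0, v + shift)))  # keep within [0, 1]
    return values

rng = random.Random(0)
samples = {
    "sample_A": simulate_sample(rng),
    "sample_B": simulate_sample(rng),
    "sample_C": simulate_sample(rng, shift=0.15),  # deliberately shifted sample
}

summary = {name: box_stats(vals) for name, vals in samples.items()}
for name, stats in summary.items():
    print(name, {key: round(val, 3) for key, val in stats.items()})
```

A sample whose box sits noticeably higher or lower than the others, as the deliberately shifted `sample_C` does here, is the kind of non-uniformity this plot is meant to expose.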

An alternative way to examine the distribution of beta values is a histogram.

Again, no sample in the tutorial data set stands out (Figure 3).
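
The same comparison can be sketched numerically: bin each sample's beta values into equal-width bins over [0, 1] and see how far each sample's profile lies from the average profile. The code below is a minimal standard-library sketch on simulated data; `beta_histogram`, `simulate_sample`, the bin count of 20, and the deviation measure are all assumptions for illustration, not part of the tutorial workflow.

```python
import random

def beta_histogram(values, n_bins=20):
    """Proportion of beta values in each equal-width bin over [0, 1]."""
    counts = [0] * n_bins
    for v in values:
        # Clamp the index so that a value of exactly 1.0 lands in the last bin
        idx = min(int(v * n_bins), n_bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]

def simulate_sample(rng, n=2000, shift=0.0):
    """Simulated beta values: a mix of low and high methylation loci."""
    values = []
    for _ in range(n):
        if rng.random() < 0.4:
            v = rng.betavariate(2, 12)
        else:
            v = rng.betavariate(12, 2)
        values.append(min(1.0, max(0.0, v + shift)))  # keep within [0, 1]
    return values

rng = random.Random(0)
samples = {
    "sample_A": simulate_sample(rng),
    "sample_B": simulate_sample(rng),
    "sample_C": simulate_sample(rng),
    "sample_D": simulate_sample(rng, shift=0.15),  # deliberately shifted sample
}

n_bins = 20
profiles = {name: beta_histogram(vals, n_bins) for name, vals in samples.items()}

# Average profile across samples, then each sample's deviation from it
# (half the sum of absolute differences, so the value lies between 0 and 1)
mean_profile = [sum(p[i] for p in profiles.values()) / len(profiles)
                for i in range(n_bins)]
deviation = {
    name: 0.5 * sum(abs(a - b) for a, b in zip(p, mean_profile))
    for name, p in profiles.items()
}

for name, dev in deviation.items():
    print(name, round(dev, 3))
```

Samples with similar distributions give small, comparable deviations, while a sample that stands out, such as the shifted `sample_D` in this simulation, gives a clearly larger one.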