By including Batch in the ANOVA model, the variability due to the batch effect is accounted for when calculating p-values for the non-random factors. In this sense, the batch effect has already been removed. However, visualizing biological effects can be very difficult if batch effects are present in the original intensity data used to generate visualizations. We can modify the original intensity data to remove the batch effect using the Remove Batch Effect tool. 

Using the Remove Batch Effect tool

The Remove Batch Effect tool functions much like ANOVA in reverse: it calculates the variation attributed to the factor being removed, then adjusts the original intensity values to remove that variation. Once the variation caused by the batch effect has been removed, tools like PCA or clustering can be used to visualize what the data would look like if the batch effect were not present.

The Remove Batch Effects dialog will open. The tool functions by performing an ANOVA, then modifying the original intensity values to remove the effects of the specified factor(s).
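The idea of fitting a model and then subtracting the batch term can be sketched outside the tool. The following is a minimal illustration, not the tool's actual implementation: it assumes an additive linear model with Treatment and Batch as factors, fits it per gene by least squares with NumPy, and subtracts only the estimated batch term. The function names, the toy data, and the sum-to-zero coding are all assumptions made for this example.

```python
import numpy as np


def effect_code(labels, levels):
    """Sum-to-zero coding: one column per level except the last, which is coded -1."""
    cols = np.zeros((len(labels), len(levels) - 1))
    for row, label in enumerate(labels):
        idx = levels.index(label)
        if idx == len(levels) - 1:
            cols[row, :] = -1.0
        else:
            cols[row, idx] = 1.0
    return cols


def remove_batch_effect(intensities, batch, treatment):
    """Return a samples x genes matrix with the fitted batch term subtracted.

    Hypothetical helper: fits intensity ~ intercept + treatment + batch
    for every gene (column) and removes only the batch contribution.
    """
    y = np.asarray(intensities, dtype=float)
    batch_cols = effect_code(batch, sorted(set(batch)))
    treat_cols = effect_code(treatment, sorted(set(treatment)))
    design = np.hstack([np.ones((y.shape[0], 1)), treat_cols, batch_cols])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    # Rows of coef: intercept, treatment columns, then batch columns.
    batch_coef = coef[1 + treat_cols.shape[1]:]
    return y - batch_cols @ batch_coef


# Toy data (made up for illustration): 8 samples x 2 genes, no noise.
# Gene 1: treatment adds 2, batch B adds 3. Gene 2: batch B subtracts 1.
batch = ["A", "A", "B", "B", "A", "A", "B", "B"]
treatment = ["ctl", "ctl", "ctl", "ctl", "trt", "trt", "trt", "trt"]
raw = np.array([
    [5.0, 2.0], [5.0, 2.0], [8.0, 1.0], [8.0, 1.0],
    [7.0, 2.0], [7.0, 2.0], [10.0, 1.0], [10.0, 1.0],
])
adjusted = remove_batch_effect(raw, batch, treatment)
```

In this toy case the adjusted values for gene 1 are 6.5 for every control sample and 8.5 for every treated sample regardless of batch, so the treatment difference of 2 is preserved while the batch shift disappears.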

By default, the results will be displayed in a new spreadsheet. Options to overwrite the current spreadsheet and to specify the output file appear at the bottom of the dialog (Figure 2).

The new spreadsheet, 1-removeresult (batch-remove), will open in the Analysis tab (Figure 3).

Batch effects in PCA

We can use PCA to visualize the effect of removing the batch effect.

For the original data, the centroids of the two batches are distinct, showing the batch effect (Figure 5).

For 1-removeresult (batch-remove), the centroids of the two batches overlap, showing that the batch effect has been removed (Figure 6). 
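The centroid comparison can also be expressed numerically. The sketch below is an illustration under stated assumptions, not the tool's PCA: it computes principal component scores by mean-centering each gene and taking an SVD with NumPy, then measures the distance between the two batch centroids in the first two components. The helper names and the simulated data are hypothetical, and the distance function assumes exactly two batches.

```python
import numpy as np


def pca_scores(data, n_components=2):
    """Project samples (rows) onto the leading principal components."""
    x = np.asarray(data, dtype=float)
    centered = x - x.mean(axis=0)  # center each gene (column)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def batch_centroid_distance(scores, batch):
    """Euclidean distance between the two batch centroids in PCA space."""
    labels = np.asarray(batch)
    first, second = sorted(set(batch))  # assumes exactly two batches
    centroid_a = scores[labels == first].mean(axis=0)
    centroid_b = scores[labels == second].mean(axis=0)
    return float(np.linalg.norm(centroid_a - centroid_b))


# Simulated data (made up): 8 samples x 3 genes with a treatment effect,
# small noise, and a shift of 3 added to every gene in batch B.
rng = np.random.default_rng(0)
batch = ["A", "A", "B", "B", "A", "A", "B", "B"]
is_trt = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)[:, None]
is_b = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)[:, None]
noise = rng.normal(scale=0.1, size=(8, 3))
without_batch = 6.0 + is_trt * np.array([2.0, 0.0, 1.0]) + noise
with_batch = without_batch + is_b * 3.0

dist_before = batch_centroid_distance(pca_scores(with_batch), batch)
dist_after = batch_centroid_distance(pca_scores(without_batch), batch)
```

A large `dist_before` and a `dist_after` near zero correspond to the distinct and overlapping centroids seen in the two PCA plots.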

Batch effects in ANOVA results visualizations

Visualization of ANOVA results for single probe(sets)/genes also benefits from batch removal. To illustrate this, we first need to repeat our ANOVA using the new batch-remove intensity values spreadsheet.

The ANOVAResults_batch-remove spreadsheet will open in the Analysis tab. 

A dot plot for trefoil factor 1 (TFF1) will open (Figure 9). The dot plot shows gene intensity values (y-axis) for each sample. Samples are grouped by Treatment.

To visualize the batch effect we will make a few changes to the plot. 

The dot plot now clearly shows the batch effect (Figure 12). Within each treatment group, the samples separate clearly by batch, with the two batches shown in blue and red.

To see the effect of batch removal, we can open the same dot plot from the ANOVAResults_batch-remove spreadsheet.

The dot plot invoked from the ANOVAResults_batch-remove spreadsheet shows that the batch effect has been removed: samples no longer separate clearly by color within treatment groups (Figure 13).
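What the two dot plots show for a single gene can be summarized as the gap between batch means within each treatment group. The snippet below is a small sketch of that summary using only the Python standard library. The function name and the intensity values are invented for illustration and are not taken from the tutorial data set.

```python
from statistics import mean


def batch_gap_by_treatment(values, treatment, batch):
    """For each treatment group, return the spread between batch means."""
    gaps = {}
    for trt in sorted(set(treatment)):
        by_batch = {}
        for value, t, b in zip(values, treatment, batch):
            if t == trt:
                by_batch.setdefault(b, []).append(value)
        batch_means = [mean(vals) for vals in by_batch.values()]
        gaps[trt] = max(batch_means) - min(batch_means)
    return gaps


# Hypothetical intensities for one gene, before and after batch removal.
treatment = ["ctl", "ctl", "ctl", "ctl", "trt", "trt", "trt", "trt"]
batch = ["A", "A", "B", "B", "A", "A", "B", "B"]
before = [5.0, 5.2, 8.1, 7.9, 7.0, 7.2, 10.1, 9.9]
after = [6.45, 6.65, 6.65, 6.45, 8.45, 8.65, 8.65, 8.45]

gaps_before = batch_gap_by_treatment(before, treatment, batch)
gaps_after = batch_gap_by_treatment(after, treatment, batch)
```

A gap close to zero in every treatment group matches a dot plot in which the colors are intermixed, while a large gap matches the clear separation by color seen before batch removal.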