Because samples differ in their total number of reads, comparing raw read counts for a gene across samples without normalization would give misleading differential expression results.

The Read count normalization menu will open (Figure 2).

Normalization can be performed by sample or by feature. By sample is selected by default; this is appropriate for the tutorial data set. 

Available normalization methods are listed in the left-hand panel. For more information about these options, please see the Normalize Counts user guide.

For this tutorial, we will use the recommended default normalization settings.

This adds Total count and Add 0.0001 to the Normalization order panel (Figure 3). Normalization steps are performed in the order listed, from top to bottom.

Total count normalizes the read count of each gene by the total read count of its sample. This accounts for differences in total read counts between samples.

Add 0.0001 adds 0.0001 to the normalized read count of every gene, so that the data contain no values of 0. Because the logarithm of 0 is undefined, values of 0 would prevent the gene-specific analysis algorithm we will use for differential expression analysis from performing the necessary log transformation.
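
The two steps described above can be sketched in Python to show what they do to a small table of counts. This is only an illustration, not the software's own implementation: the sample names, gene names, counts, and function names are all hypothetical, and it assumes that Total count is a plain division by the sample total with no further scaling factor.

```python
import math

# Hypothetical raw read counts per sample; the numbers are made up for illustration.
raw_counts = {
    "sample_A": {"gene1": 100, "gene2": 300, "gene3": 0},
    "sample_B": {"gene1": 200, "gene2": 600, "gene3": 0},
}

OFFSET = 0.0001  # constant added by the "Add 0.0001" step


def total_count(sample):
    """Divide each gene's count by the total read count of the sample.

    Assumption: plain division, with no additional scaling factor.
    """
    total = sum(sample.values())
    return {gene: count / total for gene, count in sample.items()}


def add_offset(sample, offset=OFFSET):
    """Add a small constant to every value so that no value is exactly 0."""
    return {gene: value + offset for gene, value in sample.items()}


def normalize(counts):
    """Apply the steps in the listed order: Total count first, then Add 0.0001."""
    return {name: add_offset(total_count(sample)) for name, sample in counts.items()}


normalized = normalize(raw_counts)

# After normalization every value is positive, so the log transformation is defined.
log_values = {
    name: {gene: math.log2(value) for gene, value in sample.items()}
    for name, sample in normalized.items()
}
```

In this toy data, sample_B has twice as many reads as sample_A but the same relative expression, so the two samples have matching normalized values. The gene with 0 reads ends up at 0.0001 rather than 0, which keeps its logarithm finite.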

A Normalize counts task node and a Normalized counts data node are added to the pipeline (Figure 4).