At this point in analysis, you would explore the data preliminarily. Do the genes you expected to be differentially regulated appear to have larger or smaller intensity values? Do similar samples resemble each other?
The latter question can be explored using Principal Components Analysis (PCA), an excellent method for reducing and visualizing high-dimensional data.
- Select Plot PCA Scatter Plot from the QA/QC section of the Gene Expression workflow. A Scatter Plot tab containing your PCA plot will open (Figure 1)
In the scatter plot, each point represents a chip (sample) and corresponds to a row on the top-level spreadsheet. The color of the dot represents the type of the sample; red represents a normal sample and blue represents a Down syndrome sample. Points that are close together in the plot have similar intensity values across the probe sets on the whole chip (genome), and points that are far apart in the plot are dissimilar.
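To make the idea of "close points are similar samples" concrete, here is a minimal sketch of PCA in Python with NumPy. It is not Partek's implementation; the function name `pca_scores` and the synthetic intensity matrix are assumptions made for illustration only. Rows are samples and columns are probe sets, mirroring the layout of the top-level spreadsheet.

```python
import numpy as np

def pca_scores(intensities, n_components=3):
    """Project samples (rows) onto the first principal components.

    Hypothetical helper for illustration; uses SVD of the column-centered matrix.
    """
    x = np.asarray(intensities, dtype=float)
    centered = x - x.mean(axis=0)            # center each probe set (column)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]   # sample coordinates on the PCs
    explained = (s ** 2) / np.sum(s ** 2)              # fraction of variance per PC
    return scores, explained[:n_components]

# Synthetic data (an assumption): 6 samples x 50 probe sets in two groups.
rng = np.random.default_rng(0)
group_a = rng.normal(8.0, 0.2, size=(3, 50))
group_b = rng.normal(8.0, 0.2, size=(3, 50))
group_b[:, :10] += 2.0                       # group B differs in 10 probe sets
data = np.vstack([group_a, group_b])

scores, explained = pca_scores(data)
print(scores[:, 0])                          # PC1 coordinate of each sample
print(explained)                             # variance explained by PC1..PC3
```

In this toy data the two groups land on opposite sides of the first principal component, which is the kind of separation you look for when rotating the plot.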
- Left-clicking on any point in the scatter plot selects that point. A dash with an identifying row number will appear on the selected PCA plot point. The spreadsheet in the Analysis tab will also jump to the corresponding row
- To rotate the plot, hold the mouse wheel down and drag the mouse. Alternatively, select the Rotate Mode icon on the left side of the Scatter Plot tab, then press the left mouse button and drag. Rotating the plot allows you to examine the grouping pattern or outliers of the data on the first 3 principal components (PCs)
- To zoom in and out, scroll the mouse wheel up or down while the cursor is on the PCA plot, or select the Zoom Mode icon on the left side of the Scatter Plot tab
- Selecting the Reset icon on the left side of the Scatter Plot tab will return the PCA plot to its original orientation and zoom
As you can see from rotating the plot, there is no clear separation between Down syndrome and normal samples in this data since the red and blue samples are not separated in space. However, there are other factors that may separate the data.
- In the Scatter Plot tab, select the Rendering Properties icon and configure the plot as shown (Figure 2)
- Color the points by column 4. Tissue and Size the points by column 3. Type
- Select OK
- Open the Plot Rendering Properties dialog and select the Ellipsoids tab
- Select Add Ellipse/Ellipsoid
- Select Ellipse in the Add Ellipse/Ellipsoid... dialog
- Double click on Tissue in the Categorical Variable(s) panel to move it to the Grouping Variable(s) panel (Figure 4)
- Select OK to close the Add Ellipse/Ellipsoid... dialog and select OK again to exit the Plot Rendering Properties dialog
The next step is to draw a histogram to examine the samples. Select Plot Sample Histogram in the QA/QC section of the Gene Expression workflow to generate the Histogram tab (Figure 6).
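As a rough illustration of what a sample histogram summarizes, the sketch below bins the intensity values of each sample on shared bin edges so the distributions can be compared directly. This is not how Partek draws its plot; the helper name `sample_histograms`, the bin count, and the synthetic data are all assumptions.

```python
import numpy as np

def sample_histograms(intensities, bins=20):
    """Count intensity values per bin for each sample (row), using shared bin edges.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(intensities, dtype=float)
    edges = np.linspace(x.min(), x.max(), bins + 1)   # same edges for every sample
    counts = np.array([np.histogram(row, bins=edges)[0] for row in x])
    return edges, counts

# Synthetic data (an assumption): 5 samples x 1000 probe sets,
# with the last sample shifted toward higher intensities.
rng = np.random.default_rng(1)
data = rng.normal(8.0, 1.0, size=(5, 1000))
data[4] += 1.5

edges, counts = sample_histograms(data)
for i, row in enumerate(counts):
    print(f"sample {i}: {row.tolist()}")
```

A sample whose distribution is shifted or shaped differently from the rest, like the last row here, would stand out when the histograms are overlaid.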
The decision to discard any samples would be based on information from the PCA plot, sample histogram plot, and QC metrics. To discard a sample and renormalize the data (without the effects of the outlier), start over with importing samples and omit the outlier sample(s) during the .CEL file import.
Additional Assistance
If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.