Obtain and add files to the project
This project combines the output files for two samples, Human Colon Cancer (Replicate 1) and Human Colon Cancer (Replicate 2).
- Obtain the filtered count matrix files (.h5, HDF5 format) and spatial outputs (Figure 1) for each sample.
- In the import options, select 10x Genomics Visium Space Ranger output as the input file format.
Choose to import 10x Genomics Visium Space Ranger output for your project (Figure 2).
- Click Transfer files on the homepage, under settings, or during import (Figure 3).
Proceed to transfer files as shown below using the 10x Genomics Visium Space Ranger outputs importer (Figure 3).
- Navigate to the appropriate files for each sample (Figure 4). The 10x Genomics Space Ranger count matrix can be supplied either as one filtered .h5 file per sample or as three sparse matrix files per sample (two .csv files with one .mtx file, or two .tsv files with one .mtx file). The spatial output files should be in compressed format (.zip). Uploading the high-resolution image is optional.
Count matrix files and spatial outputs should be included for each sample (Figure 4). Once added, the Cells and Features values will update.
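Before uploading, it can help to confirm that each sample's files match one of the accepted combinations described above. The following is a minimal Python sketch of that check, not part of Partek Flow. The function names and example file names are hypothetical, and the check looks only at file extensions.

```python
from collections import Counter
from pathlib import PurePath


def classify_count_input(filenames):
    """Return 'h5', 'sparse', or None for one sample's count matrix files.

    'h5'     -> exactly one .h5 file
    'sparse' -> two .csv plus one .mtx, or two .tsv plus one .mtx
    None     -> any other combination
    """
    suffixes = Counter(PurePath(name).suffix.lower() for name in filenames)
    if suffixes == Counter({".h5": 1}):
        return "h5"
    sparse_layouts = (
        Counter({".csv": 2, ".mtx": 1}),
        Counter({".tsv": 2, ".mtx": 1}),
    )
    if suffixes in sparse_layouts:
        return "sparse"
    return None


def check_sample(count_files, spatial_file):
    """Return (count input kind, whether the spatial output is a .zip)."""
    kind = classify_count_input(count_files)
    spatial_ok = PurePath(spatial_file).suffix.lower() == ".zip"
    return kind, spatial_ok


if __name__ == "__main__":
    # Hypothetical file names for one sample.
    print(check_sample(["filtered_feature_bc_matrix.h5"], "spatial.zip"))
```

A result of `None` for the count input, or `False` for the spatial output, suggests the files should be re-exported or re-compressed before import.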
Once the import completes, the sample table will appear in the Metadata tab, with one row per sample (Figure 5).
The sample table is pre-populated with the sample attribute # Cells. Sample attributes can be added and edited manually by clicking Manage in the Sample attributes menu on the left. If a new attribute is added, click Assign values to assign samples to different groups. Alternatively, you can use the Assign values from a file option to assign sample attributes using a tab-delimited text file. For more information about sample attributes, see here.
For this tutorial, we do not need to edit or change any sample attributes.
Visualize the annotated image
With samples imported and annotated, we can begin analysis.
- Click Analyses to switch to the Analyses tab
For now, the Analyses tab has only a single, circular node, Single cell counts. As you perform the analysis, additional nodes representing tasks and new data will be created, forming a visual representation of your analysis pipeline. A Spatial report task result node (rectangle) is also automatically generated for this type of data.
- Click the Spatial report node
- Click Task report on the task menu (Figure 7)
The spatial report will display the first sample (Replicate 1). We want to visualize all of the samples.
- Duplicate the plot by clicking the Duplicate plot button in the upper right controls (Figure 8, arrow 1)
- Open the Axes configuration option (Figure 8, arrow 2)
- Change the Sample on the duplicated image under Misc (Figure 8, arrow 3)
To show more of the underlying histology, adjust the points on the image using the Style configuration option (Figure 9).
- Click Style in the left panel
- Move the Opacity slider to the left
- Change the Point size to 3
To save the Data Viewer session, click Save in the left panel and give the session an appropriate name.
Performing Analysis tasks
- Click on the title of the project (Colon Cancer) to go back to the Analyses tab (Figure 10)
- Click on the Single cell counts node
An important step in analyzing single cell RNA-Seq data is to filter out low-quality cells. A few examples of low-quality cells are doublets, cells damaged during cell isolation, or cells with too few counts to be analyzed. Click here for more information on Single cell QA/QC. We will not perform Single cell QA/QC in this tutorial.
- Click the Filtering drop-down in the toolbox
- Click the Filter Features task
- Choose Noise reduction
- Exclude features where value <= 0.0 in at least 99.0% of the cells (Figure 11)
- Click Finish
Cells can be selected by setting thresholds using the Select & Filter tool. Here, we will select cells based on the total count.
- Open Select & Filter under Tools on the left
- Under Criteria, click Pin histogram to see the distribution of counts
- Set the Counts thresholds to 8000 and 20500
Selected cells will be in blue and deselected cells will be dimmed (Figure 11).
Because this data set was already filtered by the study authors to include only high-quality cells, this count filter is sufficient.
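The total count selection can be pictured as a simple range test on each cell's summed counts. The Python sketch below illustrates the idea outside of Partek Flow. The function name and toy data are hypothetical, and treating both thresholds as inclusive is an assumption.

```python
def select_by_total_count(counts_by_cell, low=8000, high=20500):
    """Return IDs of cells whose total count lies within [low, high].

    counts_by_cell maps a cell ID to a list of per-gene counts.
    Inclusive bounds are an assumption made for this illustration.
    """
    selected = []
    for cell_id, gene_counts in counts_by_cell.items():
        total = sum(gene_counts)
        if low <= total <= high:
            selected.append(cell_id)
    return selected


if __name__ == "__main__":
    # Toy data: totals are 7000, 10000, and 24000.
    toy = {
        "cell_1": [5000, 2000],
        "cell_2": [6000, 4000],
        "cell_3": [15000, 9000],
    }
    print(select_by_total_count(toy))
```

With the thresholds from this tutorial, only the cell with a total of 10000 falls inside the selected range in the toy data.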
- Click under Filter to include the selected cells
- Click Apply observation filter
- Click the Single cell counts data node in the pipeline preview (Figure 12)
- Click Select
- Click on the Colon Cancer project name at the top to go back to the Analyses tab
- Your browser may warn you that any unsaved changes to the data viewer session will be lost. Ignore this message and proceed to the Analyses tab
Most tasks can be queued up on data nodes that have not yet been generated, so you can wait for the filtering step to complete, or proceed to the next section.
Filtering genes in single cell RNA-Seq data
A common task in bulk and single-cell RNA-Seq analysis is to filter the data to include only informative genes. Because there is no gold standard for what makes a gene informative, the ideal gene filtering criteria depend on your experimental design and research question. For this reason, Partek Flow offers a wide variety of flexible filtering options.
- Click the Filter counts node produced by the Filter counts task
- Click Filtering in the task menu
- Click Filter features (Figure 14)
There are four categories of filter available: noise reduction, statistics based, feature metadata, and feature list (Figure 15).
We will use a noise reduction filter to exclude genes that are not expressed by any cell in the data set but were included in the matrix file.
- Click the Noise reduction filter checkbox
- Set the Noise reduction filter to Exclude features where value <= 0 in 99% of cells using the drop-down menus and text boxes (Figure 16)
- Click Finish to apply the filter
This produces a Filtered counts data node. This will be the starting point for the next stage of analysis: identifying cell types in the data using the interactive t-SNE plot.
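The logic of the noise reduction filter can be illustrated with a short Python sketch. This is not Partek Flow's implementation. The function name and toy matrix are hypothetical, and the sketch assumes the "at least 99% of cells" reading of the rule.

```python
def noise_reduction_filter(matrix, feature_names, threshold=0.0, max_fraction=0.99):
    """Return the names of features that pass the filter.

    matrix: list of rows, one row per cell, one column per feature.
    A feature is excluded when its value is <= threshold in at least
    max_fraction of the cells.
    """
    n_cells = len(matrix)
    kept = []
    for j, name in enumerate(feature_names):
        n_low = sum(1 for row in matrix if row[j] <= threshold)
        if n_low / n_cells < max_fraction:
            kept.append(name)
    return kept


if __name__ == "__main__":
    # Toy matrix: 100 cells, 3 genes.
    matrix = [[0, 0, 0] for _ in range(100)]
    matrix[0][0] = 3  # gene_A expressed in 2 cells (98% of cells at zero)
    matrix[1][0] = 1
    matrix[0][1] = 2  # gene_B expressed in 1 cell (99% of cells at zero)
    # gene_C is not expressed in any cell
    print(noise_reduction_filter(matrix, ["gene_A", "gene_B", "gene_C"]))
```

In the toy data, gene_A is kept, while gene_B and gene_C are excluded because their value is zero in at least 99% of cells.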
Normalizing single cell RNA-Seq data
We are omitting normalization in this tutorial because the data has already been normalized.
The tutorial data set is taken from a published study and has already been normalized using TPM (Transcripts per million), which normalizes for feature length and total reads, and transformed as log2(TPM/10+1). This normalization and transformation scheme can be performed in Partek Flow, along with other commonly used RNA-Seq data normalization methods.
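For readers unfamiliar with this scheme, the Python sketch below works through the standard TPM definition followed by the log2(TPM/10+1) transformation for a single sample. It is an illustration only. The function names and toy values are hypothetical, and the study authors' exact procedure may differ.

```python
import math


def tpm(counts, lengths_kb):
    """Convert raw counts to TPM for one sample.

    counts: read counts per feature.
    lengths_kb: feature lengths in kilobases, in the same order.
    """
    # Length-normalize first, then scale so the values sum to one million.
    rates = [count / length for count, length in zip(counts, lengths_kb)]
    total = sum(rates)
    return [rate / total * 1e6 for rate in rates]


def log_transform(tpm_values):
    """Apply the log2(TPM/10 + 1) transformation described above."""
    return [math.log2(value / 10 + 1) for value in tpm_values]


if __name__ == "__main__":
    # Toy sample: counts proportional to length give equal TPM values.
    values = tpm([10, 20, 30], [1.0, 2.0, 3.0])
    print(values)
    print(log_transform(values))
```

Dividing by 10 before adding 1 keeps a TPM of zero at zero after transformation while compressing the range of highly expressed features.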
For more information on normalizing data in Partek Flow, please see the Normalization section of the user manual.
Additional Assistance
If you need additional assistance, please visit our support page to submit a help ticket or find phone numbers for regional support.