The first executable is BuildRankings.R
This script takes a count matrix in tab-delimited text format as input and generates a gene-count histogram image and a cell rankings object.
#!/usr/bin/env Rscript
library("optparse")
library("AUCell")

option_list <- list(
  make_option(c("-i", "--input"), type="character", default=NULL,
              help="input count matrix file", metavar="character"),
  make_option(c("-o", "--output"), type="character", default=NULL,
              help="output cells rankings", metavar="character"),
  make_option(c("--histogram"), type="character", default=NULL,
              help="output histogram", metavar="character")
)
opt_parser <- OptionParser(option_list=option_list)
opt <- parse_args(opt_parser)

# load the input matrix
countMatrix <- read.delim(opt$input, row.names=1, check.names=FALSE)
countMatrix[is.na(countMatrix)] <- 0  # replace NaN with 0

# calculate rankings and save the Rdata
cellsRankings <- AUCell_buildRankings(as.matrix(countMatrix), nCores=1, plotStats=FALSE)
save(cellsRankings, file=opt$output)

# plot a gene-count histogram
png(opt$histogram)
plotGeneCount(as.matrix(countMatrix))
dev.off()

# retain cell-level information so downstream tasks can import a cell-level matrix
old_ids <- file.path(dirname(file.path(opt$input)), "cellIds.txt")
new_ids <- file.path(dirname(file.path(opt$output)), "cellIds.txt")
file.copy(old_ids, new_ids)
old_annotation <- file.path(dirname(file.path(opt$input)), "cellAnnotations.txt")
new_annotation <- file.path(dirname(file.path(opt$output)), "cellAnnotations.txt")
file.copy(old_annotation, new_annotation)
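To make the notion of a cell ranking concrete, the following is a rough base-R illustration of what ranking genes within each cell means. It is a sketch only and not AUCell's implementation; the toy matrix, its gene and cell names, and the random tie-breaking are assumptions made for the example.

```r
# Toy count matrix: genes on rows, cells on columns (values are made up).
counts <- matrix(
  c(5, NA, 0,
    1,  7, 2,
    9,  3, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("cell1", "cell2", "cell3"))
)

# Same cleanup as the script: treat missing values as zero counts.
counts[is.na(counts)] <- 0

# Rank genes within each cell, highest count first (rank 1 = most expressed).
# Ties are broken randomly here; this is an illustration, not AUCell's code.
set.seed(1)
rankings <- apply(-counts, 2, rank, ties.method = "random")
print(rankings)
```

Each column of the result is a permutation of 1 to the number of genes, which is the kind of per-cell ordering the saved rankings object represents.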
Save the script in ~/.partekflow/user_tasks
To add this task go to Settings > Task management
Click Add task
Click Create new task
Specify BuildRankings.R and configure how the task should appear in the task menu.
This task operates on Single cell counts
Since the R data object is not a data type that Flow recognizes, we will specify the Custom type.
Images generated by the script can be shown in Flow’s task report.
Each task can generate multiple outputs. We could write one task which outputs all stages of the analysis and then separate tasks to re-run the intermediate steps.
Once the task is created it will appear in the context-sensitive menu.
This task requires no extra parameters and is queued as soon as it is selected.
The task report will show the histogram:
The second script will calculate the AUC.
This script takes in the Rdata file generated by the previous task, as well as gene sets in the form of a GMT file.
The user can also configure the aucMaxRank parameter.
The output is another Rdata file as well as a single-cell matrix in a format that Flow recognizes
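As a point of reference when choosing aucMaxRank: the value is an absolute rank, meaning the number of top-ranked genes per cell used for the AUC, and AUCell's own default is 5% of the genes in the rankings. The sketch below shows one way to turn a percentage into a rank; the helper name `rank_from_percent` and the gene count are hypothetical and not part of the script.

```r
# Hypothetical helper: convert a percentage of genes into an absolute rank.
rank_from_percent <- function(n_genes, percent) {
  # Integer arithmetic first, then divide, to avoid floating-point surprises.
  as.integer(ceiling(n_genes * percent / 100))
}

n_genes <- 20000                     # assumed number of genes in the rankings
max_rank <- rank_from_percent(n_genes, 5)
print(max_rank)
```

The resulting number could then be entered as the aucMaxRank option when the task is configured.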
CalcAUC.R
#!/usr/bin/env Rscript
library("optparse")
library("AUCell")
library("GSEABase")

option_list <- list(
  make_option(c("-i", "--input"), type="character", default=NULL,
              help="input rankings", metavar="character"),
  make_option(c("-m", "--aucMaxRank"), type="integer", default=5,
              help="AUC max rank", metavar="integer"),
  make_option(c("-g", "--gmt"), type="character", default=NULL,
              help="input GMT file", metavar="character"),
  make_option(c("-o", "--output"), type="character", default=NULL,
              help="output Rdata", metavar="character"),
  make_option(c("-a", "--auc"), type="character", default=NULL,
              help="output AUC matrix", metavar="character")
)
opt_parser <- OptionParser(option_list=option_list)
opt <- parse_args(opt_parser)

# load the input Rdata
load(opt$input)

# load the gene sets and calculate the AUC
geneSets <- getGmt(opt$gmt)
cells_AUC <- AUCell_calcAUC(geneSets, cellsRankings, aucMaxRank=opt$aucMaxRank, nCores=8)
save(cells_AUC, file=opt$output)

# write the static configuration file to tell Flow how to import the AUC matrix:
file.copy("~/AUC.json", opt$auc)

# write the AUC values:
matrixFile <- file.path(dirname(file.path(opt$output)), "AUC.matrix")
write.table(data.frame(getAUC(cells_AUC), check.names=FALSE), file=matrixFile,
            sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)

# write the gene-set IDs
featureIdFile <- file.path(dirname(file.path(opt$output)), "featureIds.txt")
write(row.names(getAUC(cells_AUC)), featureIdFile)

# copy the cell-level information
old_ids <- file.path(dirname(file.path(opt$input)), "cellIds.txt")
new_ids <- file.path(dirname(file.path(opt$output)), "cellIds.txt")
file.copy(old_ids, new_ids)
old_annotation <- file.path(dirname(file.path(opt$input)), "cellAnnotations.txt")
new_annotation <- file.path(dirname(file.path(opt$output)), "cellAnnotations.txt")
file.copy(old_annotation, new_annotation)
The content of the AUC.json file is:
{
"matrixFile":"AUC.matrix",
"cellIdFile":"cellIds.txt",
"cellAnnotationFile":"cellAnnotations.txt",
"featureIdFile":"featureIds.txt",
"cellsOnRows":"false",
"headerRows":0,
"headerColumns":0,
"featureType":"Gene set"
}
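The settings in AUC.json describe a bare numeric matrix with features on rows and with the row and column labels kept in separate files. The following base-R sketch writes a toy matrix in that layout and checks that the label files agree with the matrix dimensions. The scores, the output directory, and the one-ID-per-line format of cellIds.txt are assumptions made for illustration.

```r
# Hypothetical AUC scores: 2 gene sets (rows) x 3 cells (columns).
auc <- matrix(c(0.10, 0.25, 0.05,
                0.40, 0.00, 0.30),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("setA", "setB"),
                              c("cell1", "cell2", "cell3")))

out_dir <- file.path(tempdir(), "auc_example")
dir.create(out_dir, showWarnings = FALSE)

# Bare numeric matrix: no header row or column (headerRows = 0,
# headerColumns = 0), features on rows (cellsOnRows = false).
write.table(auc, file = file.path(out_dir, "AUC.matrix"),
            sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)

# Row and column labels go in separate files
# (one ID per line is an assumption for cellIds.txt).
writeLines(rownames(auc), file.path(out_dir, "featureIds.txt"))
writeLines(colnames(auc), file.path(out_dir, "cellIds.txt"))

# Sanity check: the label files must match the matrix dimensions.
m <- as.matrix(read.delim(file.path(out_dir, "AUC.matrix"), header = FALSE))
n_features <- length(readLines(file.path(out_dir, "featureIds.txt")))
n_cells <- length(readLines(file.path(out_dir, "cellIds.txt")))
dims_ok <- nrow(m) == n_features && ncol(m) == n_cells
print(dims_ok)
```

A check like this can be useful while developing a task, since a mismatch between the matrix and its label files is easy to introduce and hard to spot by eye.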
Because this task will generate a matrix, Runs on needs to be set to All samples
The input data type for this task should be Custom Data
There should be two outputs: one of type Single cell counts and one of type Custom
Paths to library files can either be hardcoded in the script or supplied as library files managed by Flow.
This configuration enables Flow to generate an interface for configuring the task options
We can use Pre-analysis tools > Merge matrices to join the AUC Matrix with the feature counts
This enables us to, for example, compute t-SNE on the feature counts and color the plot by AUC values
We can write additional scripts and tasks to generate tables and images
ExploreThresholds.R
#!/usr/bin/env Rscript
library("optparse")
library("AUCell")

option_list <- list(
  make_option(c("-i", "--input"), type="character", default=NULL,
              help="input AUC Rdata file", metavar="character"),
  make_option(c("-o", "--output"), type="character", default=NULL,
              help="output cell assignments Rdata", metavar="character"),
  make_option(c("--histogram"), type="character", default=NULL,
              help="output histogram PDF", metavar="character")
)
opt_parser <- OptionParser(option_list=option_list)
opt <- parse_args(opt_parser)

# load the AUC object (cells_AUC) saved by CalcAUC.R
load(opt$input)

# plot the AUC histograms to a PDF and assign cells to gene sets
pdf(opt$histogram)
cells_assignment <- AUCell_exploreThresholds(cells_AUC, plotHist=TRUE, assign=TRUE)
dev.off()

# save the assignments
save(cells_assignment, file=opt$output)
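A further task could turn the saved assignments into a table. The sketch below uses a mock object in place of the real `cells_assignment`. Its structure, one list entry per gene set with an `assignment` vector of cell IDs, is assumed from the AUCell vignette, and the gene set names, cell IDs, and output file name are hypothetical.

```r
# Mock of the list the script saves: one entry per gene set, each with an
# `assignment` vector of cell IDs (structure assumed, values made up).
cells_assignment <- list(
  setA = list(assignment = c("cell1", "cell3")),
  setB = list(assignment = c("cell2"))
)

# Keep only the assigned cell IDs for each gene set.
cells_assigned <- lapply(cells_assignment, function(x) x$assignment)

# Flatten to a two-column table: one row per (cell, gene set) pair.
assignment_table <- stack(cells_assigned)
colnames(assignment_table) <- c("cell", "geneSet")

# Write a tab-delimited table (hypothetical file name).
out_file <- file.path(tempdir(), "cellAssignments.txt")
write.table(assignment_table, out_file, sep = "\t",
            quote = FALSE, row.names = FALSE)
print(assignment_table)
```

In a real task the mock list would be replaced by `load()` on the Rdata file produced by ExploreThresholds.R, and the output path would come from a command-line option as in the scripts above.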