Partek Flow Documentation




The first executable is BuildRankings.R.

This script takes a count matrix in tab-delimited text format as input and generates a gene-count histogram image and a cell rankings object.
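As a rough illustration of the expected input layout, the sketch below writes a tiny tab-delimited matrix and reads it back the same way the script does. It assumes genes in rows and cells in columns, which is the orientation AUCell_buildRankings expects; the gene and cell names are invented for the example.

```r
# Toy count matrix: genes in rows, cells in columns (all names are made up).
toy <- matrix(c(5, 0, 2,
                NA, 3, 1),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("GeneA", "GeneB"),
                              c("cell_1", "cell_2", "cell_3")))

# Write it as tab-delimited text with the gene IDs in the first column.
toy_file <- tempfile(fileext = ".txt")
write.table(data.frame(gene = rownames(toy), toy, check.names = FALSE),
            file = toy_file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back as BuildRankings.R does and zero-fill missing values.
countMatrix <- read.delim(toy_file, row.names = 1, check.names = FALSE)
countMatrix[is.na(countMatrix)] <- 0
countMatrix <- as.matrix(countMatrix)
```

The first column becomes the row names, so the matrix passed on to AUCell contains only numeric counts.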


#!/usr/bin/env Rscript

library("optparse")
library("AUCell")

option_list<- list(
 make_option(c("-i", "--input"), type="character", default=NULL, help="input count matrix file", metavar="character"),
 make_option(c("-o", "--output"), type="character", default=NULL, help="output cells rankings", metavar="character"),
 make_option(c("--histogram"), type="character", default=NULL, help="output histogram", metavar="character")
);

opt_parser <- OptionParser(option_list=option_list);
opt <- parse_args(opt_parser);

#load the input matrix
countMatrix <- read.delim(opt$input, row.names=1, check.names=FALSE)
countMatrix[is.na(countMatrix)] <- 0 # replace missing values (NA or NaN) with 0

#calculate rankings and save the Rdata
cellsRankings <- AUCell_buildRankings(as.matrix(countMatrix), nCores=1, plotStats=FALSE)
save(cellsRankings, file=opt$output)

#plot a gene-count histogram
png(opt$histogram)
plotGeneCount(as.matrix(countMatrix))
dev.off()

#retain cell-level information so downstream tasks can import a cell-level matrix
old_ids <- file.path(dirname(file.path(opt$input)), "cellIds.txt")
new_ids <- file.path(dirname(file.path(opt$output)), "cellIds.txt")
file.copy(old_ids, new_ids)

old_annotation <- file.path(dirname(file.path(opt$input)), "cellAnnotations.txt")
new_annotation <- file.path(dirname(file.path(opt$output)), "cellAnnotations.txt")
file.copy(old_annotation, new_annotation)
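Note that file.copy returns FALSE without raising an error when the source file is missing, and it does not overwrite an existing destination by default. A more defensive variant of the copy step could look like the sketch below. The helper name copy_sidecar is hypothetical and not part of Flow, and the temporary directories merely stand in for the task input and output folders.

```r
# Hypothetical helper (not part of Flow): copy a sidecar file from the
# input directory to the output directory, warning if it is missing.
# Unlike the script above, it overwrites an existing destination file.
copy_sidecar <- function(input_path, output_path, name) {
  src <- file.path(dirname(input_path), name)
  dst <- file.path(dirname(output_path), name)
  if (!file.exists(src)) {
    warning("sidecar file not found: ", src)
    return(FALSE)
  }
  file.copy(src, dst, overwrite = TRUE)
}

# Demonstration with temporary directories standing in for task folders.
in_dir <- file.path(tempdir(), "task_in")
out_dir <- file.path(tempdir(), "task_out")
dir.create(in_dir, showWarnings = FALSE)
dir.create(out_dir, showWarnings = FALSE)
writeLines(c("cell_1", "cell_2"), file.path(in_dir, "cellIds.txt"))

copied <- copy_sidecar(file.path(in_dir, "counts.txt"),
                       file.path(out_dir, "rankings.Rdata"),
                       "cellIds.txt")
```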

Save the script in ~/.partekflow/user_tasks.


To add this task, go to Settings > Task management.

Click Add task.

Click Create new task.


Specify BuildRankings.R and configure how the task should appear in the task menu.



This task operates on Single cell counts.


Since the R data object is not a data type that Flow recognizes, we will specify the Custom type.


Images generated by the script can be shown in Flow’s task result.


Each task can generate multiple outputs. We could write one task that outputs all stages of the analysis and then separate tasks to re-run the intermediate steps.


Once the task is created it will appear in the context-sensitive menu.


This task requires no extra parameters and is queued as soon as it is selected.


The task report will show the histogram:


The second script will calculate the AUC.

This script takes in the Rdata file generated by the previous task, as well as gene sets in the form of a GMT file.
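For reference, a GMT file is tab-delimited with one gene set per line: the set name, a description, and then the member genes. The sketch below writes a toy file and parses it with base R purely to show the layout; the script itself relies on getGmt from GSEABase, and the set and gene names here are invented.

```r
# Write a toy GMT file: name <TAB> description <TAB> gene1 <TAB> gene2 ...
gmt_file <- tempfile(fileext = ".gmt")
writeLines(c("SetA\tmade-up description\tGeneA\tGeneB\tGeneC",
             "SetB\tanother description\tGeneB\tGeneD"),
           gmt_file)

# Parse it with base R: drop the first two fields, keep the gene names.
fields <- strsplit(readLines(gmt_file), "\t", fixed = TRUE)
geneSets <- lapply(fields, function(x) x[-(1:2)])
names(geneSets) <- vapply(fields, function(x) x[1], character(1))
```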

The user can also configure the aucMaxRank parameter.
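aucMaxRank sets how many of the top-ranked genes in each cell are considered when the AUC is computed. AUCell's own default is the top 5% of the genes in the ranking, whereas the option in the script below is an absolute rank, so a value scaled to the number of genes may be a more sensible starting point. A minimal sketch of that calculation follows; the helper name auc_max_rank is hypothetical.

```r
# Hypothetical helper: aucMaxRank as a fraction of the genes in the ranking,
# mirroring AUCell's default of ceiling(0.05 * nrow(rankings)).
auc_max_rank <- function(n_genes, fraction = 0.05) {
  stopifnot(n_genes > 0, fraction > 0, fraction <= 1)
  ceiling(fraction * n_genes)
}

rank_5pct <- auc_max_rank(12345)         # 5% of 12345 genes is 617.25, so 618
rank_10pct <- auc_max_rank(1234, 0.10)   # 10% of 1234 genes is 123.4, so 124
```

The resulting value could then be supplied through the --aucMaxRank option when the task is configured.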

The output is another Rdata file, as well as a single-cell matrix in a format that Flow recognizes.



CalcAUC.R


#!/usr/bin/env Rscript

library("optparse")
library("AUCell")
library("GSEABase") 

option_list<- list(
 make_option(c("-i", "--input"), type="character", default=NULL, help="input rankings", metavar="character"),
 make_option(c("-m", "--aucMaxRank"), type="integer", default=5, help="AUC max rank", metavar="integer"),
 make_option(c("-g", "--gmt"), type="character", default=NULL, help="input GMT file", metavar="character"),
 make_option(c("-o", "--output"), type="character", default=NULL, help="output Rdata", metavar="character"),
 make_option(c("-a", "--auc"), type="character", default=NULL, help="output AUC matrix", metavar="character")
);

opt_parser <- OptionParser(option_list=option_list);
opt <- parse_args(opt_parser);

#load the input Rdata
load(opt$input)

#load the gene sets and calculate the AUC
geneSets <- getGmt(opt$gmt)
cells_AUC <- AUCell_calcAUC(geneSets, cellsRankings, aucMaxRank=opt$aucMaxRank, nCores=8)
save(cells_AUC, file=opt$output)

#copy the static configuration file that tells Flow how to import the AUC matrix:
file.copy("~/AUC.json", opt$auc)

#write the AUC values:
matrixFile <- file.path(dirname(file.path(opt$output)), "AUC.matrix")
write.table(data.frame(getAUC(cells_AUC), check.names=FALSE), file=matrixFile, sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)

#write the gene-set IDs
featureIdFile <- file.path(dirname(file.path(opt$output)), "featureIds.txt")
write(row.names(getAUC(cells_AUC)), featureIdFile)

#copy the cell-level information
old_ids <- file.path(dirname(file.path(opt$input)), "cellIds.txt")
new_ids <- file.path(dirname(file.path(opt$output)), "cellIds.txt")
file.copy(old_ids, new_ids)

old_annotation <- file.path(dirname(file.path(opt$input)), "cellAnnotations.txt")
new_annotation <- file.path(dirname(file.path(opt$output)), "cellAnnotations.txt")
file.copy(old_annotation, new_annotation)



The content of the AUC.json file is:

{
 "matrixFile":"AUC.matrix",
 "cellIdFile":"cellIds.txt",
 "cellAnnotationFile":"cellAnnotations.txt",
 "featureIdFile":"featureIds.txt",
 "cellsOnRows":"false",
 "headerRows":0,
 "headerColumns":0,
 "featureType":"Gene set"
}
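Reading the field names at face value (an assumption, not something confirmed by the Flow documentation here), the matrix file holds features on rows and cells on columns with no header rows or columns, and the IDs come from the separate text files. Under that reading, a quick consistency check before handing the files to Flow could be sketched as follows. The helper check_flow_matrix is hypothetical, and one ID per line in the ID files is also assumed.

```r
# Hypothetical check (not a Flow API): verify that the matrix dimensions
# agree with the ID files, assuming features on rows and cells on columns.
check_flow_matrix <- function(dir) {
  m <- as.matrix(read.table(file.path(dir, "AUC.matrix"),
                            sep = "\t", header = FALSE))
  features <- readLines(file.path(dir, "featureIds.txt"))
  cells <- readLines(file.path(dir, "cellIds.txt"))
  nrow(m) == length(features) && ncol(m) == length(cells)
}

# Toy output directory with 2 gene sets and 3 cells (values are invented).
out_dir <- file.path(tempdir(), "auc_out")
dir.create(out_dir, showWarnings = FALSE)
auc <- matrix(c(0.10, 0.25, 0.00,
                0.40, 0.05, 0.30), nrow = 2, byrow = TRUE)
write.table(auc, file.path(out_dir, "AUC.matrix"), sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)
writeLines(c("SetA", "SetB"), file.path(out_dir, "featureIds.txt"))
writeLines(c("cell_1", "cell_2", "cell_3"), file.path(out_dir, "cellIds.txt"))

ok <- check_flow_matrix(out_dir)
```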


Because this task generates a matrix, Runs on must be set to All samples.


The input data type for this task should be Custom Data.


There should be two outputs: one of type Single cell counts and one of type Custom.


Paths to library files can be hardcoded in the script, or library files managed by Flow can be specified.



This configuration enables Flow to generate an interface for configuring the task options.


We can use Pre-analysis tools > Merge matrices to join the AUC matrix with the feature counts.

This enables us to, for example, compute t-SNE on the feature counts and color the plot by AUC values.
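Conceptually, such a merge lines up the two matrices by cell ID and stacks their features. The base R sketch below illustrates the idea on toy data; it is not Flow's implementation, and all names and values are invented.

```r
# Toy matrices with cells in columns; the cell order differs on purpose.
counts <- matrix(c(5, 0, 2,
                   1, 3, 0), nrow = 2, byrow = TRUE,
                 dimnames = list(c("GeneA", "GeneB"),
                                 c("cell_1", "cell_2", "cell_3")))
auc <- matrix(c(0.3, 0.1, 0.2), nrow = 1,
              dimnames = list("SetA", c("cell_3", "cell_1", "cell_2")))

# Keep the cells present in both matrices and stack the features.
shared <- intersect(colnames(counts), colnames(auc))
merged <- rbind(counts[, shared, drop = FALSE],
                auc[, shared, drop = FALSE])
```

Each cell then carries both its gene counts and its gene-set AUC values, so either can be used to color a plot of the same cells.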



We can write additional scripts and tasks to generate tables and images.


ExploreThresholds.R

#!/usr/bin/env Rscript

library("optparse")
library("AUCell")

option_list<- list(
 make_option(c("-i", "--input"), type="character", default=NULL, help="input AUC Rdata file", metavar="character"),
 make_option(c("-o", "--output"), type="character", default=NULL, help="output cell assignments Rdata", metavar="character"),
 make_option(c("--histogram"), type="character", default=NULL, help="output AUC histograms (PDF)", metavar="character")
);

opt_parser <- OptionParser(option_list=option_list);
opt <- parse_args(opt_parser);

load(opt$input)

pdf(opt$histogram)
cells_assignment <- AUCell_exploreThresholds(cells_AUC, plotHist=TRUE, assign=TRUE) 
dev.off()

save(cells_assignment, file=opt$output)
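To turn the saved assignments into a table, a follow-up step could flatten the list into one row per gene set and assigned cell. The sketch below uses a simplified mock of the AUCell_exploreThresholds result (one entry per gene set, holding the selected threshold and the assigned cell names); the values are invented and the real object carries additional fields.

```r
# Simplified mock of the AUCell_exploreThresholds() result with assign=TRUE:
# one entry per gene set with the selected threshold and the assigned cells.
cells_assignment <- list(
  SetA = list(aucThr = list(selected = 0.12),
              assignment = c("cell_1", "cell_3")),
  SetB = list(aucThr = list(selected = 0.30),
              assignment = "cell_2")
)

# One row per (gene set, assigned cell) pair.
assignment_table <- do.call(rbind, lapply(names(cells_assignment), function(gs) {
  cells <- cells_assignment[[gs]]$assignment
  data.frame(geneSet = rep(gs, length(cells)), cell = cells,
             stringsAsFactors = FALSE)
}))

# Write the table as tab-delimited text.
table_file <- tempfile(fileext = ".txt")
write.table(assignment_table, table_file, sep = "\t",
            quote = FALSE, row.names = FALSE)
```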



