Testing core functionalities of the ACTIONet package

Here we provide an in-depth, step-by-step guide to familiarize you with the building blocks of ACTIONet. For the sake of comparability, this tutorial uses the PBMC 3k dataset that is also used in the Seurat and Scanpy tutorials.

Preparing the environment

To start, you need to load the ACTIONet package.

For ACTIONet installation instructions, see ACTIONet.

require(ACTIONet)

input.folder = "datasets/PBMC_3k/"

Preparation

Importing data from a SingleCellExperiment object

We have preprocessed the PBMC 3k dataset and stored it as a SingleCellExperiment (SCE) object, which is the main data type used in the ACTIONet framework. For more information on how to convert other data types to the SingleCellExperiment format, please consult our tutorial on Bring your data along – Import/export options in ACTIONet.

# input.folder already ends with "/", so paste0() avoids a doubled separator
input.file = paste0(input.folder, "pbmc3k_SCE.RDS")
sce = readRDS(input.file)

sce
class: SingleCellExperiment
dim: 13714 2638
metadata(0):
assays(1): logcounts
rownames(13714): AL627309.1 AP006222.2 ... PNRC2-1 SRSF10-1
rowData names(1): n.cells
colnames(2638): AAACATACAACCAC-1 AAACATTGAGCTAC-1 ... TTTGCATGAGAGGC-1
  TTTGCATGCCTCAC-1
colData names(5): nFeatures_RNA percent.mito nCount_RNA louvain ident
reducedDimNames(3): PCA TSNE UMAP
spikeNames(0):

Reducing the SCE object

Next, we use the reduce.sce() function to both normalize the count matrix (library-size normalization followed by log-transformation, by default) and compute a factorized (reduced) form of the kernel matrix (with 50 dimensions by default). By default, no batch correction is performed, and the simplest depth-normalization method is used. We then extract the reduced representation, S_r, for later use.

sce = reduce.sce(sce = sce)
[1] "Running main reduction"
S_r = t(reducedDims(sce)[["S_r"]])

PS: ACTIONet includes 7 additional normalization methods (scran, linnorm, scone, SCnorm, DESeq2, TMM, and logCPM). However, the default method is the fastest and performs very well for ACTIONet construction.

PS: ACTIONet also integrates seamlessly with the Harmony batch-correction method. For more information on how to use batch correction, please consult To batch correct or not to batch correct, that is the question!.
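To make the default preprocessing described in this section more concrete, here is a minimal base-R sketch on a toy count matrix. It is not the reduce.sce() implementation: the scale factor of 1e4, the pseudocount of 1, and the use of a plain truncated SVD in place of the kernel reduction are all assumptions made for illustration, and every variable name is hypothetical.

```r
set.seed(1)

# Toy count matrix: 5 genes (rows) x 4 cells (columns)
counts <- matrix(rpois(20, lambda = 5), nrow = 5,
                 dimnames = list(paste0("gene", 1:5), paste0("cell", 1:4)))

# Library-size normalization: rescale every cell to the same total
# (the target of 1e4 is an assumption of this sketch)
lib_size <- colSums(counts)
norm_counts <- sweep(counts, 2, lib_size, "/") * 1e4

# Log-transformation with a pseudocount of 1
logcounts_toy <- log1p(norm_counts)

# Reduced representation: truncated SVD as a stand-in for the kernel reduction
reduced_dim <- 2
sv <- svd(logcounts_toy, nu = 0, nv = reduced_dim)
S_r_toy <- t(sv$v %*% diag(sv$d[1:reduced_dim]))  # reduced_dim x cells

dim(S_r_toy)
```

As in the tutorial, the reduced matrix is oriented with components in rows and cells in columns.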

Creating an ACTIONetExperiment (ACE) object to hold results

We have extended the SingleCellExperiment class with additional slots, similar to the AnnData format, to hold multi-dimensional and structural metadata for rows and columns. To construct an ACE object from an SCE object, we run

ace = as(sce, "ACTIONetExperiment")

ace
class: ACTIONetExperiment
dim: 13714 2638
metadata(1): reduction.time
assays(1): logcounts
rownames(13714): AL627309.1 AP006222.2 ... PNRC2-1 SRSF10-1
rowData names(1): n.cells
colnames(2638): AAACATACAACCAC-1 AAACATTGAGCTAC-1 ... TTTGCATGAGAGGC-1
  TTTGCATGCCTCAC-1
colData names(5): nFeatures_RNA percent.mito nCount_RNA louvain ident
reducedDimNames(4): PCA TSNE UMAP S_r
spikeNames(0):
rowNets(0):
colNets(0):
rowFactors(0):
colFactors(0): 

As can be seen, the new slots are rowNets, colNets, rowFactors, and colFactors. Each slot has its own getter/setter functions, which we will use throughout the rest of the tutorial to store results.
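For readers unfamiliar with S4 getter/setter pairs, the following toy sketch shows the general pattern that makes an assignment such as `getter(object)$name = value` work. The class ToyExperiment and the generic toyColNets are hypothetical names invented for this example; they are not part of ACTIONet, and the real class may be implemented differently.

```r
library(methods)

# Toy container with a single list-valued slot (NOT the real ACE class)
setClass("ToyExperiment", slots = c(colNets = "list"))

# Getter: returns the list stored in the slot
setGeneric("toyColNets", function(x) standardGeneric("toyColNets"))
setMethod("toyColNets", "ToyExperiment", function(x) x@colNets)

# Setter: replaces the whole list and returns the modified object
setGeneric("toyColNets<-", function(x, value) standardGeneric("toyColNets<-"))
setReplaceMethod("toyColNets", "ToyExperiment", function(x, value) {
  x@colNets <- value
  validObject(x)
  x
})

obj <- new("ToyExperiment", colNets = list())

# R rewrites this line as: get the list, set element "knn", store the list back
toyColNets(obj)$knn <- diag(3)

names(toyColNets(obj))
```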

Running ACTIONet

Running multi-level ACTION decomposition

This runs the ACTION decomposition with an increasing number of archetypes, up to k_max:

ACTION.out = run_ACTION(S_r, k_max = 30, thread_no = 8, max_it = 50, min_delta = 0.01, 
    type = 1)

Prune nonspecific and/or unreliable archetypes

To remove unreliable archetypes as a first-pass filter, we can use the prune_archetypes() function:

pruning.out = prune_archetypes(ACTION.out$C, ACTION.out$H)

C_stacked = pruning.out$C_stacked
H_stacked = pruning.out$H_stacked

C_stacked and H_stacked are the concatenated C and H matrices across different levels, respectively, after pruning noisy archetypes.
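The concatenation described above can be illustrated with toy matrices. This sketch assumes that, at a level with k archetypes, C has one row per cell and k columns while H has k rows and one column per cell; the random values, the variable names, and the index of the "noisy" archetype being dropped are all hypothetical.

```r
set.seed(2)
n_cells <- 6
k_values <- 2:4  # three decomposition levels with 2, 3, and 4 archetypes

# Hypothetical per-level outputs: C is cells x k, H is k x cells
C_list <- lapply(k_values, function(k) matrix(runif(n_cells * k), n_cells, k))
H_list <- lapply(k_values, function(k) matrix(runif(k * n_cells), k, n_cells))

# Stack across levels: C column-wise, H row-wise
C_all <- do.call(cbind, C_list)  # 6 x 9
H_all <- do.call(rbind, H_list)  # 9 x 6

# Pruning removes the same archetypes from both matrices
drop_idx <- 5  # hypothetical index of an unreliable archetype
C_stacked_toy <- C_all[, -drop_idx, drop = FALSE]
H_stacked_toy <- H_all[-drop_idx, , drop = FALSE]

dim(C_stacked_toy)
dim(H_stacked_toy)
```

Keeping the two matrices aligned matters because later steps take both as input and refer to archetypes by their position.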

Building ACTIONet graph

set.seed(0)
G = buildNetwork(H_stacked = H_stacked, thread_no = 8)

To store the computed graph as a network associated with cells (columns) in the ACE object, we can use:

colNets(ace)$ACTIONet = G
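To give a feel for what a cell-by-cell network built from archetype footprints looks like, here is a rough base-R sketch. It is not how buildNetwork() works internally, which this tutorial does not describe: a plain Euclidean k-nearest-neighbor graph is used as a stand-in, and the toy matrix, the value of k, and all names are assumptions.

```r
set.seed(0)

# Toy footprint matrix: 4 archetypes (rows) x 6 cells (columns)
H_toy <- matrix(runif(4 * 6), nrow = 4)
n <- ncol(H_toy)
k <- 2  # neighbors per cell (hypothetical choice)

# Pairwise Euclidean distances between cells (columns of H_toy)
D <- as.matrix(dist(t(H_toy)))
diag(D) <- Inf  # exclude self-matches

# Directed kNN adjacency: A[i, j] = 1 if j is among the k nearest cells to i
A <- matrix(0, n, n)
for (i in seq_len(n)) {
  A[i, order(D[i, ])[1:k]] <- 1
}

# Symmetrize so that the result is an undirected cell-by-cell network
G_toy <- ((A + t(A)) > 0) * 1

isSymmetric(G_toy)
```

The result has one row and one column per cell, which is why such a network is stored among the column networks of the object.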

Layout ACTIONet

We have adopted and modified the SGD-based layout algorithm used in UMAP for our visualization. We use S_r for initialization:

initial.coordinates = t(scale(reducedDims(sce)[["S_r"]]))
vis.out = layoutNetwork(G, S_r = initial.coordinates, n_epochs = 500)

We can then store the coordinates in reducedDims() and the computed de novo colors as a column of colData():

reducedDims(ace)$ACTIONet2D = vis.out$coordinates
reducedDims(ace)$ACTIONet3D = vis.out$coordinates_3D
ace$denovo_color = rgb(vis.out$colors)
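Two base-R idioms in this step may be worth unpacking: `t(scale(...))` for the initial coordinates and `rgb()` applied to a matrix. The toy sketch below assumes that the reduced matrix has cells in rows and that the color output is a three-column matrix with values between 0 and 1; both are assumptions about the data layout, and the values are made up.

```r
set.seed(3)

# Toy reduced representation: 5 cells (rows) x 3 components (columns)
X <- matrix(rnorm(15, mean = 10, sd = 4), nrow = 5)

# scale() centers and standardizes each column; t() then puts cells in columns
init <- t(scale(X))
dim(init)          # components x cells
rowMeans(init)     # each component is centered near 0
apply(init, 1, sd) # and has unit standard deviation

# rgb() accepts a 3-column matrix of values in [0, 1] and returns hex colors
cols <- matrix(c(1, 0, 0,
                 0, 1, 0,
                 0, 0, 1), ncol = 3, byrow = TRUE)
hex <- rgb(cols)
hex  # "#FF0000" "#00FF00" "#0000FF"
```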

Identify equivalence classes of archetypes and group them together

There is a large amount of redundancy among the archetypes. unify_archetypes() tries to coalesce these archetypes and partition them into equivalence classes.

unification.out = unify_archetypes(G, S_r, C_stacked, H_stacked, minPoints = 10, 
    minClusterSize = 10, outlier_threshold = 0.5)

And store the results back in the ACE object:

colFactors(ace)[["archetype_footprint"]] = unification.out$H_unified
colFactors(ace)[["archetype_cell_contributions"]] = t(unification.out$C_unified)
ace$archetype_assignment = unification.out$sample_assignments

Use graph core of global and induced subgraphs to infer centrality/quality of each cell

We have developed a novel technique to estimate the quality of each cell based on its strategic placement in the ACTIONet, with respect to each archetype:

ace$node_centrality = compute_archetype_core_centrality(G, ace$archetype_assignment)
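The notion of a graph core, applied to both the whole graph and to induced subgraphs, can be illustrated with the textbook k-core decomposition. The sketch below is not the compute_archetype_core_centrality() implementation: it works on an unweighted adjacency matrix with a zero diagonal, and the toy graph, the group assignments, and the function name coreness are hypothetical.

```r
# Textbook k-core decomposition by repeatedly peeling a minimum-degree node
coreness <- function(A) {
  n <- nrow(A)
  deg <- rowSums(A != 0)
  core <- numeric(n)
  alive <- rep(TRUE, n)
  k <- 0
  for (step in seq_len(n)) {
    candidates <- which(alive)
    v <- candidates[which.min(deg[candidates])]
    k <- max(k, deg[v])            # core number never decreases while peeling
    core[v] <- k
    alive[v] <- FALSE
    nb <- which(A[v, ] != 0 & alive)
    deg[nb] <- deg[nb] - 1         # removing v lowers its neighbors' degrees
  }
  core
}

# Toy graph: a triangle (nodes 1, 2, 3) plus node 4 attached to node 3
A <- matrix(0, 4, 4)
edges <- rbind(c(1, 2), c(1, 3), c(2, 3), c(3, 4))
A[edges] <- 1
A <- A + t(A)

# Core numbers in the global graph
global_core <- coreness(A)

# Core numbers within the subgraph induced by each (hypothetical) group
assignment <- c(1, 1, 2, 2)
induced_core <- numeric(length(assignment))
for (g in unique(assignment)) {
  idx <- which(assignment == g)
  induced_core[idx] <- coreness(A[idx, idx, drop = FALSE])
}

global_core
induced_core
```

In the global graph the triangle nodes sit in the 2-core while the pendant node does not; within each induced subgraph only the edges between members of the same group count.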

Re-normalize input (~gene expression) matrix and compute feature (~gene) specificity scores

S = assays(ace)[["logcounts"]]
norm.out = ACTIONet::renormalize_input_matrix(S, unification.out$sample_assignments)
assays(ace)[["logcounts_renorm"]] = norm.out$S_norm

Compute gene specificity for each archetype

This computes the “markerness” of each gene for the different archetypes:

specificity.out = compute_archetype_feature_specificity(norm.out$S_norm, unification.out$H_unified)
specificity.out = lapply(specificity.out, function(specificity.scores) {
    rownames(specificity.scores) = rownames(ace)
    # paste0() gives labels without a space ("A1", "A2", ...)
    colnames(specificity.scores) = paste0("A", seq_len(ncol(specificity.scores)))
    return(specificity.scores)
})
rowFactors(ace)[["archetype_gene_profile"]] = specificity.out[["archetypes"]]
rowFactors(ace)[["archetype_gene_specificity"]] = specificity.out[["upper_significance"]]
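Once a genes-by-archetypes score matrix is available, a common next step is to list the highest-scoring genes per archetype. The following sketch does this on a made-up matrix; the scores, gene names, and the choice of three genes per archetype are hypothetical and do not come from the PBMC data.

```r
set.seed(4)

# Toy specificity matrix: 6 genes (rows) x 2 archetypes (columns)
scores <- matrix(runif(6 * 2), nrow = 6,
                 dimnames = list(paste0("gene", 1:6), paste0("A", 1:2)))

top_n <- 3

# For each archetype (column), keep the names of the top_n highest-scoring genes
top_genes <- apply(scores, 2, function(s) {
  rownames(scores)[order(s, decreasing = TRUE)[seq_len(top_n)]]
})

top_genes  # top_n x archetypes character matrix
```

The same pattern should apply to any matrix with genes in rows and archetypes in columns, provided its row names are set.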

Creating a trace of all that we have done

trace = list(ACTION.out = ACTION.out, pruning.out = pruning.out, vis.out = vis.out, 
    unification.out = unification.out)
trace$log = list(genes = rownames(ace), cells = colnames(ace), time = Sys.time())
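If the trace should persist across sessions, one option is to serialize it with base R, in the same way the input dataset was stored. The sketch below uses a small stand-in list and a temporary file; the list contents and the file location are hypothetical.

```r
# Stand-in for the trace list built above
trace_toy <- list(params = list(k_max = 30, thread_no = 8),
                  log = list(cells = c("cell1", "cell2"), time = Sys.time()))

# Write to a temporary file (replace with a permanent path as needed)
out_file <- tempfile(fileext = ".RDS")
saveRDS(trace_toy, out_file)

# Read it back and confirm the contents survived the round trip
restored <- readRDS(out_file)
identical(restored$params, trace_toy$params)
```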

Visualization

Prior cluster assignments

plot.ACTIONet(ace, ace$louvain)

ACTIONet plot

plot.ACTIONet(ace, ace$louvain, reduction.slot = "TSNE", title = "tSNE")

plot.ACTIONet(ace, ace$louvain, reduction.slot = "UMAP", title = "UMAP")

plot.ACTIONet(ace, ace$louvain, reduction.slot = "ACTIONet2D", title = "ACTIONet")

Shahin Mohammadi

Jose Davila-Velderrain

2020-04-19