Aggregates node-level network weights to cluster-level summaries. Computes both macro (cluster-to-cluster) transitions and per-cluster transitions (how nodes connect inside each cluster).
Usage
cluster_summary(
x,
clusters = NULL,
method = c("sum", "mean", "median", "max", "min", "density", "geomean"),
type = c("tna", "cooccurrence", "semi_markov", "raw"),
directed = TRUE,
compute_within = TRUE
)
csum(
x,
clusters = NULL,
method = c("sum", "mean", "median", "max", "min", "density", "geomean"),
type = c("tna", "cooccurrence", "semi_markov", "raw"),
directed = TRUE,
compute_within = TRUE
)
Arguments
- x
Network input. Accepts multiple formats:
- matrix
Numeric adjacency/weight matrix. Row and column names are used as node labels. Values represent edge weights (e.g., transition counts, co-occurrence frequencies, or probabilities).
- cograph_network
A cograph network object. The function extracts the weight matrix from x$weights or converts via to_matrix(). Clusters can be auto-detected from node attributes.
- tna
A tna object from the tna package. Extracts x$weights.
- cluster_summary
If already a cluster_summary, returns unchanged.
- clusters
Cluster/group assignments for nodes. Accepts multiple formats:
- NULL
(default) Auto-detect from cograph_network. Looks for columns named 'clusters', 'cluster', 'groups', or 'group' in x$nodes. Throws an error if no cluster column is found. This option only works when x is a cograph_network.
- vector
Cluster membership for each node, in the same order as the matrix rows/columns. Can be numeric (1, 2, 3) or character ("A", "B"). Cluster names will be derived from unique values. Example: c(1, 1, 2, 2, 3, 3) assigns the first two nodes to cluster 1.
- data.frame
A data frame where the first column contains node names and the second column contains group/cluster names. Example: data.frame(node = c("A", "B", "C"), group = c("G1", "G1", "G2"))
- named list
Explicit mapping of cluster names to node labels. List names become cluster names, values are character vectors of node labels that must match matrix row/column names. Example:
list(Alpha = c("A", "B"), Beta = c("C", "D"))
- method
Aggregation method for combining edge weights within/between clusters. Controls how multiple node-to-node edges are summarized:
- "sum"
(default) Sum of all edge weights. Best for count data (e.g., transition frequencies). Preserves total flow.
- "mean"
Average edge weight. Best when cluster sizes differ and you want to control for size. When the input is already a transition matrix (rows sum to 1), "mean" avoids size bias: a cluster with 5 nodes will not receive 5x the weight of a cluster with 1 node.
- "median"
Median edge weight. Robust to outliers.
- "max"
Maximum edge weight. Captures strongest connection.
- "min"
Minimum edge weight. Captures weakest connection.
- "density"
Sum divided by number of possible edges. Normalizes by cluster size combinations.
- "geomean"
Geometric mean of positive weights. Useful for multiplicative processes.
- type
Post-processing applied to aggregated weights. Determines the interpretation of the resulting matrices:
- "tna"
(default) Row-normalize so each row sums to 1. Creates transition probabilities suitable for Markov chain analysis. Interpretation: "Given I'm in cluster A, what's the probability of transitioning to cluster B?" Required for use with tna package functions. The diagonal holds within-cluster retention; node-level detail for each cluster is in $clusters.
- "raw"
No normalization. Returns aggregated counts/weights as-is. Use for frequency analysis or when you need raw counts. Compatible with igraph's contract + simplify output.
- "cooccurrence"
Symmetrize the matrix: (A + t(A)) / 2. For undirected co-occurrence analysis.
- "semi_markov"
Row-normalize with duration weighting. For semi-Markov process analysis.
- directed
Logical. If TRUE (default), treat network as directed. A->B and B->A are separate edges. If FALSE, edges are undirected and the matrix is symmetrized before processing.
- compute_within
Logical. If TRUE (default), compute per-cluster transition matrices for each cluster. Each cluster gets its own n_i x n_i matrix showing internal node-to-node transitions. Set to FALSE to skip this computation for better performance when only the macro (cluster-level) summary is needed.
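The three explicit formats accepted by clusters (vector, data.frame, named list) all describe the same mapping from cluster names to node labels. The base R sketch below illustrates that equivalence. to_member_list is a hypothetical helper written for this illustration, not a function of the package, and the package's own conversion may differ (for example in how it orders cluster names).

```r
# Hypothetical helper (not part of the package): turn each explicit
# `clusters` format into a named list of node labels.
to_member_list <- function(clusters, node_names) {
  if (is.data.frame(clusters)) {
    # First column: node names; second column: cluster names
    return(split(as.character(clusters[[1]]), as.character(clusters[[2]])))
  }
  if (is.list(clusters)) {
    # Named list: already in the target shape; check the labels exist
    stopifnot(!is.null(names(clusters)),
              all(unlist(clusters) %in% node_names))
    return(lapply(clusters, as.character))
  }
  # Plain vector: one entry per node, in matrix row/column order
  stopifnot(length(clusters) == length(node_names))
  split(node_names, as.character(clusters))
}

nodes <- c("A", "B", "C", "D")
from_vector <- to_member_list(c(1, 1, 2, 2), nodes)
from_df <- to_member_list(
  data.frame(node = nodes, group = c("G1", "G1", "G2", "G2")), nodes)
from_list <- to_member_list(
  list(Alpha = c("A", "B"), Beta = c("C", "D")), nodes)
```

All three results group A and B together and C and D together; only the cluster names differ. Note that split() sorts cluster names alphabetically, which is a property of this sketch rather than a documented behaviour of cluster_summary.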
Value
A cluster_summary object (S3 class) containing:
- macro
A tna object representing the macro (cluster-level) network:
- weights
k x k matrix of cluster-to-cluster weights, where k is the number of clusters. Row i, column j contains the aggregated weight from cluster i to cluster j. Diagonal contains aggregated intra-cluster weight (retention / self-loops). Processing depends on type.
- inits
Numeric vector of length k. Initial state distribution across clusters, computed from column sums of the original matrix. Represents the proportion of incoming edges to each cluster.
- clusters
Named list with one element per cluster. Each element is a tna object containing:
- weights
n_i x n_i matrix for nodes inside that cluster. Shows internal transitions between nodes in the same cluster.
- inits
Initial distribution for the cluster.
NULL if compute_within = FALSE.
- cluster_members
Named list mapping cluster names to their member node labels. Example: list(A = c("n1", "n2"), B = c("n3", "n4", "n5"))
- meta
List of metadata:
- type
The type argument used ("tna", "raw", etc.)
- method
The method argument used ("sum", "mean", etc.)
- directed
Logical, whether network was treated as directed
- n_nodes
Total number of nodes in original network
- n_clusters
Number of clusters
- cluster_sizes
Named vector of cluster sizes
See cluster_summary.
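The description of macro inits (proportion of incoming edge weight per cluster, from column sums of the original matrix) can be worked through by hand in base R. This is a sketch of one reading of that description, not the package's source code. The matrix is the 3 x 3 one used with csum() in the Examples, and the variable names are illustrative.

```r
mat <- matrix(c(0.5, 0.2, 0.3,
                0.1, 0.6, 0.3,
                0.4, 0.1, 0.5), 3, 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
members <- list(G1 = c("A", "B"), G2 = "C")

# Incoming weight per cluster: column sums of the node-level matrix,
# added up over each cluster's member nodes
incoming <- vapply(members,
                   function(nodes) sum(colSums(mat)[nodes]),
                   numeric(1))

# Normalise to a distribution over clusters
inits <- incoming / sum(incoming)
round(inits, 3)
#>    G1    G2
#> 0.633 0.367
```

Under this reading the values agree with the Inits line printed by the csum() call in the Examples.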
Details
This is the core function for Multi-Cluster Multi-Level (MCML) analysis.
Use as_tna to convert results to tna objects for further
analysis with the tna package.
Workflow
Typical MCML analysis workflow:
# 1. Create network
net <- cograph(edges, nodes = nodes)
net$nodes$clusters <- group_assignments
# 2. Compute cluster summary
cs <- cluster_summary(net, type = "tna")
# 3. Convert to tna models
tna_models <- as_tna(cs)
# 4. Analyze/visualize
plot(tna_models$macro)
tna::centralities(tna_models$macro)
Between-Cluster Matrix Structure
The macro$weights matrix has clusters as both rows and columns:
Off-diagonal (row i, col j): Aggregated weight from cluster i to cluster j
Diagonal (row i, col i): Per-cluster total (sum of internal edges in cluster i)
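This layout can be reproduced with plain matrix algebra. The sketch below assumes method = "sum" and uses a node-by-cluster indicator matrix. It is an illustration in base R rather than the package's implementation, and it reuses the 3 x 3 matrix from the csum() call in the Examples.

```r
mat <- matrix(c(0.5, 0.2, 0.3,
                0.1, 0.6, 0.3,
                0.4, 0.1, 0.5), 3, 3, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
membership <- c(A = "G1", B = "G1", C = "G2")

# Indicator matrix: rows are nodes, columns are clusters (1 = member)
indicator <- sapply(unique(membership),
                    function(g) as.numeric(membership == g))
rownames(indicator) <- names(membership)

# Summed weight for every cluster pair:
# off-diagonal = between clusters, diagonal = internal edges
agg <- t(indicator) %*% mat %*% indicator

# Row-normalise, as described for type = "tna"
macro <- agg / rowSums(agg)
round(macro, 2)
#>     G1  G2
#> G1 0.7 0.3
#> G2 0.5 0.5
```

Here agg["G1", "G1"] is 1.4, the sum of the four A/B entries, and after normalisation the diagonal entries 0.7 and 0.5 are the share of each cluster's outgoing weight that stays inside the cluster.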
When type = "tna", rows sum to 1 and diagonal values represent
"retention rate" - the probability of staying inside the same cluster.
Choosing method and type
| Input data | Recommended | Reason |
| Edge counts | method="sum", type="tna" | Preserves total flow, normalizes to probabilities |
| Transition matrix | method="mean", type="tna" | Avoids cluster size bias |
| Frequencies | method="sum", type="raw" | Keep raw counts for analysis |
| Correlation matrix | method="mean", type="raw" | Average correlations |
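The size-bias row of the table can be illustrated with a toy transition matrix in which every node is equally likely to move to every other node. aggregate_blocks is a hypothetical helper written for this sketch. How the package treats zeros or self-loops when averaging is not stated here, so the sketch simply applies the function to every entry of each block.

```r
nodes <- c("A", "B", "C", "D")
# Uniform transition matrix: every row sums to 1
mat <- matrix(0.25, 4, 4, dimnames = list(nodes, nodes))
members <- list(Big = c("A", "B", "C"), Small = "D")

# Hypothetical helper: apply `fun` to each cluster-to-cluster block.
# Rows of the result are source clusters, columns are target clusters.
aggregate_blocks <- function(mat, members, fun) {
  sapply(members, function(to) {
    sapply(members, function(from) fun(mat[from, to, drop = FALSE]))
  })
}

by_sum  <- aggregate_blocks(mat, members, sum)
by_mean <- aggregate_blocks(mat, members, mean)

# Row-normalise both, as described for type = "tna"
tna_sum  <- by_sum / rowSums(by_sum)
tna_mean <- by_mean / rowSums(by_mean)

tna_sum["Small", ]   # Big 0.75, Small 0.25: Big dominates because it has 3 nodes
tna_mean["Small", ]  # Big 0.50, Small 0.50: cluster size no longer matters
```

With "sum", the three-node cluster attracts three times the probability even though no individual node is preferred; "mean" removes that effect, which is the reason given in the table for pairing it with transition-matrix input.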
Examples
mat <- matrix(runif(100), 10, 10); diag(mat) <- 0
rownames(mat) <- colnames(mat) <- LETTERS[1:10]
# Membership vector
cs <- cluster_summary(mat, c(1,1,1,2,2,2,3,3,3,3))
cs$macro$weights # 3x3 cluster transition matrix
#> 1 2 3
#> 1 0.2469609 0.3005706 0.4524685
#> 2 0.3480177 0.1137586 0.5382237
#> 3 0.3351169 0.3005628 0.3643203
# Named list of clusters, TNA-normalized
clusters <- list(Alpha = LETTERS[1:3], Beta = LETTERS[4:6], Gamma = LETTERS[7:10])
cs <- cluster_summary(mat, clusters, type = "tna")
rowSums(cs$macro$weights) # all 1 (TNA probabilities)
#> Alpha Beta Gamma
#> 1 1 1
mat <- matrix(c(0.5, 0.2, 0.3, 0.1, 0.6, 0.3, 0.4, 0.1, 0.5), 3, 3,
byrow = TRUE,
dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
csum(mat, list(G1 = c("A", "B"), G2 = c("C")))
#> Cluster Summary
#> ---------------
#> Type: tna
#> Method: sum
#> Clusters: 2
#> Nodes: 3
#> Cluster sizes: 2, 1
#>
#> Macro (cluster-level) weights (2x2):
#> Inits: 0.633, 0.367
#> G1 G2
#> G1 0.7 0.3
#> G2 0.5 0.5
#>
#> Per-cluster weights:
#> G1 (2 nodes)
#> G2 (1 nodes)
