Builds a Multi-Cluster Multi-Level (MCML) model from raw transition data
(edge lists or sequences) by recoding node labels to cluster labels and
counting actual transitions. Unlike cluster_summary which
aggregates a pre-computed weight matrix, this function works from the
original transition data to produce the TRUE Markov chain over cluster states.
Arguments
- x
Input data. Accepts multiple formats:
- data.frame with from/to columns
Edge list. Columns named from/source/src/v1/node1/i and to/target/tgt/v2/node2/j are auto-detected. Optional weight column (weight/w/value/strength).
- data.frame without from/to columns
Sequence data. Each row is a sequence, columns are time steps. Consecutive pairs (t, t+1) become transitions.
- tna object
If
x$datais non-NULL, uses sequence path on the raw data. Otherwise falls back tocluster_summary.- cograph_network
If
x$datais non-NULL, detects edge list vs sequence data. Otherwise falls back tocluster_summary.- cluster_summary
Returns as-is.
- square numeric matrix
Falls back to
cluster_summary.- non-square or character matrix
Treated as sequence data.
- clusters
Cluster/group assignments. Accepts:
- named list
Direct mapping. List names = cluster names, values = character vectors of node labels. Example:
list(A = c("N1","N2"), B = c("N3","N4"))- data.frame
A data frame where the first column contains node names and the second column contains group/cluster names. Example:
data.frame(node = c("N1","N2","N3"), group = c("A","A","B"))- membership vector
Character or numeric vector. Node names are extracted from the data. Example:
c("A","A","B","B")- column name string
For edge list data.frames, the name of a column containing cluster labels. The mapping is built from unique (node, group) pairs in both from and to columns.
- NULL
Auto-detect from
cograph_network$nodesor$node_groups(same logic ascluster_summary).
- method
Aggregation method for combining edge weights: "sum", "mean", "median", "max", "min", "density", "geomean". Default "sum".
- type
Post-processing: "tna" (row-normalize), "cooccurrence" (symmetrize), "semi_markov", or "raw". Default "tna".
- directed
Logical. Treat as directed network? Default TRUE.
- compute_within
Logical. Compute within-cluster matrices? Default TRUE.
Value
A cluster_summary object with meta$source = "transitions",
fully compatible with plot_mcml, as_tna, and
splot.
See also
cluster_summary for matrix-based aggregation,
as_tna to convert to tna objects,
plot_mcml for visualization
Examples
# Edge list with clusters
edges <- data.frame(
from = c("A", "A", "B", "C", "C", "D"),
to = c("B", "C", "A", "D", "D", "A"),
weight = c(1, 2, 1, 3, 1, 2)
)
clusters <- list(G1 = c("A", "B"), G2 = c("C", "D"))
cs <- build_mcml(edges, clusters)
cs$macro$weights
#> G1 G2
#> G1 0.5000000 0.5000000
#> G2 0.3333333 0.6666667
# Sequence data with clusters
seqs <- data.frame(
T1 = c("A", "C", "B"),
T2 = c("B", "D", "A"),
T3 = c("C", "C", "D"),
T4 = c("D", "A", "C")
)
cs <- build_mcml(seqs, clusters, type = "raw")
cs$macro$weights
#> G1 G2
#> G1 2 2
#> G2 1 4
