Transition Network Analysis (TNA) Tutorial¶


Introduction¶

Transition Network Analysis (TNA) represents a novel methodological approach that captures the temporal and relational dynamics of unfolding processes. The core principle involves representing transition matrices between events as graphs, enabling researchers to leverage graph theory and network analysis comprehensively.

TNA functions as a sophisticated combination of process mining and network analysis. Where process mining typically generates sequential maps, TNA represents these through network analysis — but with considerably greater analytical depth. The method applies network analysis to capture structure, time, and relationships holistically. Compared to traditional process mining models, TNA incorporates network measures at node, edge, and graph levels, revealing which events hold importance through centrality measures, which transitions prove central, and which processes demonstrate greater connectivity. The method extends beyond standard network analysis by clustering sub-networks into different network constellations representing typical temporal event patterns — often called tactics.

A distinctive innovation involves statistical validation techniques unavailable in conventional approaches. These include edge verification through bootstrapping, network comparison via permutation testing, and centrality verification through case-dropping methods. These statistical techniques introduce rigor and validation at each analytical step, enabling researchers to verify which edges demonstrate replicability and confirm that inferences remain valid rather than chance artifacts.

Why TNA?¶

Learning operates as a complex dynamic system — a collection of interconnected components interacting across time where interactions can enhance, impede, amplify, or reinforce each other. These dynamic interactions generate emergent behaviors that resist full understanding through analyzing individual components in isolation. Such interactions frequently produce processes exceeding the simple sum of their parts, exhibiting non-linear dynamics.

For example, motivation catalyzes achievement, which subsequently catalyzes enhanced engagement, enjoyment, and motivation. These interdependencies, feedback loops, and non-linear dynamics create inherent complexity requiring modeling methods transcending traditional linear approaches. TNA, functioning as a dynamic probabilistic model, addresses these limitations by capturing uncertainties through directional probabilities between learning events. The method accommodates the non-linear, evolving character of learning processes while capturing the constellations and emergent patterns defining or shaping learning processes.

The Building Blocks of TNA¶

TNA's foundational elements are the transitions between events that make up a process. A transition represents a conditional relationship between one occurrence and another — from A to B (a contingency). TNA models transitions in sequential data to compute transition probabilities between events. The resulting transition matrix becomes a weighted directed network where weights represent transition probabilities and edge direction indicates the direction of the transition.

  • Nodes (V) represent different learning events — watching videos, taking quizzes, submitting assignments — or alternatively, states, dialogue moves, collaborative roles, motivation states, or any event representable as sequence units.
  • Edges (E) represent transitions between activities, displaying direction from one activity to the next.
  • Weights (W) represent transitioning probabilities between events or states.
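The mapping from sequences to a weighted network can be sketched with plain pandas, independently of the tna package; the toy sequence below is hypothetical:

```python
import pandas as pd

# Toy event sequence; each consecutive pair (A -> B) is one observed transition
seq = ["plan", "discuss", "consensus", "plan", "discuss", "plan", "consensus"]

# Count transitions between consecutive events
pairs = pd.DataFrame({"from": seq[:-1], "to": seq[1:]})
counts = pd.crosstab(pairs["from"], pairs["to"])

# Row-normalize the counts: each row becomes a probability distribution
probs = counts.div(counts.sum(axis=1), axis=0).fillna(0.0)
print(probs.round(2))
```

The resulting row-stochastic matrix is exactly the kind of weighted directed network TNA analyzes: entry (i, j) is the probability of moving from state i to state j.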

This tutorial demonstrates the complete TNA workflow using the Python tna package — from data preparation through model building, visualization, pruning, pattern detection, centrality analysis, community detection, bootstrapping, and group comparison. This tutorial replicates the R TNA tutorial by Saqr & Lopez-Pernas (2025) using the Python implementation.

1. Installation & Setup¶

TNA can analyze any sequence-representable data with transitions or changes across time — learning event sequences, states, phases, roles, dialogue moves, or interactions. This data can originate from time-stamped learning management system data, coded interaction data, event-log data, or ordered event data.

Install the tna package and import the required libraries:

In [1]:
# Install tna package (uncomment for Google Colab)
# !pip install tna-py

import tna
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

%matplotlib inline
plt.rcParams['figure.dpi'] = 150

print(f"TNA version: {tna.__version__}")
TNA version: 0.1.0

2. Getting Started with Long-Format Data¶

TNA works with sequential event data. The tna package accepts sequence data in several formats: a wide DataFrame where rows represent sequences and columns represent timepoints, a transition matrix, or long-format event data that gets reshaped using prepare_data().

The built-in dataset contains coded collaborative regulation behaviors from learning sessions, with columns for action, actor, and time. Let's start by loading the long-format dataset:

In [2]:
# Load the built-in dataset of coded collaborative regulation behaviors
group_regulation_long = tna.load_group_regulation_long()
print(f"Shape: {group_regulation_long.shape}")
group_regulation_long.head(10)
Shape: (27533, 6)
Out[2]:
Actor Achiever Group Course Time Action
0 1 High 1.0 A 2025-01-01 08:27:07.712698221 cohesion
1 1 High 1.0 A 2025-01-01 08:35:20.712698221 consensus
2 1 High 1.0 A 2025-01-01 08:42:18.712698221 discuss
3 1 High 1.0 A 2025-01-01 08:50:00.712698221 synthesis
4 1 High 1.0 A 2025-01-01 08:52:25.712698221 adapt
5 1 High 1.0 A 2025-01-01 08:57:31.712698221 consensus
6 1 High 1.0 A 2025-01-01 08:58:04.712698221 plan
7 1 High 1.0 A 2025-01-01 09:05:00.712698221 consensus
8 2 High 1.0 A 2025-01-01 08:27:33.712698221 plan
9 2 High 1.0 A 2025-01-01 08:33:45.712698221 emotion

Each row is a single event with columns:

  • Action: The behavioral state (becomes a network node)
  • Actor: Participant ID (one sequence per actor)
  • Time: Timestamp (for ordering and session splitting)
  • Achiever: Achievement group (High/Low, used later for group comparison)
  • Group: Group identifier
  • Course: Course identifier

3. Understanding prepare_data()¶

The prepare_data() function converts long-format event logs into sequences suitable for TNA. It handles session splitting (based on time gaps), ordering, and reshaping. To generate individual sequences for each actor, you must specify both the actor and action columns.

When timestamps are provided via the time column, events happening less than 15 minutes apart are grouped in the same sequence, while events occurring after a longer gap mark the start of a new sequence (session). You can customize this gap using the time_threshold argument (in minutes).
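The session-splitting rule above can be sketched with plain pandas (this mirrors the gap logic conceptually; it is not the prepare_data() implementation, and the timestamps are hypothetical):

```python
import pandas as pd

# Hypothetical timestamps for one actor; a gap over 15 minutes starts a new session
times = pd.to_datetime([
    "2025-01-01 08:00", "2025-01-01 08:05", "2025-01-01 08:10",
    "2025-01-01 09:00", "2025-01-01 09:03",
])

# A new session begins wherever the gap to the previous event exceeds the threshold
new_session = times.to_series().diff() > pd.Timedelta(minutes=15)
session_id = new_session.cumsum()
print(session_id.tolist())  # [0, 0, 0, 1, 1]
```

Here the 50-minute gap before 09:00 splits the five events into two sessions, matching the default time_threshold behavior described above.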

An important advantage of using prepare_data() prior to constructing the TNA model is that you get to keep other variables of the data (metadata) and use them in your analysis. For instance, you can use group_tna() to create a TNA model by achievement group by passing the result of prepare_data() and indicating the name of the grouping column.

In [3]:
# Convert long-format event log into sequences for TNA
prepared_data = tna.prepare_data(
    group_regulation_long,
    action="Action",   # column with behavioral states (become network nodes)
    actor="Actor",     # column with participant IDs (one sequence per actor)
    time="Time"        # column with timestamps (for ordering and session splitting)
)
prepared_data
Out[3]:
TNAData(sessions=2000, actions=9, actors=2000)
In [4]:
# View the wide-format sequence data (rows = sequences, columns = positions)
print("Sequence data shape:", prepared_data.sequence_data.shape)
prepared_data.sequence_data.head()
Sequence data shape: (2000, 26)
Out[4]:
action_1 action_2 action_3 action_4 action_5 action_6 action_7 action_8 action_9 action_10 ... action_17 action_18 action_19 action_20 action_21 action_22 action_23 action_24 action_25 action_26
.session_id
1000_1 discuss discuss consensus plan cohesion consensus discuss consensus plan plan ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1001_1 cohesion consensus plan plan monitor plan consensus discuss consensus plan ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1002_1 discuss adapt cohesion consensus discuss emotion cohesion coregulate discuss discuss ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1003_1 discuss emotion cohesion consensus coregulate coregulate plan plan consensus coregulate ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1004_1 cohesion plan consensus plan consensus discuss discuss synthesis consensus discuss ... cohesion coregulate consensus consensus coregulate discuss NaN NaN NaN NaN

5 rows × 26 columns

In [5]:
# View the preserved metadata (e.g., Achiever group) for each sequence
prepared_data.meta_data.head()
Out[5]:
Actor Achiever Group Course Time
.session_id
1000_1 1000 High 100.0 B 2025-01-01 09:12:00.562642574
1001_1 1001 Low 101.0 B 2025-01-01 09:18:40.756721020
1002_1 1002 Low 101.0 B 2025-01-01 09:18:53.756721020
1003_1 1003 Low 101.0 B 2025-01-01 09:18:05.756721020
1004_1 1004 Low 101.0 B 2025-01-01 09:22:26.756721020

Alternative Input Formats¶

In addition to long-format data processed via prepare_data(), TNA models can be built directly from:

  • Wide-format data: A DataFrame where each row is a sequence and each column represents a time step. This is the most straightforward format when sequences are already aligned.
  • Pre-computed transition matrices: A square DataFrame or NumPy array where entry (i, j) represents the transition probability or frequency from state i to state j.

These alternative inputs provide flexibility for researchers who already have their data in processed formats:

In [6]:
# Wide-format data (rows = sequences, columns = time steps)
group_regulation = tna.load_group_regulation()
print("Wide-format shape:", group_regulation.shape)
group_regulation.head()
Wide-format shape: (2000, 26)
Out[6]:
T1 T2 T3 T4 T5 T6 T7 T8 T9 T10 ... T17 T18 T19 T20 T21 T22 T23 T24 T25 T26
0 cohesion consensus discuss synthesis adapt consensus plan consensus NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 plan emotion consensus discuss synthesis adapt emotion NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 consensus coregulate monitor consensus plan emotion consensus monitor consensus coregulate ... plan plan consensus monitor consensus plan emotion plan consensus discuss
3 monitor emotion plan discuss synthesis consensus discuss cohesion consensus plan ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 discuss emotion cohesion NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 26 columns

In [7]:
# Pre-computed transition matrix
mat = np.array([
    [0.1, 0.6, 0.3],
    [0.4, 0.2, 0.4],
    [0.3, 0.3, 0.4]
])
labels = ["A", "B", "C"]
model_from_matrix = tna.tna(pd.DataFrame(mat, index=labels, columns=labels))
print(model_from_matrix)
TNA Model
  Type: relative
  States: ['A', 'B', 'C']
  Scaling: none

Transition Matrix:
     A    B    C
A  0.1  0.6  0.3
B  0.4  0.2  0.4
C  0.3  0.3  0.4

Initial Probabilities:
       prob
A  0.333333
B  0.333333
C  0.333333

Importing One-Hot Encoded Data¶

Some datasets encode states as binary (0/1) indicator columns rather than categorical labels — for example, coded observation data where each column indicates whether a particular behavior was present in each time interval. The import_onehot() function converts this one-hot format into wide-format sequence data suitable for tna().

The function supports windowing to group multiple time intervals together:

  • window_size: Number of rows per window (default: 1, each row becomes one time step)
  • window_type: 'tumbling' (non-overlapping chunks) or 'sliding' (step-by-1 overlap)
  • aggregate: If True, collapse each window to the first active state per column (reduces width)

When actor or session columns are provided, windowing is applied within each group, producing one row per actor/session with all windows concatenated.
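The difference between the two window types can be illustrated with plain Python (a conceptual sketch, not the import_onehot() internals):

```python
# Tumbling vs. sliding windows over a toy sequence of 6 time intervals
seq = list(range(6))
window_size = 2

# Tumbling: non-overlapping chunks
tumbling = [seq[i:i + window_size] for i in range(0, len(seq), window_size)]

# Sliding: step-by-1 overlapping windows
sliding = [seq[i:i + window_size] for i in range(len(seq) - window_size + 1)]

print(tumbling)  # [[0, 1], [2, 3], [4, 5]]
print(sliding)   # [[0, 1], [1, 2], [2, 3], [3, 4], [4, 5]]
```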

In [8]:
# Create example one-hot encoded data (e.g., coded classroom observations)
onehot_data = pd.DataFrame({
    "actor": ["s1"] * 6 + ["s2"] * 6,
    "Reading":  [1, 0, 0, 1, 0, 0,  0, 1, 0, 0, 1, 0],
    "Writing":  [0, 1, 0, 0, 1, 0,  1, 0, 0, 1, 0, 0],
    "Discuss":  [0, 0, 1, 0, 0, 1,  0, 0, 1, 0, 0, 1],
})
print("One-hot input:")
print(onehot_data)

# Convert to wide-format sequences (one row per actor)
states = ["Reading", "Writing", "Discuss"]
wide_seq = tna.import_onehot(onehot_data, cols=states, actor="actor")
print("\nWide-format output:")
wide_seq
One-hot input:
   actor  Reading  Writing  Discuss
0     s1        1        0        0
1     s1        0        1        0
2     s1        0        0        1
3     s1        1        0        0
4     s1        0        1        0
5     s1        0        0        1
6     s2        0        1        0
7     s2        1        0        0
8     s2        0        0        1
9     s2        0        1        0
10    s2        1        0        0
11    s2        0        0        1

Wide-format output:
Out[8]:
W0_T1 W0_T2 W0_T3 W1_T1 W1_T2 W1_T3 W2_T1 W2_T2 W2_T3 W3_T1 W3_T2 W3_T3 W4_T1 W4_T2 W4_T3 W5_T1 W5_T2 W5_T3
0 Reading NaN NaN NaN Writing NaN NaN NaN Discuss Reading NaN NaN NaN Writing NaN NaN NaN Discuss
1 NaN Writing NaN Reading NaN NaN NaN NaN Discuss NaN Writing NaN Reading NaN NaN NaN NaN Discuss

4. Building the TNA Model¶

TNA analysis begins with building the primary TNA object (the model), which contains all information necessary for further analysis — plotting, centrality estimation, or comparison. Models are estimated with the tna() function, which fits a Markov model to the data: initial probabilities derive directly from the observed first states of the sequences, and transition probabilities derive from the observed transition frequencies.

The resulting model contains:

  • Initial Probabilities (inits): Define the likelihood of starting in a particular state at the beginning of the process (the first time point, before transitions). In educational contexts, initial probability represents the probability that students begin in specific states (such as "engaged" or "motivated") before activities or interventions occur. These probabilities provide a process snapshot showing student starting positions.

  • Transition Probabilities (weights): Describe state-to-state movement likelihoods at each process step. Transition probabilities capture how students transition, move, or follow between different learning states or events. Each row of the transition matrix sums to 1, representing a complete probability distribution over next states.

  • Labels (labels): Provide descriptive network node names, enhancing analysis interpretability. Labels automatically derive from the data categories.

  • Data (data): The sequence data used to build the model, stored internally for further analysis (permutation testing, bootstrapping, etc.).
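As a minimal illustration of how initial probabilities arise from wide-format data (plain pandas, with hypothetical sequences, not the tna() implementation):

```python
import pandas as pd

# Hypothetical wide-format sequences (rows = sequences, columns = time steps)
seqs = pd.DataFrame([
    ["plan", "discuss", "consensus"],
    ["discuss", "plan", None],
    ["plan", "consensus", "plan"],
], columns=["T1", "T2", "T3"])

# Initial probabilities: the distribution of first states across sequences
inits = seqs["T1"].value_counts(normalize=True)
print(inits)
```

Two of the three sequences start with "plan", so its initial probability is 2/3 — the same logic that produces the inits vector of the fitted model.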

In [9]:
# Build the TNA model from the prepared sequence data
model = tna.tna(prepared_data)
print(model)
TNA Model
  Type: relative
  States: ['adapt', 'cohesion', 'consensus', 'coregulate', 'discuss', 'emotion', 'monitor', 'plan', 'synthesis']
  Scaling: none

Transition Matrix:
               adapt  cohesion  consensus  coregulate   discuss   emotion   monitor      plan  synthesis
adapt       0.000000  0.273084   0.477407    0.021611  0.058939  0.119843  0.033399  0.015717   0.000000
cohesion    0.002950  0.027139   0.497935    0.119174  0.059587  0.115634  0.033038  0.141003   0.003540
consensus   0.004740  0.014852   0.082003    0.187707  0.188023  0.072681  0.046611  0.395797   0.007584
coregulate  0.016244  0.036041   0.134518    0.023350  0.273604  0.172081  0.086294  0.239086   0.018782
discuss     0.071374  0.047583   0.321185    0.084282  0.194887  0.105796  0.022273  0.011643   0.140977
emotion     0.002467  0.325344   0.320409    0.034191  0.101868  0.076842  0.036306  0.099753   0.002820
monitor     0.011165  0.055827   0.159107    0.057920  0.375436  0.090719  0.018144  0.215632   0.016050
plan        0.000975  0.025175   0.290401    0.017216  0.067890  0.146825  0.075524  0.374208   0.001787
synthesis   0.234663  0.033742   0.466258    0.044479  0.062883  0.070552  0.012270  0.075153   0.000000

Initial Probabilities:
              prob
adapt       0.0115
cohesion    0.0605
consensus   0.2140
coregulate  0.0190
discuss     0.1755
emotion     0.1515
monitor     0.1440
plan        0.2045
synthesis   0.0195
In [10]:
# Inspect the transition probability matrix
weights_df = model.to_dataframe()
weights_df.round(3)
Out[10]:
adapt cohesion consensus coregulate discuss emotion monitor plan synthesis
adapt 0.000 0.273 0.477 0.022 0.059 0.120 0.033 0.016 0.000
cohesion 0.003 0.027 0.498 0.119 0.060 0.116 0.033 0.141 0.004
consensus 0.005 0.015 0.082 0.188 0.188 0.073 0.047 0.396 0.008
coregulate 0.016 0.036 0.135 0.023 0.274 0.172 0.086 0.239 0.019
discuss 0.071 0.048 0.321 0.084 0.195 0.106 0.022 0.012 0.141
emotion 0.002 0.325 0.320 0.034 0.102 0.077 0.036 0.100 0.003
monitor 0.011 0.056 0.159 0.058 0.375 0.091 0.018 0.216 0.016
plan 0.001 0.025 0.290 0.017 0.068 0.147 0.076 0.374 0.002
synthesis 0.235 0.034 0.466 0.044 0.063 0.071 0.012 0.075 0.000
In [11]:
# Inspect initial probabilities
init_df = pd.Series(model.inits, index=model.labels, name="Initial Probability")
init_df.round(3)
Out[11]:
adapt         0.012
cohesion      0.060
consensus     0.214
coregulate    0.019
discuss       0.176
emotion       0.152
monitor       0.144
plan          0.204
synthesis     0.020
Name: Initial Probability, dtype: float64
In [12]:
# Model summary
model.summary()
Out[12]:
{'n_states': 9,
 'type': 'relative',
 'scaling': [],
 'n_edges': np.int64(78),
 'density': np.float64(0.9629629629629629),
 'mean_weight': np.float64(0.1153846153846154),
 'max_weight': np.float64(0.49793510324483775),
 'has_self_loops': np.True_}

5. Visualizations¶

TNA model visualization offers a bird's-eye view of learning processes, capturing the overall structure, how events connect, which patterns matter, and how events relate over time. TNA provides powerful visualization features with several enhancements for comparing and exploring networks.

5.1 Transition Network Plot¶

The network plot represents a directed weighted network where each node (state, event, or learning activity) appears as a colored circle. Node-to-node arrows represent weighted transition probabilities with direction showing transition routes. Loops represent identical state repetition probabilities. Edge width and opacity reflect transition probability — thicker, more opaque edges indicate stronger transitions.

The plot_network() function provides two key parameters for managing visual complexity:

  • minimum: Hides edges below this weight entirely, removing visual clutter. Note that these small probabilities remain in the model for all subsequent computations — this is purely a visual filter.
  • cut: Fades edges below this weight (reduced opacity) but still shows them, allowing researchers to see the full network while emphasizing stronger transitions.
In [13]:
# minimum: hide edges below 0.05; cut: fade edges below 0.1
tna.plot_network(model, minimum=0.05, cut=0.1)
plt.show()
[Figure: transition network plot with minimum=0.05 and cut=0.1]

5.2 Histogram of Edge Weights¶

Examining the distribution of transition probabilities helps researchers understand the overall structure of the network — whether transitions are uniformly distributed or concentrated among a few strong connections. This informs decisions about pruning thresholds and helps identify the natural "backbone" of the network:

In [14]:
tna.plot_histogram(model)
plt.show()
[Figure: histogram of edge weights]

5.3 Frequency Distribution of States¶

The frequency distribution shows how often each state appears across all sequences. This helps identify which states dominate the process overall and provides context for interpreting the transition network:

In [15]:
# Bar chart of how often each state appears across all sequences
tna.plot_frequencies(model)
plt.show()
[Figure: bar chart of state frequencies]

5.4 Mosaic Plot¶

The mosaic (marimekko) plot visualizes the transition matrix as a contingency table. Tile widths are proportional to column totals (incoming transitions) and tile heights are proportional to row proportions (outgoing transitions). Colors represent adjusted standardized residuals from a chi-squared test — blue tiles indicate more transitions than expected, red tiles indicate fewer. This requires a frequency model built with ftna():

In [16]:
# Build frequency model and plot mosaic
fmodel = tna.ftna(prepared_data)
tna.plot_mosaic(fmodel)
plt.show()
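The residual logic behind the mosaic colors can be sketched with plain NumPy (a conceptual sketch with a hypothetical toy frequency table, not plot_mosaic()'s internals):

```python
import numpy as np

# Toy transition frequency table (rows = source state, columns = target state)
obs = np.array([
    [10., 30., 20.],
    [25., 5., 10.],
    [15., 20., 15.],
])
total = obs.sum()
row = obs.sum(axis=1, keepdims=True)
col = obs.sum(axis=0, keepdims=True)
expected = row * col / total  # expected counts under independence

# Adjusted standardized residuals: positive = more transitions than expected
resid = (obs - expected) / np.sqrt(expected * (1 - row / total) * (1 - col / total))
print(resid.round(2))
```

A large positive residual (a blue tile) marks a transition that occurs more often than the chi-squared independence model predicts; a large negative one (red) marks a rarer-than-expected transition.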

6. Pruning¶

Transition networks commonly appear fully connected or saturated — nearly every node connects to every other node with some probability. Mechanisms are therefore needed to retrieve the network's core or backbone structure by making it sparse. Sparsity enhances interpretability by removing overly complex structure, making important components and relationships easier to identify. It also separates signal from noise, removing small noisy edges that obscure meaningful patterns and allowing researchers to focus on important interactions.

While researchers can use the minimum argument in plot_network() to visually hide small edges, those small probabilities remain in the model for all subsequent computations. Researchers who want to actually remove negligible-weight edges from the model can use the prune() function, which retains only strong, meaningful connections.

The prune() function implements threshold-based pruning: edges below a specified threshold value are set to zero (default threshold is 0.05). This provides a clean model where only meaningful transitions remain for downstream analysis.
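The underlying operation is simple to sketch with NumPy (a conceptual sketch on a hypothetical matrix, not the prune() implementation):

```python
import numpy as np

# Toy transition matrix; threshold-based pruning zeroes edges below the cutoff
W = np.array([
    [0.00, 0.60, 0.03],
    [0.40, 0.02, 0.12],
    [0.04, 0.30, 0.00],
])
pruned = np.where(W >= 0.05, W, 0.0)
print(int((pruned > 0).sum()))  # 4 edges survive the 0.05 threshold
```

Unlike the minimum argument of plot_network(), which only hides edges visually, this zeroing actually removes the weak edges from all downstream computations.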

Pruning with TNA can also be accomplished through bootstrapping (demonstrated in the bootstrapping section below), which offers a statistically grounded approach to identifying and eliminating small and uncertain edges.

In [17]:
# Prune: remove edges with weight below 0.05
pruned = tna.prune(model, threshold=0.05)

print(f"Original edges: {model.summary()['n_edges']}")
print(f"Pruned edges:   {pruned.summary()['n_edges']}")
Original edges: 78
Pruned edges:   46
In [18]:
# Plot the pruned network
tna.plot_network(pruned, cut=0.1)
plt.show()
[Figure: pruned transition network]

7. Patterns: Cliques¶

Patterns help researchers understand behavior, identify significant structures, and describe processes in detail. Patterns form the fundamental building blocks of the structure and dynamics of learning processes. They furnish insights into behavior and learner strategies during studying or interaction with learning materials. Furthermore, capturing repeated, consistent patterns enables theory building and generalizable inferences.

TNA supports identifying several types of n-clique patterns. Network cliques are subsets of graph nodes where every pair of nodes connects directly through edges. In network terms, cliques represent tightly knit communities, closely related entities, or interdependent nodes that shape how learning unfolds.

The cliques() function identifies n-cliques from TNA models. Its arguments include:

  • size: The clique size to search for (size=2 finds dyads, size=3 finds triads, etc.)
  • threshold: The minimum edge weight required for an edge to participate in a clique

Dyads represent TNA's simplest patterns — transitions between two nodes. Mutual dyads (bidirectional) with high edge weights indicate strong interdependence through recurrent occurrence. For instance, consistently moving from reading materials to quiz-taking indicates strong self-evaluative strategies.

Triads capture more complex three-node relationships. In TNA, three-node cliques where each connects to the others in either direction indicate strong interdependent node subgroups forming a process core. Triads represent higher-order learning behavior dependencies.

We search for cliques of size 2, 3, and 4 with decreasing thresholds (larger cliques are rarer, so lower thresholds are needed):
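The dyad case can be sketched with plain NumPy (a conceptual sketch on a hypothetical 3-state matrix, not the cliques() implementation): a size-2 clique requires both directed edges between a pair of states to exceed the threshold.

```python
import numpy as np
from itertools import combinations

labels = ["A", "B", "C"]
W = np.array([
    [0.00, 0.30, 0.02],
    [0.20, 0.00, 0.40],
    [0.01, 0.15, 0.00],
])
threshold = 0.1

# A dyad requires both directed edges between a pair to exceed the threshold
dyads = [
    (labels[i], labels[j])
    for i, j in combinations(range(len(labels)), 2)
    if W[i, j] >= threshold and W[j, i] >= threshold
]
print(dyads)  # [('A', 'B'), ('B', 'C')]
```

A and C are not a dyad here because the edges between them (0.02 and 0.01) fall below the threshold, even though each node participates in another dyad.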

In [19]:
# Find cliques of size 2, 3, and 4 with decreasing thresholds
cliques_of_two   = tna.cliques(model, size=2, threshold=0.1)   # dyads
cliques_of_three = tna.cliques(model, size=3, threshold=0.05)  # triads
cliques_of_four  = tna.cliques(model, size=4, threshold=0.03)  # quads
In [20]:
print(cliques_of_two)
Cliques of size 2 (threshold=0.1)
Number of cliques found: 5

Clique 1: cohesion, emotion
  cohesion: 0.027  0.116
  emotion: 0.325  0.077

Clique 2: consensus, coregulate
  consensus: 0.082  0.188
  coregulate: 0.135  0.023

Clique 3: consensus, discuss
  consensus: 0.082  0.188
  discuss: 0.321  0.195

Clique 4: consensus, plan
  consensus: 0.082  0.396
  plan: 0.290  0.374

Clique 5: discuss, emotion
  discuss: 0.195  0.106
  emotion: 0.102  0.077

In [21]:
print(cliques_of_three)
Cliques of size 3 (threshold=0.05)
Number of cliques found: 3

Clique 1: consensus, coregulate, discuss
  consensus: 0.082  0.188  0.188
  coregulate: 0.135  0.023  0.274
  discuss: 0.321  0.084  0.195

Clique 2: consensus, discuss, emotion
  consensus: 0.082  0.188  0.073
  discuss: 0.321  0.195  0.106
  emotion: 0.320  0.102  0.077

Clique 3: consensus, emotion, plan
  consensus: 0.082  0.073  0.396
  emotion: 0.320  0.077  0.100
  plan: 0.290  0.147  0.374

In [22]:
print(cliques_of_four)
Cliques of size 4 (threshold=0.03)
Number of cliques found: 5

Clique 1: cohesion, coregulate, discuss, emotion
  cohesion: 0.027  0.119  0.060  0.116
  coregulate: 0.036  0.023  0.274  0.172
  discuss: 0.048  0.084  0.195  0.106
  emotion: 0.325  0.034  0.102  0.077

Clique 2: cohesion, coregulate, emotion, monitor
  cohesion: 0.027  0.119  0.116  0.033
  coregulate: 0.036  0.023  0.172  0.086
  emotion: 0.325  0.034  0.077  0.036
  monitor: 0.056  0.058  0.091  0.018

Clique 3: consensus, coregulate, discuss, emotion
  consensus: 0.082  0.188  0.188  0.073
  coregulate: 0.135  0.023  0.274  0.172
  discuss: 0.321  0.084  0.195  0.106
  emotion: 0.320  0.034  0.102  0.077

Clique 4: consensus, coregulate, emotion, monitor
  consensus: 0.082  0.188  0.073  0.047
  coregulate: 0.135  0.023  0.172  0.086
  emotion: 0.320  0.034  0.077  0.036
  monitor: 0.159  0.058  0.091  0.018

Clique 5: consensus, emotion, monitor, plan
  consensus: 0.082  0.073  0.047  0.396
  emotion: 0.320  0.077  0.036  0.100
  monitor: 0.159  0.091  0.018  0.216
  plan: 0.290  0.147  0.076  0.374

8. Centralities¶

Centrality measures quantify the role or importance of states or events in processes. With centrality measures, researchers can rank events by their value in bridging interactions (betweenness centrality) or receiving the most transitions (in-strength centrality). Centrality measures reveal which behaviors or cognitive states prove central to learning processes — as frequent transition destinations, starting points for various actions, bridges between learning activities, or keys to spreading phenomena. Using centrality measures, researchers can identify important events to target for intervention or improvement.

Importantly, raw or absolute centrality measure values lack inherent meaning in TNA. Relative values matter instead, allowing node ranking and relative importance identification within networks.

8.1 Node-Level Centrality Measures¶

The centralities() function computes centrality measures using directed probabilistic process algorithms. By default, it removes loops from calculations (changeable via loops=True). Removing loops means all centrality computations proceed without considering self-transitioning or same-state repetition.

Available measures include:

  • OutStrength / InStrength: Sum of outgoing/incoming transition probabilities. In networks where self-loops are removed, out-strength reflects how readily a state is left — higher values indicate a greater likelihood of transitioning away (i.e., lower state stability).
  • Closeness / InCloseness: How quickly a state can reach (or be reached from) all other states.
  • Betweenness: How often a state lies on shortest paths between other states, measuring its bridging role.
  • BetweennessRSP: Betweenness based on randomized shortest paths — more appropriate for probabilistic networks.
  • Diffusion: Measures how efficiently information or influence spreads from a state.
  • Clustering: Local clustering coefficient reflecting the interconnectedness of a state's neighbors.
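The strength measures are straightforward to sketch with NumPy (a conceptual sketch using the small example matrix from earlier, not the centralities() implementation):

```python
import numpy as np

# Toy transition matrix (the 3-state example used earlier in the tutorial)
W = np.array([
    [0.1, 0.6, 0.3],
    [0.4, 0.2, 0.4],
    [0.3, 0.3, 0.4],
])
np.fill_diagonal(W, 0.0)  # drop self-loops, mirroring the default loops behavior

out_strength = W.sum(axis=1)  # total probability of leaving each state
in_strength = W.sum(axis=0)   # total probability of entering each state
print(out_strength)  # [0.9 0.8 0.6]
print(in_strength)   # [0.7 0.9 0.7]
```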
In [23]:
# Compute all centrality measures for each state
centrality_df = tna.centralities(model)
centrality_df.round(4)
Out[23]:
OutStrength InStrength ClosenessIn ClosenessOut Closeness Betweenness BetweennessRSP Diffusion Clustering
adapt 1.0000 0.3446 0.0083 0.0152 0.0248 1.0 1.0 5.5863 0.3370
cohesion 0.9729 0.8116 0.0138 0.0124 0.0265 0.0 19.0 5.2086 0.2996
consensus 0.9180 2.6672 0.0351 0.0125 0.0383 30.0 103.0 4.6597 0.1608
coregulate 0.9766 0.5666 0.0155 0.0150 0.0210 0.0 27.0 5.1479 0.3058
discuss 0.8051 1.1882 0.0196 0.0131 0.0271 16.0 53.0 4.6276 0.2397
emotion 0.9232 0.8941 0.0141 0.0121 0.0231 5.0 36.0 5.0699 0.2905
monitor 0.9819 0.3457 0.0076 0.0137 0.0193 0.0 11.0 5.1568 0.2889
plan 0.6258 1.1938 0.0274 0.0115 0.0274 9.0 61.0 3.4875 0.2875
synthesis 1.0000 0.1915 0.0100 0.0158 0.0243 7.0 3.0 5.5825 0.3586
In [24]:
# Plot centralities as faceted bar charts
tna.plot_centralities(centrality_df)
plt.show()
[Figure: faceted bar charts of centrality measures]

8.2 Edge-Level Measures: Edge Betweenness¶

In TNA, edge centrality measures quantify the importance of transitions between events — rather than the events themselves — furnishing insights into particular transitions' criticality for process flow. Edge betweenness centrality reflects how frequently a transition bridges other transitions in the network.

Edge centrality measures help researchers understand not only which nodes are important but which transitions guide learning processes. For instance, a transition from "planning" to "task execution" might have high edge betweenness, indicating it serves as a critical bridge in the learning process.

The betweenness_network() function creates a new TNA model where edge weights are replaced with their betweenness centrality values:

In [25]:
# Compute edge betweenness for all transitions
edge_betweenness = tna.betweenness_network(model)

# Show the betweenness values
edge_betweenness.to_dataframe().round(3)
Out[25]:
adapt cohesion consensus coregulate discuss emotion monitor plan synthesis
adapt 0.0 2.0 6.0 0.0 0.0 1.0 0.0 0.0 0.0
cohesion 0.0 0.0 7.0 0.0 0.0 1.0 0.0 0.0 0.0
consensus 0.0 0.0 0.0 8.0 15.0 0.0 0.0 15.0 0.0
coregulate 0.0 0.0 0.0 0.0 4.0 2.0 1.0 1.0 0.0
discuss 0.0 0.0 7.0 0.0 0.0 2.0 0.0 0.0 15.0
emotion 0.0 6.0 7.0 0.0 0.0 0.0 0.0 0.0 0.0
monitor 0.0 0.0 0.0 0.0 5.0 2.0 0.0 1.0 0.0
plan 0.0 0.0 5.0 0.0 0.0 5.0 7.0 0.0 0.0
synthesis 9.0 0.0 6.0 0.0 0.0 0.0 0.0 0.0 0.0
In [26]:
# Plot edge betweenness network
tna.plot_network(edge_betweenness, cut=0.1, title="Edge Betweenness Network")
plt.show()
[Figure: edge betweenness network]

8.3 Centrality Stability¶

Centrality stability assessment determines whether centrality rankings remain consistent when cases are progressively dropped from the data. The estimate_cs() function implements the case-dropping bootstrap approach: it repeatedly drops increasing proportions of cases (10% to 90%) and recalculates centralities, measuring rank-order correlation with the original. The CS coefficient represents the maximum proportion of cases that can be dropped while maintaining a correlation above 0.7 with at least 95% certainty. CS values above 0.5 indicate stable centrality rankings, while values below 0.25 suggest instability:

In [27]:
# Centrality stability: case-dropping bootstrap
cs_result = tna.estimate_cs(model, iter=200, seed=42)
print("CS coefficients:", cs_result.cs_coefficients)
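The case-dropping logic can be sketched independently of tna. The snippet below is illustrative only: it uses state frequencies as a stand-in for centralities and compares rank orders before and after dropping a share of the cases.

```python
import numpy as np

# Illustrative sketch of the case-dropping idea (not tna's implementation):
# drop a share of cases, recompute a per-state statistic, and check the
# rank-order correlation with the full-sample values.
rng = np.random.default_rng(42)
p = np.arange(1, 10) / 45.0                      # 9 states with distinct frequencies
cases = rng.choice(9, size=(500, 20), p=p)       # 500 toy sequences of length 20
full = np.bincount(cases.ravel(), minlength=9)   # stand-in for a centrality vector

def rank_corr(a, b):
    # correlation between the rank orders of two vectors
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]

keep = rng.choice(500, size=450, replace=False)  # drop 10% of the cases
sub = np.bincount(cases[keep].ravel(), minlength=9)
r = rank_corr(full, sub)  # stays near 1 when rankings are stable
```

Repeating this across drop proportions and bootstrap iterations, and recording how far one can go before the correlation falls below 0.7, yields the CS coefficient described above.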

9. Community Detection¶

Communities are groups of nodes that are more closely related or densely interconnected with one another than with the rest of the network. In TNA, communities group states or events that frequently transition between one another or share similar dynamics. They represent cohesive sequences or successions of activities that are likely to co-occur, revealing typical pathways or recurring behaviors.

Unlike cliques, which have fixed or predefined structures (e.g., 2-cliques or 3-cliques), communities are derived from the data based on connectivity patterns, making them more descriptive of real-world structures. Community identification uncovers latent or hidden clusters of related interaction or behavior during learning. Identifying these clusters provides insight into collaboration and learning effectiveness, common regulatory practices, or interaction patterns.

Furthermore, identifying communities of behaviors or events can contribute to theory building and to understanding learning. Communities distill densely connected behaviors into simplified, meaningful structures, suggesting the presence of underlying constructs or behavioral mechanisms.

The communities() function supports several detection algorithms suited for transition networks (typically small, weighted, and directed):

  • Leading Eigenvector (leading_eigen): Uses the leading eigenvector of the modularity matrix to partition nodes. This is the default method.
  • Fast Greedy (fast_greedy): Optimizes modularity by iteratively merging communities.
  • Louvain (louvain): A multi-level modularity optimization algorithm.
  • Label Propagation (label_prop): Each node adopts the most common community among its neighbors.
  • Edge Betweenness (edge_betweenness): Iteratively removes high-betweenness edges to reveal communities.
In [28]:
# Detect communities using the default algorithm (leading eigenvector)
comms = tna.communities(model)
print(comms)
Community Detection Results

  leading_eigen: 2 communities

Assignments:
            leading_eigen
adapt                   0
cohesion                0
consensus               0
coregulate              1
discuss                 1
emotion                 0
monitor                 1
plan                    1
synthesis               0
In [29]:
# Plot communities: nodes colored by community assignment
tna.plot_communities(comms, cut=0.1)
plt.show()
In [30]:
# Try multiple community detection methods
comms_multi = tna.communities(model, methods=["leading_eigen", "louvain", "fast_greedy"])
print(comms_multi)
Community Detection Results

  leading_eigen: 2 communities
  louvain: 2 communities
  fast_greedy: 2 communities

Assignments:
            leading_eigen  louvain  fast_greedy
adapt                   0        0            0
cohesion                0        0            0
consensus               0        0            0
coregulate              1        1            1
discuss                 1        1            1
emotion                 0        0            0
monitor                 1        1            1
plan                    1        0            0
synthesis               0        0            0

10. Bootstrapping¶

10.1 Why Bootstrap?¶

Bootstrapping is a robust validation technique for assessing the accuracy and stability of edge weights and, by extension, entire models. Through bootstrapping, researchers can verify each edge, determine its statistical significance, and obtain confidence intervals for transition probabilities. Most network and process mining research employs descriptive methods; model validation and tests of statistical significance are largely absent from the literature. Validated models enable researchers to assess robustness and reproducibility, ensuring that insights do not arise from chance and are therefore more likely to generalize.

Bootstrapping is a resampling technique that repeatedly draws samples from the original dataset with replacement and estimates a model for each sample (usually hundreds or thousands of times). It requires no strong assumptions about the data distribution, making it suitable for process data, which often does not adhere to any specific distribution. Because sampling is done with replacement, each sample may include multiple copies of some observations while excluding others, which makes it possible to assess the variability of parameter estimates. Edges that appear consistently across most of the estimated models can be considered stable and significant.

Another key advantage of bootstrapping is that it effectively prunes dense networks. A common challenge in probabilistic networks like TNA is that they are completely connected: every possible transition between nodes exists to some degree. Bootstrapping mitigates this by identifying and eliminating small, uncertain edges, effectively retrieving the network backbone. The resulting simplified network is easier to interpret and more likely to generalize.
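The underlying resampling logic can be sketched for a single edge. This is illustrative only: the data are made up, and tna's bootstrap_tna() operates on whole sequence sets rather than one edge in isolation.

```python
import numpy as np

# Illustrative sketch: bootstrap one edge's weight and prune it if its
# confidence interval reaches below a minimum threshold.
rng = np.random.default_rng(265)
edge_weights = rng.beta(2, 5, size=100)  # stand-in: the edge's weight per sequence
n_boot = 1000
boot_means = np.array([
    rng.choice(edge_weights, size=edge_weights.size, replace=True).mean()
    for _ in range(n_boot)
])
ci_lower, ci_upper = np.percentile(boot_means, [2.5, 97.5])
significant = ci_lower > 0.05  # keep the edge only if it is clearly above the threshold
```

Applying this per edge and zeroing the non-significant ones is what produces the pruned "backbone" network described above.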

The bootstrap_tna() function calculates confidence intervals and p-values for each edge weight. It defaults to 1000 bootstrap iterations (via the iter argument). The level argument sets the significance threshold (e.g., 0.05): edges whose bootstrap p-value falls below this level are deemed statistically significant.

In [31]:
# Resample sequences 1000 times and assess edge stability
np.random.seed(265)  # for reproducibility
boot = tna.bootstrap_tna(model, iter=1000, level=0.05, seed=265)

10.2 Results¶

The bootstrap result contains several elements:

  • weights_sig: A matrix showing only statistically significant transitions (non-significant weights set to zero)
  • weights_mean: Mean transition matrix across all bootstrap samples
  • weights_sd: Standard deviation matrix across all bootstrap samples
  • ci_lower / ci_upper: Bootstrap confidence interval bounds for each transition
  • p_values: Bootstrap p-value matrix for each transition

The summary() method returns a convenient DataFrame with all of these statistics per edge:

In [32]:
# Extract the bootstrap summary table
boot_df = boot.summary()
boot_df.head(10)
Out[32]:
from to weight p_value sig cr_lower cr_upper ci_lower ci_upper
0 cohesion adapt 0.002950 0.480519 False 0.002212 0.003687 0.000594 0.005700
1 consensus adapt 0.004740 0.173826 False 0.003555 0.005925 0.003130 0.006352
2 coregulate adapt 0.016244 0.156843 False 0.012183 0.020305 0.010810 0.022272
3 discuss adapt 0.071374 0.000999 True 0.053531 0.089218 0.063041 0.079214
4 emotion adapt 0.002467 0.577423 False 0.001851 0.003084 0.000720 0.004458
5 monitor adapt 0.011165 0.287712 False 0.008374 0.013957 0.006206 0.016417
6 plan adapt 0.000975 0.550450 False 0.000731 0.001218 0.000320 0.001848
7 synthesis adapt 0.234663 0.000999 True 0.175997 0.293328 0.202432 0.267998
8 adapt cohesion 0.273084 0.001998 True 0.204813 0.341356 0.236137 0.310988
9 cohesion cohesion 0.027139 0.108891 False 0.020354 0.033923 0.019429 0.035850
In [33]:
# Keep only edges that survived the bootstrap and sort by weight
sig_edges = boot_df[boot_df["sig"] == True].sort_values("weight", ascending=False)
print(f"{len(sig_edges)} out of {len(boot_df)} edges are significant")
sig_edges.head(15)
51 out of 78 edges are significant
Out[33]:
from to weight p_value sig cr_lower cr_upper ci_lower ci_upper
18 cohesion consensus 0.497935 0.000999 True 0.373451 0.622419 0.475015 0.521243
17 adapt consensus 0.477407 0.000999 True 0.358055 0.596758 0.431948 0.520174
25 synthesis consensus 0.466258 0.000999 True 0.349693 0.582822 0.426488 0.505009
64 consensus plan 0.395797 0.000999 True 0.296848 0.494746 0.383257 0.407910
41 monitor discuss 0.375436 0.000999 True 0.281577 0.469295 0.352214 0.401393
69 plan plan 0.374208 0.000999 True 0.280656 0.467760 0.362409 0.386589
13 emotion cohesion 0.325344 0.000999 True 0.244008 0.406680 0.308078 0.342700
21 discuss consensus 0.321185 0.000999 True 0.240888 0.401481 0.306922 0.336784
22 emotion consensus 0.320409 0.000999 True 0.240307 0.400511 0.302499 0.337725
24 plan consensus 0.290401 0.000999 True 0.217801 0.363001 0.278729 0.301506
38 coregulate discuss 0.273604 0.000999 True 0.205203 0.342005 0.254195 0.291943
8 adapt cohesion 0.273084 0.001998 True 0.204813 0.341356 0.236137 0.310988
65 coregulate plan 0.239086 0.000999 True 0.179315 0.298858 0.218761 0.258977
7 synthesis adapt 0.234663 0.000999 True 0.175997 0.293328 0.202432 0.267998
68 monitor plan 0.215632 0.000999 True 0.161724 0.269539 0.194958 0.236826

10.3 Bootstrapped Network¶

The bootstrapped model (boot.model) contains only statistically significant edges — those that survived the bootstrap validation. Plotting this model shows the validated network backbone, which is more likely to generalize to new data:

In [34]:
# Plot the bootstrapped network (only significant edges)
tna.plot_network(boot.model, cut=0.1, title="Bootstrapped Network (significant edges)")
plt.show()

11. Sequence Plots¶

Sequence plots provide a direct visualization of the raw sequential data before it is aggregated into a transition network. These visualizations help researchers understand the variety and structure of individual sequences.

Two plot types are available:

  • Index plot: Each row represents one sequence, with colors indicating the state at each position. This reveals the diversity and patterns in individual trajectories — whether sequences are highly varied or follow common templates.
  • Distribution plot: Shows the proportion of each state at each sequence position, revealing how the state distribution evolves over time. This helps identify whether certain states dominate at the beginning or end of sequences.
In [35]:
# Each row is one sequence; colors represent states at each position
tna.plot_sequences(prepared_data, max_sequences=200)
plt.show()
In [36]:
# Proportion of each state at each sequence position
tna.plot_sequences(prepared_data, plot_type="distribution")
plt.show()

12. Group Models¶

Researchers frequently encounter predefined conditions such as high versus low achievers, different course types, or gender groups. Comparing such groups has commonly been done visually, by juxtaposing process models or sequence models. While visual comparison may reveal differences, it cannot establish statistical significance: exactly where the differences are significant, and where they are not, remains unclear.

TNA addresses this by enabling rigorous systematic group comparison. The group_tna() function builds separate TNA models for each level of a grouping variable. The metadata preserved by prepare_data() (e.g., the Achiever column) can be used directly as the grouping variable — no manual data splitting needed.

All standard TNA functions (centralities(), prune(), communities(), cliques(), plot_network()) work seamlessly with group models, automatically applying per-group and returning combined results. This enables researchers to examine how transition dynamics differ across subgroups without writing any group-splitting code.

In [37]:
# Build group models directly from the prepared data using the Achiever metadata column
group_model = tna.group_tna(prepared_data, group="Achiever")
print(group_model)
print()

# Summary statistics per group
group_model.summary()
GroupTNA with 2 groups:
  High: 9 states, 76 edges
  Low: 9 states, 75 edges

Out[37]:
n_states type scaling n_edges density mean_weight max_weight has_self_loops
group
High 9 relative [] 76 0.938272 0.118421 0.575540 True
Low 9 relative [] 75 0.925926 0.120000 0.461957 True
In [38]:
# Access individual models using dict-style indexing
print(group_model["High"])
print()
print("Group names:", group_model.names())
TNA Model
  Type: relative
  States: ['adapt', 'cohesion', 'consensus', 'coregulate', 'discuss', 'emotion', 'monitor', 'plan', 'synthesis']
  Scaling: none

Transition Matrix:
               adapt  cohesion  consensus  coregulate   discuss   emotion   monitor      plan  synthesis
adapt       0.000000  0.262411   0.517730    0.000000  0.035461  0.141844  0.028369  0.014184   0.000000
cohesion    0.005330  0.043710   0.536247    0.081023  0.040512  0.118337  0.017058  0.151386   0.006397
consensus   0.004127  0.019752   0.083432    0.170991  0.232606  0.081368  0.035377  0.364387   0.007960
coregulate  0.022371  0.035794   0.108501    0.013423  0.234899  0.203579  0.096197  0.266219   0.019016
discuss     0.023964  0.061907   0.424863    0.072391  0.169246  0.112332  0.016475  0.012481   0.106340
emotion     0.003226  0.325806   0.336129    0.023226  0.121935  0.062581  0.031613  0.090323   0.005161
monitor     0.011058  0.048973   0.159558    0.050553  0.369668  0.096367  0.018957  0.225908   0.018957
plan        0.001383  0.031466   0.293914    0.023859  0.060166  0.182227  0.075726  0.327801   0.003458
synthesis   0.143885  0.028777   0.575540    0.014388  0.028777  0.064748  0.000000  0.143885   0.000000

Initial Probabilities:
             prob
adapt       0.012
cohesion    0.082
consensus   0.212
coregulate  0.005
discuss     0.180
emotion     0.169
monitor     0.129
plan        0.188
synthesis   0.023

Group names: ['High', 'Low']
In [39]:
# Plot all group networks side by side (automatic multi-panel)
tna.plot_network(group_model, minimum=0.05, cut=0.1)
plt.show()
In [40]:
# Prune all groups at once — returns a new GroupTNA with pruned models
pruned_group = tna.prune(group_model, threshold=0.05)
print(pruned_group)

# Compare edge counts
for name in group_model:
    orig = group_model[name].summary()["n_edges"]
    prun = pruned_group[name].summary()["n_edges"]
    print(f"  {name}: {orig} → {prun} edges")
GroupTNA with 2 groups:
  High: 9 states, 42 edges
  Low: 9 states, 48 edges
  High: 76 → 42 edges
  Low: 75 → 48 edges
In [41]:
# Centralities across groups — returns a single DataFrame with a 'group' column
group_cent = tna.centralities(group_model, measures=["OutStrength", "InStrength", "Betweenness"])
group_cent
Out[41]:
group OutStrength InStrength Betweenness
adapt High 1.000000 0.215346 1.0
cohesion High 0.956290 0.814888 0.0
consensus High 0.916568 2.952482 30.0
coregulate High 0.986577 0.436432 0.0
discuss High 0.830754 1.124025 16.0
emotion High 0.937419 1.000801 5.0
monitor High 0.981043 0.300815 0.0
plan High 0.672199 1.268773 11.0
synthesis High 1.000000 0.167289 7.0
adapt Low 1.000000 0.452279 4.0
cohesion Low 0.993395 0.798538 0.0
consensus Low 0.919646 2.415674 25.0
coregulate Low 0.968401 0.689185 3.0
discuss Low 0.778747 1.214121 14.0
emotion Low 0.905983 0.807917 3.0
monitor Low 0.982500 0.392745 0.0
plan Low 0.584686 1.146514 8.0
synthesis Low 1.000000 0.216385 0.0
In [42]:
# Communities per group
group_comms = tna.communities(group_model)
for name, result in group_comms.items():
    print(f"{name}: {result.counts}")
    print(result.assignments)
    print()
High: {'leading_eigen': 2}
            leading_eigen
adapt                   1
cohesion                1
consensus               1
coregulate              0
discuss                 0
emotion                 1
monitor                 0
plan                    0
synthesis               1

Low: {'leading_eigen': 2}
            leading_eigen
adapt                   0
cohesion                0
consensus               0
coregulate              1
discuss                 1
emotion                 1
monitor                 1
plan                    1
synthesis               0

12.1 Permutation Testing Between Groups¶

To address the limitations of simple visual comparison, TNA employs rigorous permutation-based approaches for determining whether observed differences between group models are statistically significant. Permutation tests involve repeatedly shuffling the data between groups and generating a distribution of differences under the null hypothesis. For each edge, the test provides p-values helping researchers identify statistically significant differences. This rigorous approach ensures TNA insights reflect true underlying differences rather than chance artifacts.

The permutation_test() function compares two TNA models by shuffling sequences between groups for a specified number of iterations (iter), creating a null distribution of edge-weight differences. Edges where the observed difference exceeds the permutation distribution are flagged as statistically significant.
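The permutation logic can be sketched for a single edge. The snippet below is illustrative only (made-up data, one edge); permutation_test() applies this idea per edge across whole models.

```python
import numpy as np

# Illustrative sketch of a permutation test on one edge: pool the two
# groups' per-sequence weights, repeatedly reshuffle the group labels,
# and compare the observed difference to the resulting null distribution.
rng = np.random.default_rng(42)
high = rng.normal(0.50, 0.05, 60)  # stand-in per-sequence edge weights, group 1
low = rng.normal(0.35, 0.05, 60)   # stand-in per-sequence edge weights, group 2
observed = high.mean() - low.mean()

pooled = np.concatenate([high, low])
null = np.empty(1000)
for i in range(1000):
    rng.shuffle(pooled)                              # shuffle group labels
    null[i] = pooled[:60].mean() - pooled[60:].mean()

p_value = (np.abs(null) >= abs(observed)).mean()     # two-sided p-value
```

An edge is flagged as significantly different when the observed difference is extreme relative to this shuffled null distribution.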

Access individual group models with dict-style indexing to compare specific groups:

In [43]:
# Permutation test: compare High vs Low achievers
perm_result = tna.permutation_test(
    group_model["High"], group_model["Low"],
    iter=500, seed=42, level=0.05
)

# Show significant edge differences
sig_perm = perm_result.edges["stats"][
    perm_result.edges["stats"]["p_value"] < 0.05
].sort_values("p_value")

print(f"{len(sig_perm)} significant edge differences found")
sig_perm
42 significant edge differences found
Out[43]:
edge_name diff_true effect_size p_value
38 consensus -> discuss 0.096072 9.800758 0.001996
76 discuss -> synthesis -0.070251 -6.280752 0.001996
71 synthesis -> plan 0.119821 5.892463 0.001996
70 plan -> plan -0.087513 -6.955485 0.001996
65 consensus -> plan -0.067687 -5.392796 0.001996
56 consensus -> monitor -0.024207 -4.640464 0.001996
55 cohesion -> monitor -0.035783 -3.962377 0.001996
52 plan -> emotion 0.066760 7.404535 0.001996
48 coregulate -> emotion 0.057669 3.479972 0.001996
41 emotion -> discuss 0.044235 3.682664 0.001996
40 discuss -> discuss -0.052006 -4.281572 0.001996
39 coregulate -> discuss -0.070863 -3.543739 0.001996
37 cohesion -> discuss -0.042712 -3.940997 0.001996
32 emotion -> coregulate -0.024171 -3.693955 0.001996
79 plan -> synthesis 0.003152 2.898779 0.001996
10 cohesion -> cohesion 0.037105 4.679806 0.001996
13 discuss -> cohesion 0.029053 4.234649 0.001996
19 cohesion -> consensus 0.085785 3.377973 0.001996
22 discuss -> consensus 0.210284 14.505543 0.001996
26 synthesis -> consensus 0.190513 4.781482 0.001996
8 synthesis -> adapt -0.158254 -4.827859 0.001996
11 consensus -> cohesion 0.010559 3.786004 0.001996
28 cohesion -> coregulate -0.085423 -5.210336 0.001996
29 consensus -> coregulate -0.036023 -3.396109 0.001996
4 discuss -> adapt -0.096159 -11.276056 0.001996
35 synthesis -> coregulate -0.052456 -3.108636 0.003992
44 synthesis -> discuss -0.059458 -3.182909 0.003992
47 consensus -> emotion 0.018719 2.944978 0.003992
21 coregulate -> consensus -0.047633 -3.175563 0.003992
30 coregulate -> coregulate -0.018176 -2.675244 0.003992
34 plan -> coregulate 0.012527 3.697559 0.003992
16 plan -> cohesion 0.011864 2.942219 0.003992
50 emotion -> emotion -0.031436 -2.888892 0.005988
66 coregulate -> plan 0.049676 2.613874 0.009980
31 discuss -> coregulate -0.024118 -2.733761 0.009980
77 emotion -> synthesis 0.005161 2.661137 0.009980
62 synthesis -> monitor -0.021390 -2.597739 0.011976
58 discuss -> monitor -0.011759 -2.517596 0.013972
43 plan -> discuss -0.014566 -2.266660 0.019960
27 adapt -> coregulate -0.029891 -2.360321 0.023952
73 cohesion -> synthesis 0.006397 2.195816 0.033932
3 coregulate -> adapt 0.011219 1.999853 0.049900

12.2 Difference Network¶

The plot_compare() function visualizes the difference between two TNA models as a network. Green edges indicate transitions that are stronger in the first model; red edges indicate transitions that are stronger in the second. Edge width is proportional to the absolute difference, and node colors reflect differences in initial probabilities:

In [44]:
# Difference network: High vs Low achievers
tna.plot_compare(group_model["High"], group_model["Low"])
plt.show()

12.3 Comparing Sequence Patterns¶

Beyond edge-level permutation tests and difference networks, TNA offers a complementary approach for group comparison that operates at the subsequence level. The compare_sequences() function extracts n-gram patterns — contiguous subsequences of length 1 (unigrams), 2 (bigrams), 3 (trigrams), and so on — from each group's raw sequences and compares their frequencies.

This approach answers a different question than permutation testing. Where permutation tests ask "do these two groups differ in their transition probabilities?", sequence pattern comparison asks "do these two groups differ in the specific behavioral sequences they produce?" For example, a bigram "plan->consensus" captures a concrete two-step behavior, while the transition probability plan->consensus is an aggregate over all sequences.

The sub parameter controls which subsequence lengths to examine (default: 1 through 5), and min_freq filters out rare patterns that appear fewer than a specified number of times in any group. When test=True, a permutation test shuffles group labels to assess whether the observed frequency differences are statistically significant, with p-values adjusted for multiple comparisons (Bonferroni by default).
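The n-gram extraction itself is simple to sketch. This is illustrative only; compare_sequences() performs the equivalent per group and tabulates pattern frequencies.

```python
from collections import Counter

def ngrams(seq, n):
    # contiguous subsequences of length n, joined in the tutorial's "a->b" style
    return ["->".join(seq[i:i + n]) for i in range(len(seq) - n + 1)]

seq = ["plan", "consensus", "discuss", "consensus"]
unigrams = ngrams(seq, 1)          # the states themselves
bigrams = Counter(ngrams(seq, 2))  # three bigrams, each occurring once
```

Counting these per group and comparing proportions yields tables like the ones shown below.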

In [45]:
# Compare subsequence patterns between High and Low achievers
cs_result = tna.compare_sequences(group_model)
print(f"Total patterns found: {len(cs_result)}")
print(f"Columns: {list(cs_result.columns)}")
print()

# Show unigrams (single-state frequencies)
unigrams = cs_result[~cs_result['pattern'].str.contains('->')]
print(f"Unigram patterns ({len(unigrams)} states):")
print(unigrams.to_string(index=False))
print()

# Show top bigrams by frequency difference
bigrams = cs_result[cs_result['pattern'].str.count('->') == 1].copy()
bigrams['prop_diff'] = (bigrams['prop_High'] - bigrams['prop_Low']).abs()
print(f"Top 10 bigrams by proportion difference ({len(bigrams)} total):")
bigrams.nlargest(10, 'prop_diff')[['pattern', 'freq_High', 'freq_Low', 'prop_High', 'prop_Low']]
Total patterns found: 918
Columns: ['pattern', 'freq_High', 'freq_Low', 'prop_High', 'prop_Low']

Unigram patterns (9 states):
   pattern  freq_High  freq_Low  prop_High  prop_Low
     adapt        155       399   0.011297  0.028888
  cohesion       1018       821   0.074193  0.059441
 consensus       3651      3146   0.266088  0.227773
coregulate        959      1174   0.069893  0.084999
   discuss       2166      2101   0.157860  0.152114
   emotion       1686      1389   0.122877  0.100565
   monitor        668       848   0.048684  0.061396
      plan       3102      3521   0.226077  0.254923
 synthesis        316       413   0.023030  0.029902

Top 10 bigrams by proportion difference (67 total):
Out[45]:
pattern freq_High freq_Low prop_High prop_Low
40 discuss->consensus 851 418 0.066897 0.032626
69 plan->plan 948 1356 0.074522 0.105838
24 consensus->discuss 789 401 0.062023 0.031299
38 discuss->adapt 48 234 0.003773 0.018264
14 cohesion->consensus 503 341 0.039541 0.026616
67 plan->emotion 527 377 0.041428 0.029426
48 emotion->consensus 521 388 0.040956 0.030284
46 discuss->synthesis 213 344 0.016744 0.026850
33 coregulate->discuss 210 329 0.016508 0.025679
10 adapt->consensus 73 170 0.005739 0.013269
In [46]:
# Permutation test for statistical significance (100 iterations for speed)
cs_test = tna.compare_sequences(group_model, test=True, iter_=100, seed=42)

# Show patterns with smallest p-values
print("Patterns with smallest adjusted p-values:")
cs_test.head(15)[['pattern', 'freq_High', 'freq_Low', 'prop_High', 'prop_Low', 'effect_size', 'p_value']]
Patterns with smallest adjusted p-values:
Out[46]:
pattern freq_High freq_Low prop_High prop_Low effect_size p_value
0 adapt 155 399 0.011297 0.028888 15.838830 0.089109
1 cohesion 1018 821 0.074193 0.059441 7.453087 0.089109
2 consensus 3651 3146 0.266088 0.227773 14.827639 0.089109
3 coregulate 959 1174 0.069893 0.084999 6.394814 0.089109
4 emotion 1686 1389 0.122877 0.100565 11.488188 0.089109
5 monitor 668 848 0.048684 0.061396 6.937058 0.089109
6 plan 3102 3521 0.226077 0.254923 7.138419 0.089109
7 synthesis 316 413 0.023030 0.029902 4.255809 0.089109
8 adapt->cohesion 37 102 0.002909 0.007961 8.969993 0.772277
9 adapt->consensus 73 170 0.005739 0.013269 8.197910 0.772277
10 adapt->discuss 5 25 0.000393 0.001951 5.433557 0.772277
11 cohesion->cohesion 41 5 0.003223 0.000390 8.096168 0.772277
12 cohesion->consensus 503 341 0.039541 0.026616 9.826334 0.772277
13 cohesion->monitor 16 40 0.001258 0.003122 4.703406 0.772277
14 cohesion->plan 142 97 0.011163 0.007571 3.205518 0.772277

13. Sequence Clustering (Tactics)¶

13.1 Why Cluster Sequences?¶

The analyses presented so far — transition networks, centralities, communities, bootstrapping, and group comparison — all operate at the network level, characterizing the aggregate dynamics of how states connect and transition. However, within any dataset, individual sequences often exhibit substantial heterogeneity. Not all learners follow the same behavioral patterns. Some may consistently cycle between planning and execution, while others predominantly engage in monitoring with occasional social interactions.

Sequence clustering addresses this heterogeneity by grouping individual sequences (learners, sessions, or actors) into clusters of similar behavioral trajectories — often called tactics or strategies in educational research. This complements network-level analysis by revealing the diversity of approaches within a population. Where the overall TNA model shows the average transition structure, tactics reveal the distinct behavioral patterns that compose that average.

This distinction from community detection (Section 9) is important. Communities group states that frequently co-transition within the network — identifying which behaviors tend to co-occur. Tactics group entire sequences — identifying which learners behave similarly across their full trajectory. A learner's tactic reflects their overall strategy, while communities reflect structural relationships among behaviors.

Identifying tactics serves several research purposes:

  • Typology development: Discovering naturally occurring behavioral patterns supports theory building about learning strategies, self-regulation approaches, or collaborative styles.
  • Intervention targeting: Different tactics may require different interventions. Learners who predominantly monitor but rarely plan may benefit from different support than those who cycle rapidly between all states.
  • Outcome prediction: Tactics can predict learning outcomes — some behavioral patterns may consistently associate with higher or lower achievement.
  • Group comparison via TNA: Perhaps most powerfully, discovered tactics can serve as grouping variables for further TNA analysis, building separate transition networks per tactic to examine how transition dynamics differ across behavioral strategies.

13.2 Distance Metrics¶

Sequence clustering requires measuring how similar or different two sequences are. The cluster_sequences() function supports four distance metrics, each capturing different aspects of sequence similarity:

  • Hamming distance ('hamming'): Counts the number of positions where two sequences differ. Fast and intuitive, but requires sequences of equal length and treats all positions equally. Best suited for aligned sequences where positional correspondence matters — for example, comparing what students did at time step 1, time step 2, etc.

  • Levenshtein distance ('lv'): The minimum number of insertions, deletions, and substitutions needed to transform one sequence into another. Handles unequal-length sequences naturally. Appropriate when the exact temporal alignment is less important than the overall ordering of events.

  • Optimal String Alignment ('osa'): Extends Levenshtein distance by also allowing adjacent transpositions (swapping two neighboring elements). Useful when near-swaps should be considered minor differences — for example, if planning-then-executing versus executing-then-planning represents a small rather than large behavioral difference.

  • Longest Common Subsequence ('lcs'): Measures distance as the length difference after removing the longest shared subsequence. Focuses on what two sequences have in common regardless of position. Particularly appropriate when the presence of certain behavioral patterns matters more than their exact timing.

For most TNA applications with aligned sequences from prepare_data(), Hamming distance provides a good default. For sequences of varying length or when positional alignment is uncertain, Levenshtein or LCS distances are more appropriate.
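Two of these metrics can be sketched in a few lines. The implementations below are illustrative only; cluster_sequences() uses its own implementations internally.

```python
def hamming(a, b):
    # number of positions where two equal-length sequences differ
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length sequences")
    return sum(x != y for x, y in zip(a, b))

def levenshtein(a, b):
    # minimum number of insertions, deletions, and substitutions,
    # computed with a rolling single-row dynamic program
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(prev[j] + 1,            # deletion
                            curr[j - 1] + 1,        # insertion
                            prev[j - 1] + (x != y)))  # substitution
        prev = curr
    return prev[-1]

hamming(["plan", "discuss", "monitor"], ["plan", "monitor", "monitor"])  # 1
levenshtein(["plan", "discuss"], ["plan", "monitor", "discuss"])         # 1
```

Note how Levenshtein handles the unequal-length pair that Hamming would reject: a single insertion of "monitor" transforms one sequence into the other.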

In [47]:
# Cluster sequences into 3 tactics using PAM with Hamming distance
clust = tna.cluster_sequences(prepared_data, k=3, dissimilarity="hamming", method="pam")
print(clust)
print(f"\nCluster sizes: {clust.sizes}")
print(f"Silhouette score: {clust.silhouette:.4f}")
ClusterResult(k=3, method='pam', dissimilarity='hamming', silhouette=0.1904)

Cluster sizes: [893 713 394]
Silhouette score: 0.1904

13.3 Clustering Methods¶

Two families of clustering methods are available:

Partitioning Around Medoids (PAM) (method='pam') is the default and generally recommended method. PAM identifies medoid sequences — actual sequences from the data that best represent each cluster center. Unlike k-means (which uses abstract centroids), PAM's medoids are real, interpretable sequences that researchers can examine directly. PAM is also more robust to outliers than centroid-based methods.

Hierarchical clustering builds a tree-like dendrogram by iteratively merging the most similar sequences or clusters. The tree is then cut at the desired number of clusters. Several linkage methods control how inter-cluster distances are computed:

  • 'complete' — maximum distance between any pair across clusters (produces compact, spherical clusters)
  • 'average' — mean distance between all pairs across clusters (balanced approach)
  • 'ward.D' / 'ward.D2' — minimizes within-cluster variance (tends to produce equal-sized clusters)
  • 'single' — minimum distance between any pair across clusters (can produce elongated, chain-like clusters)

The choice of method and distance metric can substantially affect results. Comparing multiple configurations and evaluating silhouette scores helps identify the most meaningful clustering for your data.

In [48]:
# Compare distance metrics (using first 200 sequences for slower metrics)
subset = prepared_data.sequence_data.iloc[:200]
print("Distance metric comparison (PAM, k=3, n=200):")
for metric in ["hamming", "lcs", "osa"]:
    result = tna.cluster_sequences(subset, k=3, dissimilarity=metric)
    print(f"  {metric:>7s}: sizes={result.sizes}, silhouette={result.silhouette:.4f}")

print("\nLinkage method comparison (Hamming, k=3, full data):")
for method in ["pam", "complete", "average", "ward.D2"]:
    result = tna.cluster_sequences(prepared_data, k=3, method=method)
    print(f"  {method:>8s}: sizes={result.sizes}, silhouette={result.silhouette:.4f}")
Distance metric comparison (PAM, k=3, n=200):
  hamming: sizes=[77 53 70], silhouette=0.1372
      lcs: sizes=[74 80 46], silhouette=0.2048
      osa: sizes=[78 76 46], silhouette=0.1496

Linkage method comparison (Hamming, k=3, full data):
       pam: sizes=[893 713 394], silhouette=0.1904
  complete: sizes=[1639  108  253], silhouette=0.2831
   average: sizes=[1768  230    2], silhouette=0.3064
   ward.D2: sizes=[1130  377  493], silhouette=0.2354

13.4 Choosing the Number of Clusters¶

Selecting the appropriate number of clusters is a critical decision. The silhouette score measures how well each sequence fits its assigned cluster compared to the nearest alternative cluster. Values range from -1 (poor fit) to +1 (excellent fit), with higher mean silhouette scores indicating better-separated, more cohesive clusters.

A practical approach is to compute silhouette scores for a range of k values and select the k that maximizes the score — or the k where the score plateaus, indicating diminishing returns from additional clusters. Domain knowledge should also inform the decision: the identified clusters should be interpretable and meaningful in the research context.

In [49]:
# Sweep k values and compare silhouette scores
print("Silhouette scores for different k values:")
for k in range(2, 6):
    result = tna.cluster_sequences(prepared_data, k=k)
    print(f"  k={k}: silhouette={result.silhouette:.4f}, sizes={result.sizes}")
Silhouette scores for different k values:
  k=2: silhouette=0.1839, sizes=[ 982 1018]
  k=3: silhouette=0.1904, sizes=[893 713 394]
  k=4: silhouette=0.1161, sizes=[606 455 351 588]
  k=5: silhouette=0.1277, sizes=[330 475 220 657 318]

13.5 Building TNA Models per Tactic¶

The most powerful application of sequence clustering in TNA is using the discovered tactics as grouping variables to build separate transition networks per cluster. This reveals how transition dynamics differ across behavioral strategies — for example, whether learners in a "monitoring-heavy" tactic show different transition patterns than those in a "planning-heavy" tactic.

The workflow is straightforward: cluster the sequences, assign each sequence to its tactic, and use group_tna() with the tactic labels as the grouping variable. All standard TNA functions — centralities, pruning, communities, bootstrapping, permutation testing — then work seamlessly on the tactic-based group model.
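Conceptually, what happens per tactic is that transitions are counted within each group of sequences and row-normalized into a relative transition matrix. The toy sketch below illustrates this idea with pandas; it is not the package's implementation, which additionally handles state ordering, missing values, and alternative scaling options.

```python
import numpy as np
import pandas as pd

def transition_matrix(sequences, states):
    """Row-normalized (relative) transition matrix from a list of sequences."""
    idx = {s: i for i, s in enumerate(states)}
    counts = np.zeros((len(states), len(states)))
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):   # consecutive event pairs
            counts[idx[a], idx[b]] += 1
    rows = counts.sum(axis=1, keepdims=True)
    probs = np.divide(counts, rows, out=np.zeros_like(counts), where=rows > 0)
    return pd.DataFrame(probs, index=states, columns=states)

# Hypothetical toy sequences, labeled with a tactic each
data = {
    "Tactic 1": [["plan", "discuss", "consensus"], ["plan", "consensus", "plan"]],
    "Tactic 2": [["monitor", "monitor", "adapt"]],
}
states = ["plan", "discuss", "consensus", "monitor", "adapt"]
for tactic, seqs in data.items():
    print(tactic)
    print(transition_matrix(seqs, states).round(2))
```

Building one such matrix per tactic and comparing them is exactly what the grouped model enables, with all downstream TNA machinery attached.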

In [50]:
# Step 1: Cluster sequences into tactics
clust = tna.cluster_sequences(prepared_data, k=3, dissimilarity="hamming", method="pam")

# Step 2: Add tactic labels to the sequence data
tactic_data = prepared_data.sequence_data.copy()
tactic_data["Tactic"] = [f"Tactic {c}" for c in clust.assignments]

# Step 3: Build a TNA model for each tactic
tactic_model = tna.group_tna(tactic_data, group="Tactic")
print(tactic_model)
print()
tactic_model.summary()
GroupTNA with 3 groups:
  Tactic 1: 9 states, 78 edges
  Tactic 2: 9 states, 78 edges
  Tactic 3: 9 states, 76 edges

Out[50]:
group     n_states  type      scaling  n_edges  density   mean_weight  max_weight  has_self_loops
Tactic 1  9         relative  []       78       0.962963  0.115385     0.476440    True
Tactic 2  9         relative  []       78       0.962963  0.115385     0.492308    True
Tactic 3  9         relative  []       76       0.938272  0.118421     0.542373    True
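The summary values can be sanity-checked by hand. With self-loops allowed, a 9-state directed network has 9 × 9 = 81 possible edges, so density is n_edges / 81; and because relative scaling makes each of the 9 rows sum to 1, the total edge weight is 9 and the mean weight is 9 / n_edges. This interpretation is an inference from the table values, not a quote of the package's definitions, but it reproduces them exactly:

```python
n_states = 9
for tactic, n_edges in [("Tactic 1", 78), ("Tactic 2", 78), ("Tactic 3", 76)]:
    density = n_edges / n_states**2   # self-loop slots count as possible edges
    mean_weight = n_states / n_edges  # rows of a relative matrix each sum to 1
    print(f"{tactic}: density={density:.6f}, mean_weight={mean_weight:.6f}")
# Tactic 1: density=0.962963, mean_weight=0.115385
# Tactic 2: density=0.962963, mean_weight=0.115385
# Tactic 3: density=0.938272, mean_weight=0.118421
```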
In [51]:
# Compare transition networks across tactics
tna.plot_network(tactic_model, minimum=0.05, cut=0.1)
plt.show()
[Figure: side-by-side transition networks for Tactic 1, Tactic 2, and Tactic 3]
In [52]:
# Centralities per tactic — which states are central in each behavioral strategy?
tactic_cent = tna.centralities(tactic_model, measures=["OutStrength", "InStrength", "Betweenness"])
tactic_cent
Out[52]:
state      group     OutStrength  InStrength  Betweenness
adapt Tactic 1 1.000000 0.398774 1.0
cohesion Tactic 1 0.979058 0.893311 0.0
consensus Tactic 1 0.917638 2.569952 30.0
coregulate Tactic 1 0.979112 0.572448 0.0
discuss Tactic 1 0.802787 1.273517 16.0
emotion Tactic 1 0.928025 0.991482 5.0
monitor Tactic 1 0.971989 0.362192 0.0
plan Tactic 1 0.650228 0.974802 9.0
synthesis Tactic 1 1.000000 0.192359 7.0
adapt Tactic 2 1.000000 0.330303 1.0
cohesion Tactic 2 0.966667 0.817223 0.0
consensus Tactic 2 0.929478 2.590660 30.0
coregulate Tactic 2 0.978070 0.582732 0.0
discuss Tactic 2 0.794239 1.160634 16.0
emotion Tactic 2 0.913286 0.862658 5.0
monitor Tactic 2 0.985600 0.330461 0.0
plan Tactic 2 0.606618 1.320666 11.0
synthesis Tactic 2 1.000000 0.178620 7.0
adapt Tactic 3 1.000000 0.326030 1.0
cohesion Tactic 3 0.977486 0.746947 0.0
consensus Tactic 3 0.903445 2.831315 30.0
coregulate Tactic 3 0.973333 0.542749 6.0
discuss Tactic 3 0.820805 1.175419 13.0
emotion Tactic 3 0.933118 0.877462 5.0
monitor Tactic 3 0.984479 0.354562 0.0
plan Tactic 3 0.640580 1.167881 3.0
synthesis Tactic 3 1.000000 0.210879 7.0

14. Complete Workflow at a Glance¶

The following code summarizes the full TNA analysis pipeline. This can serve as a template for your own analyses:

import tna
import pandas as pd

# 1. Load and prepare data
my_data = pd.read_csv("your_data.csv")
prepared = tna.prepare_data(my_data, action="event", actor="user_id", time="timestamp")

# 2. Build model
model = tna.tna(prepared)

# 3. Visualize
tna.plot_network(model, minimum=0.05, cut=0.1)
tna.plot_histogram(model)
tna.plot_frequencies(model)

# 4. Prune
pruned = tna.prune(model, threshold=0.05)
tna.plot_network(pruned, cut=0.1)

# 5. Cliques
print(tna.cliques(model, size=2, threshold=0.1))
print(tna.cliques(model, size=3, threshold=0.05))

# 6. Centralities
tna.plot_centralities(tna.centralities(model))
tna.plot_network(tna.betweenness_network(model), cut=0.1)

# 7. Communities
tna.plot_communities(tna.communities(model), cut=0.1)

# 8. Bootstrap
boot = tna.bootstrap_tna(model, iter=1000, level=0.05, seed=265)
tna.plot_network(boot.model, cut=0.1)

# 9. Sequences
tna.plot_sequences(prepared)

# 10. Group models (from metadata column)
gm = tna.group_tna(prepared, group="achievement")
tna.plot_network(gm)                    # side-by-side networks
tna.centralities(gm)                    # centralities with group column
tna.prune(gm, threshold=0.05)           # prune each group
tna.permutation_test(gm["A"], gm["B"])  # compare two groups
tna.compare_sequences(gm)               # compare sequence patterns
tna.compare_sequences(gm, test=True)    # with permutation test

# 11. Sequence clustering (tactics)
clust = tna.cluster_sequences(prepared, k=3, dissimilarity="hamming")
tactic_data = prepared.sequence_data.copy()
tactic_data["Tactic"] = [f"Tactic {c}" for c in clust.assignments]
tactic_model = tna.group_tna(tactic_data, group="Tactic")
tna.plot_network(tactic_model)          # per-tactic networks

# 12. Import one-hot encoded data
wide = tna.import_onehot(onehot_df, cols=["State_A", "State_B"], actor="id")
model_oh = tna.tna(wide)

For more information, see the TNA package documentation and the R TNA tutorial by Saqr & López-Pernas.