Sequence Patterns, Outcomes, and Indices with codyna

A Tutorial on dynamics of behavior with codyna

tutorial
R
Authors
Affiliation

University of Eastern Finland

University of Eastern Finland

Published

February 14, 2026

1 Exploring Sequences

Learning unfolds as an ordered sequence of states—engagement levels across weeks, problem-solving actions within a session, regulatory behaviors during collaboration. Sequence analysis preserves this temporal ordering rather than collapsing it into a single summary. The approach originated in molecular biology for comparing DNA sequences and was adapted for the social sciences by Abbott (1995). In education, it has been used to study course-taking trajectories, self-regulated learning strategies, and collaborative regulation dynamics (Saqr et al., 2024a).

From a complex dynamic systems perspective, learners exhibit feedback loops, attractor states, and phase transitions (Saqr et al., 2025a). A student’s trajectory is not a random walk: disengagement breeds more disengagement, current states shape the next ones, and sudden shifts from one stable regime to another are common. Sequence-level analysis captures these dynamics at the individual level, complementing the aggregate view provided by transition models.
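The idea that a state tends to reproduce itself can be made concrete with a small base R sketch. The trajectory below is invented for illustration and is not taken from any dataset in this tutorial. The code only counts which state follows which and converts the counts to row-wise proportions.

```r
# Made-up weekly engagement trajectory for one hypothetical student
states <- c("Active", "Average", "Disengaged")
trajectory <- c("Active", "Active", "Average", "Disengaged", "Disengaged",
                "Disengaged", "Average", "Disengaged", "Disengaged", "Disengaged")

# Pair each state with the state that immediately follows it
from <- factor(head(trajectory, -1), levels = states)
to   <- factor(tail(trajectory, -1), levels = states)

# Count transitions, then turn each row into conditional proportions
counts <- table(from, to)
probs  <- prop.table(counts, margin = 1)
round(probs, 2)
```

In this toy trajectory, most steps that start in Disengaged also end in Disengaged, which is the kind of self-reinforcing persistence described above. A single short sequence gives only a rough estimate, so this is an illustration rather than a model.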

Tip: Companion tutorials

This tutorial is part of a series of tutorials on the dynamics of learning and learners using Transition Network Analysis with the tna and codyna R packages:

  1. Transition Network Analysis with R — building, visualizing, and interpreting TNA models; centrality, communities, bootstrapping.
  2. TNA Group Analysis — analysis and comparison of groups.
  3. TNA Clustering — discovering and analyzing clusters of sequences.
  4. TNA Model Comparison — edge-level, summary, centrality, and network-level comparison; permutation tests.
  5. Sequence Patterns, Outcomes, and Indices (this tutorial) — pattern discovery, outcome modeling, and structural indices with codyna.

Package website: https://sonsoles.me/tna/

Sequence analysis is a family of methods for studying ordered categorical data. In education, the “sequence” is typically a student’s trajectory through a series of states measured at successive time points.

Three levels of analysis are possible:

  1. Sequence visualization: plotting individual trajectories and state distributions to see the raw data before modeling.
  2. Sequence indices: computing per-sequence summary measures (entropy, stability, complexity) that characterize how each sequence unfolds.
  3. Pattern discovery: identifying recurring sub-sequences (n-grams, gapped patterns) that tell us what specific pathways students follow.
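To give a feel for the second and third levels, here is a minimal base R sketch on one invented sequence. The helper names `shannon_entropy()` and `count_ngrams()` are hypothetical and written for this illustration; they are not codyna functions, and codyna's own index definitions may differ in detail.

```r
# One made-up sequence of regulation actions
seq1 <- c("plan", "plan", "monitor", "plan", "monitor", "evaluate")

# Sequence index: Shannon entropy (in bits) of the state distribution
shannon_entropy <- function(x) {
  p <- as.numeric(table(x)) / length(x)
  -sum(p * log2(p))
}

# Pattern discovery: count contiguous n-grams (sub-sequences of length n)
count_ngrams <- function(x, n = 2) {
  if (length(x) < n) return(table(character(0)))
  starts <- seq_len(length(x) - n + 1)
  grams <- vapply(
    starts,
    function(i) paste(x[i:(i + n - 1)], collapse = " -> "),
    character(1)
  )
  sort(table(grams), decreasing = TRUE)
}

entropy <- shannon_entropy(seq1)
bigrams <- count_ngrams(seq1, n = 2)
entropy
bigrams
```

Entropy summarizes how evenly the sequence spreads over its states, while the bigram table shows which two-step pathways recur. Gapped patterns, which allow intervening states, are not covered by this sketch.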

This tutorial covers all three. For a comprehensive introduction to sequence analysis in education, see Saqr et al. (2024a). For transition-based approaches, see López-Pernas et al. (2024a) on Markov models and Saqr et al. (2025b) on Transition Network Analysis.

This tutorial works with three datasets, each illustrating a different scenario of sequence analysis. Before analyzing any dataset, we visualize it: distribution plots and frequency plots establish the context that makes pattern and index results interpretable.

Table 1: Datasets used in this tutorial

| Dataset | Source | Sequences | States | Time points | Used for |
|---|---|---|---|---|---|
| group_regulation | tna package | 2,000 | 9 (collaborative regulation) | up to 26 | N-gram examples with filtering |
| Codyna | Codyna.RDS | 5,000 | 10 (math exercise actions) | up to 10 | Pattern discovery, outcome modeling |
| engagement | codyna package | 1,000 | 3 (Active, Average, Disengaged) | 25 | Sequence indices |

1.1 Visualizing the regulation data

The first step with any dataset is to visualize it. We begin with the group_regulation dataset: 2,000 collaborative regulation sequences with 9 states. The tna package provides tools for preparing and visualizing event data. We use tna here only to prepare the data; the focus is on the sequences themselves, not on the network analysis.

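library(tna)

# Load the long-format event log shipped with the tna package
data("group_regulation_long", package = "tna")
# Map the action, actor, and time columns so tna can build the sequences
prepared <- prepare_data(
  group_regulation_long,
  action = "Action", actor = "Actor", time = "Time"
)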
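Conceptually, preparing event data means turning a long log (one row per event) into ordered sequences (one row per actor). The base R sketch below illustrates that reshaping idea on a tiny invented log. It is not how `prepare_data()` is implemented, and it ignores anything else that function may handle, such as splitting sessions.

```r
# Made-up long-format event log: one row per logged action
events <- data.frame(
  Actor  = c(1, 1, 1, 2, 2),
  Time   = c(3, 1, 2, 1, 2),
  Action = c("monitor", "plan", "discuss", "plan", "consensus"),
  stringsAsFactors = FALSE
)

# Order events chronologically within each actor
events <- events[order(events$Actor, events$Time), ]

# Collect each actor's actions into one ordered vector
by_actor <- split(events$Action, events$Actor)

# Pad shorter sequences with NA so that all rows have equal length
max_len <- max(lengths(by_actor))
wide <- t(vapply(
  by_actor,
  function(a) c(a, rep(NA_character_, max_len - length(a))),
  character(max_len)
))
colnames(wide) <- paste0("T", seq_len(max_len))
wide
```

Each row of `wide` is now one actor's sequence, with columns as successive positions. Sequences of unequal length end in `NA`.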

Having prepared the data, we can explore it with sequence analysis and visualize the sequences. The tna package comes with several plotting methods, including sequence plots. Here we use distribution plots. A state distribution plot aggregates individual trajectories into proportions at each time point:

plot_sequences(prepared, type = "distribution", scale = "proportion")
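The aggregation behind a state distribution plot can be sketched in base R. The small matrix below is invented for illustration (rows are sequences, columns are time points), and the code is a rough sketch of the idea rather than what `plot_sequences()` does internally.

```r
# Made-up sequences: 4 learners observed at 3 time points
seqs <- matrix(c(
  "Active",  "Active",     "Average",
  "Active",  "Average",    "Disengaged",
  "Average", "Disengaged", "Disengaged",
  "Active",  "Active",     "Active"
), nrow = 4, byrow = TRUE)
states <- c("Active", "Average", "Disengaged")

# For each time point (column), compute the share of learners in each state
dist <- vapply(
  seq_len(ncol(seqs)),
  function(j) as.numeric(prop.table(table(factor(seqs[, j], levels = states)))),
  numeric(length(states))
)
dimnames(dist) <- list(states, paste0("T", seq_len(ncol(seqs))))
dist
```

Each column of `dist` sums to 1, so passing it to `barplot(dist)` would draw one stacked bar per time point. Note that the individual trajectories are no longer visible once they are aggregated this way.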