Calibrating an Agent-Based Model of School Choice on Register Data

Dignum, Boterman, Flache, and Lees (2025) estimate a Dutch school-choice ABM directly against household-level CBS records using neural ratio estimation, and find that fast-and-frugal heuristic households fit the data better than rational-choice ones — with consequences for how we think about segregation under "free choice" policies.

The research question

Most agent-based models in social science live in a frustrating gap: agent rules are hand-picked or imported from a separate discrete-choice regression, and the model is then judged on whether some macro statistic looks roughly right. The agents themselves are rarely fit to the actual decisions of actual people. Dignum, Boterman, Flache, and Lees (2025), writing in JASSS 28(4) 8, push directly on this gap. Their question is hard to answer in most settings: can we estimate the parameters of an ABM from individual-level register data, and if so, what do the recovered parameters tell us about how households actually choose schools?

The Dutch context is what makes the project possible. Statistics Netherlands (CBS) maintains household-level register data covering the entire population of primary-school children in cities like Amsterdam (191 schools, tens of thousands of pupils across four educational/migration-background groups) and Almere. For each child the register records residential address, school attended, and a battery of household characteristics. The Netherlands is also one of the most "free-choice" school systems in Europe — Amsterdam grants priority at the eight nearest schools, but parents are not assigned a neighborhood school and can opt into any school with capacity. That combination of universal individual-level data plus genuine demand-side choice is unusual. The same exercise in the United States would collide with FERPA-protected enrollment records and a patchwork of charter, magnet, voucher, and zoned-attendance regimes that would not fit a single behavioral model anyway.

The authors extend an earlier stylized school-choice ABM (Dignum et al. 2024) to two full Dutch cities and ask whether the parameters governing household behavior — distance sensitivity, preferences over school size, sensitivity to ethnic and socioeconomic composition, sensitivity to school quality — can be recovered directly from the register, rather than estimated separately and bolted on.

Method: neural ratio estimation

The technical obstacle is that ABMs typically have no tractable likelihood. You can simulate them (given a parameter vector, the model produces a sample of agent choices), but you generally cannot write down, or cheaply evaluate, the probability of those choices. Maximum likelihood and Markov chain Monte Carlo, which dominate empirical economics, both need to evaluate that likelihood at every candidate parameter vector, so neither can be applied directly.

Simulation-based inference (SBI) is the family of methods designed for this case. The oldest member is Approximate Bayesian Computation; newer members lean on neural networks. The two most prominent are neural posterior estimation (NPE) and neural ratio estimation (NRE), which Dignum and colleagues use here.

NRE reframes the inference problem as classification. The procedure is:

  1. Sample many candidate parameter vectors from a prior.
  2. Run the ABM under each one and compute summary statistics over the simulated population.
  3. Train a neural classifier to distinguish "matched" pairs of (parameters, summary statistics) — the parameters that actually generated those statistics — from "mismatched" pairs.
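  4. Use the trained classifier as a likelihood-ratio estimator: with balanced matched and mismatched classes, its output odds d / (1 - d) approximate the likelihood-to-evidence ratio p(x | θ) / p(x), so evaluating it at the real summary statistics from CBS shows which parameter vectors are most consistent with the data.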
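
The four steps can be sketched on a deliberately tiny problem. This is a minimal illustration under stated assumptions, not the authors' pipeline: a one-parameter Gaussian `simulate` function stands in for the ABM, a logistic regression on hand-picked quadratic features stands in for the neural classifier, and every name (`simulate`, `features`, `logRatio`, `xObs`) is hypothetical.

```javascript
// Toy NRE sketch. Assumptions: a one-parameter Gaussian simulator replaces the
// ABM, and logistic regression on quadratic features replaces the neural net.

// Small seeded PRNG (mulberry32) so the run is reproducible.
function mulberry32(seed) {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}
const rand = mulberry32(42);
// Standard normal draw via the Box-Muller transform.
const normal = () =>
  Math.sqrt(-2 * Math.log(1 - rand())) * Math.cos(2 * Math.PI * rand());

// Hypothetical stand-in for "run the ABM and summarize": statistic = theta + noise.
const simulate = (theta) => theta + 0.5 * normal();

// Steps 1-2: draw parameters from a N(0, 1) prior and simulate a summary for each.
const N = 1000;
const thetas = Array.from({ length: N }, () => normal());
const stats = thetas.map((theta) => simulate(theta));

// Step 3: matched pairs (label 1) vs. mismatched pairs (label 0, statistic
// taken from a different draw), classified on quadratic features.
const features = (theta, x) => [1, theta * x, theta * theta, x * x];
const data = [];
for (let i = 0; i < N; i++) {
  data.push({ f: features(thetas[i], stats[i]), y: 1 });
  data.push({ f: features(thetas[i], stats[(i + 1) % N]), y: 0 });
}

// Plain full-batch gradient ascent on the logistic log-likelihood.
const w = [0, 0, 0, 0];
const logRatio = (f) => f[0] * w[0] + f[1] * w[1] + f[2] * w[2] + f[3] * w[3];
for (let epoch = 0; epoch < 5000; epoch++) {
  const grad = [0, 0, 0, 0];
  for (const { f, y } of data) {
    const p = 1 / (1 + Math.exp(-logRatio(f)));
    for (let k = 0; k < 4; k++) grad[k] += (y - p) * f[k];
  }
  for (let k = 0; k < 4; k++) w[k] += (0.2 * grad[k]) / data.length;
}

// Step 4: the classifier logit estimates log p(x | theta) - log p(x); scan a
// grid of theta values and keep the one most consistent with the "observed" x.
const xObs = 1.0;
let thetaHat = -3;
let best = -Infinity;
for (let i = -300; i <= 300; i++) {
  const theta = i / 100;
  const score = logRatio(features(theta, xObs));
  if (score > best) {
    best = score;
    thetaHat = theta;
  }
}
console.log("estimated theta:", thetaHat);
```

In this toy problem the true likelihood peaks at theta = x, so the estimate should land near `xObs`. A real application would replace the quadratic features with a neural network, because the log-ratio of an ABM has no known functional form.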

The appeal over fixed-point iteration or generalized method of moments is that the network learns the geometry of the parameter-to-summary mapping nonparametrically. ABMs have non-linear feedback (school composition depends on who chose, which depends on composition), and a neural classifier handles those couplings without the modeler specifying them. Dignum et al. cite Hermans et al. (2020) for the NRE machinery and Cranmer, Brehmer, and Louppe (2020) for the broader SBI landscape.

A key methodological move is that the authors first verify NRE on synthetic ground truth — generate data from a known parameter vector, then check that NRE recovers it — before turning the machine loose on the real CBS register. Previous SBI demonstrations were on smaller ABMs; it was not obvious a priori that the technique would survive the jump to a city-sized model.

Method: rational vs heuristic households

The substantive contribution sits one level up from the inference machinery. Dignum and colleagues do not just fit one model — they fit two competing models of household decision-making, and then ask which one fits the register better.

The rational household is a multinomial-logit chooser. It computes a utility for each feasible school as a weighted sum over (negative distance, school size, composition match with the household's own group, and school quality), then picks proportionally to the exponential of that utility. This is the standard discrete-choice machinery used in transportation, marketing, and most of the empirical school-choice literature (e.g., Mutgan 2021 for Stockholm).
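
A minimal sketch of that choice rule follows. The weights in `beta`, the three example schools, and the helper names are all hypothetical and chosen only to show the mechanics; they are not estimates from the paper.

```javascript
// Multinomial-logit household: utility is a weighted sum of attributes and
// choice probabilities are a softmax over utilities.
// Hypothetical weights and schools, for illustration only.
const beta = { distance: 1.0, size: 0.2, match: 1.5, quality: 0.8 };
const schools = [
  { name: "A", distanceKm: 0.5, size: 0.3, match: 0.2, quality: 0.6 },
  { name: "B", distanceKm: 1.5, size: 0.8, match: 0.9, quality: 0.7 },
  { name: "C", distanceKm: 3.0, size: 0.5, match: 0.5, quality: 0.9 },
];

// Distance enters negatively; the other attributes enter positively.
function utility(s) {
  return (
    -beta.distance * s.distanceKm +
    beta.size * s.size +
    beta.match * s.match +
    beta.quality * s.quality
  );
}

// Softmax with the maximum subtracted for numerical stability.
function mnlProbabilities(options) {
  const u = options.map(utility);
  const maxU = Math.max(...u);
  const expU = u.map((v) => Math.exp(v - maxU));
  const total = expU.reduce((a, b) => a + b, 0);
  return expU.map((v) => v / total);
}

// Inverse-CDF sampling: u01 is a uniform draw in [0, 1).
function sampleChoice(options, probs, u01) {
  let cumulative = 0;
  for (let i = 0; i < options.length; i++) {
    cumulative += probs[i];
    if (u01 < cumulative) return options[i];
  }
  return options[options.length - 1];
}

const probs = mnlProbabilities(schools);
console.log(schools.map((s, i) => `${s.name}: ${probs[i].toFixed(3)}`).join(", "));
```

With these illustrative numbers, school B is more likely to be chosen than the nearer school A: its better composition match outweighs the extra kilometre, which is exactly the kind of trade-off the MNL household makes.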

The heuristic household, by contrast, is built in the tradition of Gigerenzer and the bounded-rationality program (Gigerenzer and Selten 2002; Gigerenzer and Gaissmaier 2011). Rather than aggregating attributes into a scalar utility, a fast-and-frugal heuristic processes attributes lexicographically: pick the closest school within some distance cutoff, break ties on composition, fall back to the nearest school with capacity if none qualifies. The household never trades distance against quality; it satisfices on distance first and only consults the next attribute when the first does not decide.
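
The lexicographic rule can be sketched as follows. This is one possible reading of the description above, not the authors' code: the `tieKm` tolerance that decides when two schools count as tied on distance is an assumption introduced here, and the schools and function name are hypothetical.

```javascript
// Fast-and-frugal household: attributes are consulted one at a time.
// Assumption: two schools are "tied" on distance if they lie within tieKm
// of the nearest qualifying school.
function heuristicChoice(schools, cutoffKm, tieKm) {
  const open = schools.filter((s) => s.hasCapacity);
  if (open.length === 0) return null;
  const nearestOf = (list) =>
    list.reduce((a, b) => (b.distanceKm < a.distanceKm ? b : a));

  // Step 1: hard distance filter.
  const inRange = open.filter((s) => s.distanceKm <= cutoffKm);
  // Fallback: nothing within the cutoff, so take the nearest school with capacity.
  if (inRange.length === 0) return nearestOf(open);

  // Step 2: the closest school wins unless others are effectively as close.
  const nearest = nearestOf(inRange);
  const tied = inRange.filter((s) => s.distanceKm - nearest.distanceKm <= tieKm);

  // Step 3: break ties on composition match only.
  return tied.reduce((a, b) => (b.match > a.match ? b : a));
}

// Hypothetical example schools.
const exampleSchools = [
  { name: "A", distanceKm: 0.6, match: 0.2, hasCapacity: true },
  { name: "B", distanceKm: 0.7, match: 0.9, hasCapacity: true },
  { name: "C", distanceKm: 1.9, match: 1.0, hasCapacity: true },
  { name: "D", distanceKm: 3.5, match: 0.5, hasCapacity: true },
];

console.log(heuristicChoice(exampleSchools, 2.0, 0.2).name);
console.log(heuristicChoice(exampleSchools, 0.5, 0.2).name);
```

In the first call, A and B are tied on distance and B wins on composition; C has the best composition match of all but is never considered because it is not among the closest. In the second call no school is within the cutoff, so the household falls back to the nearest school with capacity.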

| Feature | Rational MNL household | Heuristic household |
| --- | --- | --- |
| Combines attributes | Weighted sum into scalar utility | Lexicographic / one-at-a-time |
| Trades distance vs quality | Yes, at marginal rate $\beta_q / \beta_d$ | No; distance is a hard filter |
| Choice rule | Softmax (probability proportional to the exponential of utility) | Best within filter, deterministic up to noise |
| Cognitive demand | Compute utility for every school | Compute distance only, then maybe one tiebreak |
| Origin | McFadden (1974), discrete-choice econ | Gigerenzer's "Adaptive Toolbox" |

The two models are not nested. NRE handles that comfortably — the inference does not require nesting — and the comparison reduces to which parameterization yields summary statistics closer to the observed CBS register.

Headline results

The headline finding is that the heuristic decision rule fits the Dutch register data better than the rational MNL. Households do not appear to be silently maximizing a weighted utility over distance, size, composition, and quality. They appear to be filtering on distance and then looking at composition, in a way the lexicographic heuristic captures and the MNL does not.

That result has direct implications for school-segregation policy. Most simulation-based and discrete-choice analyses of "what happens if we change the assignment mechanism" assume an MNL-shaped household. If households are actually heuristic, then changes to the choice set — adding a school, adjusting catchment boundaries — have non-smooth effects, because they change which schools cross the distance threshold rather than shifting marginal utilities. Dignum and colleagues note that even tolerant heuristic households can produce segregation outcomes that look as stark as those of less tolerant rational ones, because cutoff geometry interacts with residential segregation in ways smooth utility maximization does not capture.
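
A small numerical sketch makes the non-smoothness concrete. It moves one attractive school across a distance cutoff and compares the two household types. Everything here is hypothetical (the cutoff, the weights, the two schools), and the heuristic is simplified to "filter on distance, then take the best composition match".

```javascript
// Illustrative comparison; weights, cutoff, and schools are hypothetical.
const cutoffKm = 2.0;
const beta = { distance: 1.0, match: 1.5 };
const schoolB = { name: "B", distanceKm: 1.0, match: 0.4 };

// Simplified heuristic: hard distance filter, then best composition match.
function heuristicPick(schools) {
  const inRange = schools.filter((s) => s.distanceKm <= cutoffKm);
  if (inRange.length === 0) {
    // Fallback: nearest school when nothing is within the cutoff.
    return schools.reduce((a, b) => (b.distanceKm < a.distanceKm ? b : a)).name;
  }
  return inRange.reduce((a, b) => (b.match > a.match ? b : a)).name;
}

// MNL probability of choosing the first school in the list.
function mnlProbabilityOfFirst(schools) {
  const u = schools.map((s) => -beta.distance * s.distanceKm + beta.match * s.match);
  const maxU = Math.max(...u);
  const expU = u.map((v) => Math.exp(v - maxU));
  return expU[0] / expU.reduce((a, b) => a + b, 0);
}

// Move school A from 1.8 km to 2.2 km, across the 2.0 km cutoff.
const rows = [1.8, 1.9, 2.0, 2.1, 2.2].map((d) => {
  const schools = [{ name: "A", distanceKm: d, match: 0.9 }, schoolB];
  return { d, heuristic: heuristicPick(schools), pA: mnlProbabilityOfFirst(schools) };
});

for (const r of rows) {
  console.log(`d=${r.d} heuristic=${r.heuristic} mnlP(A)=${r.pA.toFixed(3)}`);
}
```

The MNL probability of choosing A drifts down by a few percentage points per step, while the heuristic household switches from A to B the moment A crosses the cutoff.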

The authors are appropriately careful: even the heuristic model fits parts of the register poorly, which suggests that some household heterogeneity is still unmodeled. But the qualitative ranking (heuristic ahead of rational) is robust.

What college-sim already does

Our college-sim engine sits squarely in the rational-MNL camp on the student-decision side. The function studentFinalDecisions() (sim.js:4919) walks each student's set of acceptances and scores every offer as a weighted sum: prestige (a normalized cross-admit Elo with archetype-specific bonuses), archetype-fit, a legacy bonus when applicable, uniform personal noise, a Chetty-style income-yield term, and an in-state preference. The student picks the highest-scoring acceptance. This has the same additive random-utility structure as an MNL model, with our specific feature vector and weights. Strictly speaking it is not a softmax, since MNL proper corresponds to Gumbel-distributed rather than uniform noise, but the household it describes is a utility maximizer all the same.

Our calibration target is also different. Where Dignum and colleagues fit against household-level register data, college-sim's initColleges() (sim.js:3260) loads per-college threshold deltas from a fixed-point iteration that targets aggregate Common Data Set acceptance rates, not individual-student choice records. We do not have a CBS-equivalent for US college admissions: student-level enrollment data is fragmented and protected, and the closest analog (the NSC StudentTracker file or institutional-research extracts) is not a public register.
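
For readers unfamiliar with this style of calibration, the idea can be sketched in a few lines. This is a generic illustration, not the code in initColleges(): the applicant scores, the base threshold, the target rate, and the step size are all hypothetical.

```javascript
// Fixed-point calibration of one threshold delta against an aggregate target.
// Hypothetical applicant scores: 1,000 evenly spaced values in (0, 1).
const scores = Array.from({ length: 1000 }, (_, i) => (i + 0.5) / 1000);
const baseThreshold = 0.5; // hypothetical uncalibrated admit threshold
const targetRate = 0.2; // hypothetical published acceptance rate

// Simulated acceptance rate at a given threshold.
const acceptanceRate = (threshold) =>
  scores.filter((s) => s >= threshold).length / scores.length;

let delta = 0;
let iterations = 0;
while (iterations < 100) {
  const gap = acceptanceRate(baseThreshold + delta) - targetRate;
  if (Math.abs(gap) < 0.002) break; // close enough to the aggregate target
  delta += 0.5 * gap; // admitting too many raises the threshold, too few lowers it
  iterations++;
}

console.log(`delta=${delta.toFixed(4)} after ${iterations} iterations`);
```

The loop only ever sees one aggregate number per college, which is why this approach is fast and stable but cannot say anything about which individual students chose which colleges.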

There is a partial mitigation in generateStudents() (sim.js:2237). Students are bucketed into six behavioral archetypes (stem_spike, humanities_spike, arts_spike, athletic_spike, well_rounded, average_academic), and each archetype has its own preference profile and feature distribution. That architecture pre-encodes a coarse form of preference heterogeneity at the population level: an arts_spike student values a fit-to-arts college very differently from how a well_rounded student does, even though within an archetype the final decision is MNL-shaped. It is not the same as a Gigerenzer-style lexicographic household, but it is a step away from a single homogeneous utility function.

What Dignum does that we don't (yet)

Two extensions are immediately suggested.

The first is methodological: NRE-style simulation-based inference is plausible for college-sim if and when student-level decision data becomes available. Right now we calibrate to CDS aggregates by fixed-point iteration, which is fast and stable but discards information about which students chose which colleges. If we were to acquire even an anonymized panel of (student profile, application list, acceptance set, enrollment) tuples — for example through a partnership with a counseling network — NRE would let us fit hook multipliers, prestige weights, and noise scales jointly against that micro-level register, rather than tuning each in isolation against a marginal target.

The second is substantive: the rational-versus-heuristic model selection is potentially testable on swipe.html user-interaction logs once we collect enough of them. A swipe interaction is, behaviorally, much closer to a lexicographic filter than to a softmax over scalar utility — users glance at one or two salient attributes and decide. If we record swipe sequences alongside the user's profile, we can fit both an MNL-shaped utility model and a lexicographic-heuristic model to the swipe register and run the same model-comparison machinery Dignum and colleagues use. That is a research-grade extension rather than a product feature, and it is sketched in the broader roadmap at cas_abm_extensions_roadmap.html.

Run the reproduction yourself

A simplified Node.js reproduction is in research2/cas-abm-references/reproductions/02-dignum-dutch-school-choice/. It uses 2,000 synthetic households and 10 schools, and replaces NRE with direct log-likelihood maximization, but exercises the same model-selection logic: when the ground truth is heuristic, the heuristic fit wins; when it is rational, MNL wins. See the README in that folder for run instructions.

Citation

Dignum, E., Boterman, W., Flache, A., & Lees, M. (2025). Empirically estimating an agent-based model of school choice on household-level register data. Journal of Artificial Societies and Social Simulation, 28(4), 8. https://www.jasss.org/28/4/8.html. DOI: 10.18564/jasss.5798.