The Price of Fairness, from Kidneys to College Admissions
dickerson_2014_price_of_fairness.md · 1,808 words · 7 min read
Contents
The Price of Fairness, from Kidneys to College Admissions
How a 2014 paper on kidney exchange formalized a question every admissions office quietly answers — what does it cost the bottom line to give somebody else a chance?
The kidney exchange problem
A patient with end-stage renal disease often arrives with a willing live donor — a spouse, a sibling, a friend — whose kidney turns out to be incompatible. Blood type is wrong, or HLA antigens cross-react, and the transplant cannot proceed directly. Kidney paired donation (KPD) solves this by chaining incompatible pairs together. If pair A's donor is compatible with pair B's patient, and pair B's donor is compatible with pair A's patient, the two operations are scheduled simultaneously and both patients receive a kidney. Longer cycles work too, and so do "altruistic chains" started by a non-directed donor with no associated patient.
Formally, KPD is a graph problem. Each pair becomes a vertex. A directed edge from pair A to pair B exists if A's donor is medically compatible with B's patient — a question of ABO blood type, HLA tissue typing, and the patient's panel-reactive antibody score (CPRA). The clearing problem is to find a vertex-disjoint packing of cycles (and altruistic-donor chains) that maximizes some objective. The standard objective is maximum weight matching: choose the cycle packing that maximizes the total number of transplants (or, in weighted versions, the sum of expected graft survival years). The problem is NP-hard in general, and the United Network for Organ Sharing (UNOS) Kidney Paired Donation Pilot Program clears its national pool by solving an integer program every two weeks.
Maximum-weight clearing has an uncomfortable side effect. High-CPRA patients — the most highly sensitized, who carry antibodies against most of the donor population — are systematically underserved. By construction, very few donors are HLA-compatible with them, so they appear as low-degree vertices in the compatibility graph. A max-weight cycle packer correctly notices that picking a cycle through a low-degree vertex blocks fewer alternative cycles when it picks a cycle through a high-degree vertex, and routes around the sensitized patient. The patient was already medically unlucky. The clearing algorithm makes them mechanism-unlucky too.
This is the situation Dickerson, Procaccia, and Sandholm walked into in 2014.
Two fairness criteria
The paper introduces two formal fairness objectives that compete with raw efficiency.
Lexicographic fairness is the strict version. Partition patients into a sensitized set H (e.g. CPRA ≥ 0.8) and an unsensitized set L. A lexicographically fair matching first maximizes the number of H-patients matched, and then — subject to that constraint — maximizes the total number of matches. There is no tradeoff parameter. The sensitized come first, full stop, and any efficiency the algorithm can scrape together for the unsensitized is welcome but secondary. Lexicographic fairness is the kidney-exchange analogue of "no patient left behind."
α-weighted fairness (the paper writes it as a blend in [0, 1]) is the soft version. The clearing objective becomes a convex combination
objective = α · (matches for H) + (1 − α) · (matches for L)
or, in a slight reframing the authors also study, a blend of total matches and a fairness term. At α = 0 the planner is a pure utilitarian and reproduces max-weight clearing. At α = 1 the planner is a pure egalitarian over the sensitized class. Intermediate α traces out a Pareto frontier between total transplants and equity for the most vulnerable patients. The α-fair objective remains an integer program — the change is just to the coefficients on each cycle's contribution — so the same ILP machinery the pilot program already runs can solve it.
The price of fairness
The central object of the paper is the price of fairness (PoF): the relative loss in efficiency, measured as fraction of total matches forgone, when one switches from max-weight clearing to a fairness-constrained clearing. If max-weight produces M* matches and lexicographically fair clearing produces M_LEX, then PoF_LEX = (M* − M_LEX) / M*. A PoF of 0.05 means the fairness-constrained mechanism left 5% of total transplants on the table to redistribute toward sensitized patients.
The paper proves worst-case bounds. In adversarially constructed compatibility graphs the price of fairness can be substantial. But the empirical headline — and the reason the paper is cited far outside of computational mechanism design — is that on realistic graphs, the price of fairness is small. On UNOS Kidney Paired Donation Pilot Program data and on synthetic graphs from the Saidman, sparse, and dense generators that the field uses for benchmarking, the price of fairness is typically less than 5% of total matches. Lexicographic and moderate-α fairness redistributes a meaningful share of transplants from low-CPRA to high-CPRA patients while leaving the total match count almost untouched.
That result is consequential because it removes the strongest argument against fairness-aware clearing. The medical community had assumed that prioritizing the hardest-to-match patients would visibly shrink the program. Dickerson et al. showed it does not — at least not on the graphs that real exchanges actually see — and gave clearing-house operators a calibration knob (α) and a measurable number (PoF) to defend whatever choice they make.
Why this matters for admissions
College admissions has a structural analogue that is hiding in plain sight.
If we treat raw academic index — some weighted combination of GPA and SAT — as the "weight" on each applicant, then the academically max-weight admitted class is the one a pure ranker would pick: the top N by index, full stop. No college actually does that. Every selective institution tilts the matching away from the academic max-weight in service of an institutional fairness or composition objective. The tilts have names. Legacy preferences redistribute seats toward applicants whose parents attended. Recruited-athlete slots redistribute toward students who fill rosters. Donor consideration redistributes toward families connected to development. First-generation flags and Pell-eligible flags redistribute toward students from non-college-going households and low-income brackets. Geographic diversity bonuses redistribute toward states with few applicants. Major balancing, gender balancing, and post-SFFA shifts in race-conscious review redistribute along still other axes. Each one is a fairness term in α-weighted form, with a different sensitized class H and a different α picked by the dean.
The question Dickerson et al. ask of kidney exchange is exactly the question one should ask of admissions: what does each fairness term cost the academic-index objective? If a college turns its legacy multiplier off, how much does the admitted-class mean SAT move? If athlete slots are removed, how much does the admitted-class mean GPA move? Those numbers are price-of-fairness numbers. They have the same shape as the kidney-exchange PoF, and they are the right way to talk about hooks without either defending them as costless or condemning them as ruinous.
What college-sim already does
The simulation has all the machinery wired. The hook multipliers live in a single tier-indexed table, HOOK_MULT_BY_TIER at sim.js:3742, with rows for donor, athlete, and legacy and columns for tiers 1 through 7. Donor multipliers run from 12.0 at HYPSM down to 2.0 at the broadest tier; athletes 15.0 down to 1.5; legacies 5.7 down to 1.3. First-generation targets are encoded separately at sim.js:3733 as FIRST_GEN_TARGET_BY_TIER, soft-target shares ranging from 0.15 at the most selective tiers to 0.20 further down.
These multipliers are consumed by computeAdmissionScore() at sim.js:3873, which builds an admit logit from additive components (academic + EC + essay + fit + feeder + hook + round + DI + yield-penalty + in-state) and pushes the sum through a sigmoid to produce P(admit). Hooks enter as logs of multipliers, exactly the right shape for a Dickerson-style sweep: setting a multiplier to 1.0 zeroes out its log contribution and is the clean ablation.
Critically, this branch is gated. At sim.js:3906 the hook accumulation is wrapped in if (SIM_MODE !== 'personalized'). In personalized mode, hook routing is handled by a separate pool mechanism (the Phase 20 redesign) rather than logit multipliers. That gating means a price-of-fairness diagnostic naturally lives in classic mode — the canonical 20-HS, 55-college calibration setting where each multiplier is one number on one line and ablating it is a one-character edit.
What Dickerson does that we don't (yet)
What the simulation does not yet do is run the ablation. A proper price-of-fairness sweep would, for each hook in turn, vary the strength parameter from its current value down to 1.0 (off) in steps, run a Monte Carlo batch at each setting, and record two outputs:
- The admitted-class mean academic index (and its tier-by-tier breakdown) at each strength. The change between full-strength and zero-strength is the academic price of fairness for that hook.
- The admitted-class composition along the relevant axis — share of legacies admitted, share of recruited athletes, share of first-generation students, share of Pell-eligible — at each strength. This is the redistributive benefit being purchased by paying the price.
The output is a table per hook: rows are strength settings, columns are mean academic index and target-class share. The headline number is what the paper would call the lexicographic price of fairness — efficiency lost when the hook is at full strength versus off. With six hooks across seven tiers, the result is publishable narrative content directly: a price tag, in mean-SAT points or admit-class GPA percentile, for each fairness lever the simulation models. This is one of the three Tier-1 narrative-output extensions in the planning document; for the broader roadmap, see the CAS/ABM extensions roadmap.
The result will not look like the kidney-exchange PoF in absolute magnitude — admissions hooks are larger logit shifts than kidney CPRA mismatches, and the academic index is a continuous quantity rather than a binary match. But the shape of the analysis is the same, and the reason for running it is the same: to replace assertions about hooks with numbers about hooks.
Run the reproduction yourself
A simplified Node.js reproduction lives at /research2/cas-abm-references/reproductions/05-dickerson-kidney-fairness/. It generates a 200-pair synthetic compatibility graph from a toy ABO + CPRA model, packs cycles greedily with random restarts and a one-eject local search rather than CPLEX ILP, and sweeps α through five values. The output is a Markdown table showing total matches, high-CPRA matches, high-CPRA matched fraction, and PoF at each α. It uses synthetic generators rather than UNOS pilot data by design, but it reproduces the qualitative shape: total matches dip by roughly 1% as α rises while the high-CPRA matched fraction creeps up, and the small-PoF headline survives the simplification.
Citation
Dickerson, J. P., Procaccia, A. D., and Sandholm, T. (2014). Price of Fairness in Kidney Exchange. In Proceedings of the 13th International Conference on Autonomous Agents and Multi-Agent Systems (AAMAS), Paris, pp. 1013–1020.