ADR-0042: Unsatisfiable-regime domain choice is sticky at equal coverage — switch only for strictly greater
Status
Accepted, 2026-06-11.
Context
ADR-0041 closed the sub-machine-gang cascade; the residual
(bigfleet-uber #55) is genuinely multi-machine GPU gangs — 2–256
whole a3-highgpu-8g nodes, rack-coherent — that no single rack can
host. The intended behaviour for those is the ADR-0040 Addendum’s
concentrate-then-park: assemble as much of the gang as the best
domain allows, hold the rest pending, age in the shortfall buffer
(which escalates — the paper’s answer to unsatisfiable demand), and
go quiet.
The diagnostic (#56) proved they never go quiet, and named the mechanism. Shortfall recording is a passive report — nothing consumes it to suppress re-attempts — and every cycle Phase 1 re-derives the gang’s joint domain choice from scratch. In the unsatisfiable regime that choice has no incumbent preference: ADR-0041’s sticky-domain rider was deliberately confined to satisfiable buckets so most-covering could keep concentrating gangs toward genuinely bigger domains. At uber-5k the GPU racks are dozens of identical-total buckets, so most-covering ties constantly, and 20 clusters’ sequential claim walks perturb which blocks look acquirable to whom — the tie-break (count, then lexicographic value) resolves differently cycle to cycle. The gang abandons its partial assembly, acquires toward a different identical rack (scattered supply guarantees the attempt is never empty), and Phase 3 reclaims the stranded machines: both halves of the measured ~27/sec Bootstrap↔Reclaim churn, sustained forever by a static set of ~190 unsatisfiable gangs.
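The flip can be reproduced in miniature. The Go sketch below is a hypothetical, simplified stand-in for the tie-break described above, not the real chooser: the `bucket` type, its fields, `better`, `choose`, and `twoCycles` are illustrative names. It assumes that "count" is the number of machines that look acquirable in the current snapshot, that a higher count wins, and that the lexicographically smaller value wins last; the ADR does not spell these out.

```go
package main

// bucket is a hypothetical, simplified view of one candidate rack.
type bucket struct {
	Value    string // domain label; compared lexicographically as the last resort
	Coverage int    // capped coverage of the gang in this rack
	Count    int    // assumed: machines that look acquirable in this snapshot
}

// better reports whether a beats b under the memoryless order described
// above: most-covering, then count, then value.
// Assumption: higher count wins, smaller value wins.
func better(a, b bucket) bool {
	if a.Coverage != b.Coverage {
		return a.Coverage > b.Coverage
	}
	if a.Count != b.Count {
		return a.Count > b.Count
	}
	return a.Value < b.Value
}

// choose returns the winning bucket's value, or "" for an empty snapshot.
func choose(snapshot []bucket) string {
	if len(snapshot) == 0 {
		return ""
	}
	best := snapshot[0]
	for _, b := range snapshot[1:] {
		if better(b, best) {
			best = b
		}
	}
	return best.Value
}

// twoCycles models two consecutive snapshots of identical-total racks in
// which a contending cluster's claim walk shifts one acquirable machine
// from rack-a to rack-b. Coverage ties in both cycles.
func twoCycles() (first, second string) {
	cycle1 := []bucket{{"rack-a", 8, 3}, {"rack-b", 8, 2}}
	cycle2 := []bucket{{"rack-a", 8, 2}, {"rack-b", 8, 3}}
	return choose(cycle1), choose(cycle2)
}
```

Under these assumptions `twoCycles` picks rack-a in the first cycle and rack-b in the second, although neither rack's coverage changed. Nothing in the order refers to where the gang's partial assembly already sits, which is the gap the Decision addresses.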
The closed-loop sim’s park tests pass because their shapes reach zero-acquisition and stop; the cloud’s contended, scattered supply keeps acquisition narrowly non-zero. The missing piece is not a new suppression state — it is that the domain choice has no memory exactly where memory is what parking means.
Decision
In ChooseSameBucket’s unsatisfiable regime, a Need switches
domains only for strictly greater coverage. Among buckets of
equal (capped) coverage, one containing the Need’s creditable supply
— the domain where its concentrated partial assembly already lives —
wins before the count and lexicographic tie-breaks.
Stateless, like every rule in the chooser: the incumbent signal is
CreditableCount > 0, already present in the joint fold (the gang’s
previously-concentrated machines are its cluster’s Configured,
unclaimed when its priority turn arrives). Phase 3 consumes the same
chooser (ADR-0041), so keep/reclaim parity is automatic.
What this preserves, deliberately:
- Most-covering still wins when it is strictly better. A gang holding 2 machines on rack A still moves to an empty 3-machine rack B — concentration toward genuinely bigger domains is the Addendum’s point, and the no-oscillation shapes depend on it.
- Priority remains the sole throttling mechanism (bigfleet.md §16). No aging thresholds, no re-probe cadence, no per-Need suppression state. The Need keeps participating in every cycle at its priority; it simply stops flip-flopping between equivalent domains, which makes in-domain acquirables exhaust and acquisition reach zero naturally.
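The rule and the two properties it preserves can be sketched as a comparator. This is a minimal illustration, not ChooseSameBucket itself: `bucket`, `betterUnsatisfiable`, and `chooseUnsatisfiable` are hypothetical names, and the sketch assumes that a higher count and a lexicographically smaller value win the pre-existing tie-breaks.

```go
package main

// bucket is a hypothetical, simplified view of one candidate domain.
type bucket struct {
	Value           string // domain label; lexicographic last resort
	Coverage        int    // capped coverage of the Need in this domain
	CreditableCount int    // machines the Need already holds here
	Count           int    // assumed pre-existing tie-break; higher wins
}

// betterUnsatisfiable reports whether a beats b in the unsatisfiable regime:
// strictly greater coverage first, then the incumbent (CreditableCount > 0),
// then count, then value. It reads only the current snapshot, so it carries
// no state between cycles.
func betterUnsatisfiable(a, b bucket) bool {
	if a.Coverage != b.Coverage {
		return a.Coverage > b.Coverage
	}
	aIncumbent, bIncumbent := a.CreditableCount > 0, b.CreditableCount > 0
	if aIncumbent != bIncumbent {
		return aIncumbent
	}
	if a.Count != b.Count {
		return a.Count > b.Count
	}
	return a.Value < b.Value
}

// chooseUnsatisfiable returns the winning domain, or "" if there are none.
func chooseUnsatisfiable(buckets []bucket) string {
	if len(buckets) == 0 {
		return ""
	}
	best := buckets[0]
	for _, b := range buckets[1:] {
		if betterUnsatisfiable(b, best) {
			best = b
		}
	}
	return best.Value
}
```

With rack-a at coverage 2 holding 2 creditable machines and rack-b at coverage 2 holding none, the sketch stays on rack-a even when rack-b has the higher count. Raise rack-b's coverage to 3 and it wins, matching the first bullet above. When no bucket holds creditable supply, the order degrades to the earlier count and value tie-breaks.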
Predicted consequence, to validate: with the domain pinned at ties, the gang concentrates once, in-domain acquirable supply exhausts, Phase 1’s residual goes quiet (no acquisition), Phase 3 keeps the credited assembly (parity), and the Need ages in the shortfall buffer as designed — both churn halves die without new machinery.
Escalation path
If the contention canary or the cloud re-run still shows residual churn (e.g. coverage totals genuinely fluctuating rather than tying), the author-approved fallback is explicit suppression: Needs classified structurally unsatisfiable on the snapshot and aged ≥K cycles stop driving acquisition, with a defined re-probe cadence. That variant costs per-Need state and two tunables; it is not built until the stateless rule is shown insufficient.
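To make the cost of the fallback concrete, here is a rough Go sketch of what the per-Need state and the two tunables might look like. Nothing here exists in the codebase: `suppression`, `policy`, `K`, `ReprobeEvery`, `observe`, and `drivesAcquisition` are hypothetical names, and the re-probe schedule (one attempt every `ReprobeEvery` cycles after the threshold) is only one possible reading of "a defined re-probe cadence".

```go
package main

// suppression is hypothetical per-Need state for the fallback variant.
type suppression struct {
	UnsatisfiableCycles int // consecutive cycles classified structurally unsatisfiable
}

// policy carries the two tunables the fallback would introduce.
type policy struct {
	K            int // age threshold in cycles
	ReprobeEvery int // cycles between re-probes once suppressed
}

// observe records one cycle's classification; any satisfiable snapshot
// resets the age.
func (s *suppression) observe(unsatisfiable bool) {
	if unsatisfiable {
		s.UnsatisfiableCycles++
		return
	}
	s.UnsatisfiableCycles = 0
}

// drivesAcquisition reports whether the Need may drive acquisition this
// cycle: always below the threshold, and afterwards only on re-probe cycles.
func (p policy) drivesAcquisition(s suppression) bool {
	if s.UnsatisfiableCycles < p.K {
		return true
	}
	if p.ReprobeEvery <= 0 {
		return false // no cadence configured: stay suppressed
	}
	over := s.UnsatisfiableCycles - p.K
	return over > 0 && over%p.ReprobeEvery == 0
}
```

Even in this reduced form the variant needs a counter per Need and two numbers to tune, which is the cost the stateless rule avoids.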
Consequences
- ChooseSameBucket’s documented total order gains one clause: satisfiable > creditable-among-satisfiable (ADR-0041 rider) > smallest-satisfiable-total / most-covering > creditable-at-equal-coverage (this ADR) > count > value.
- Acceptance is cloud-decided, sim-pinned: an honest deviation from the usual simulator-first pattern. The multi-cluster contention canary (TestClosedLoop_MultiClusterGangContention_ParksQuiet) pins the parking contract but does NOT discriminate pre/post-rule: the deterministic sim resolves the count/value tie-break identically every cycle, so it cannot express the perturbation that flips the choice in the cloud (snapshot deltas from in-flight transitions and contending clusters). The discriminator is one mechanism-validation cloud run carrying the per-gang probe (phasedump-gated): per cycle per churning group, the chosen domain and residual. Expected post-fix: chosen domains stop flipping, Bootstrap/Reclaim → ≈0 post-fill while p1_unsatisfied legitimately stays at the unsatisfiable-gang count. If domains flip on strictly-fluctuating coverage rather than ties, this rule is insufficient and the escalation path engages.
- The per-rack-capacity question (gangs of 64–256 nodes vs racks of ~52) is untouched: those classes park instead of churning, and whether the catalog should model them at rack scope at all is the separate harness-realism follow-up.
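The expected post-fix signal from the per-gang probe (chosen domains stop flipping) could be checked mechanically. The sketch below assumes a hypothetical record shape; `probeRecord`, its fields, and `domainFlips` are illustrative and do not describe the actual phasedump output.

```go
package main

// probeRecord is a hypothetical shape for one probe entry: one churning
// group in one cycle, with the domain it chose and its residual.
type probeRecord struct {
	Cycle    int
	Group    string
	Domain   string
	Residual int
}

// domainFlips counts, per group, how often the chosen domain differs from
// that group's previous record. Records are assumed to arrive in cycle order.
func domainFlips(records []probeRecord) map[string]int {
	last := make(map[string]string)
	flips := make(map[string]int)
	for _, r := range records {
		if prev, seen := last[r.Group]; seen && prev != r.Domain {
			flips[r.Group]++
		}
		last[r.Group] = r.Domain
	}
	return flips
}
```

A run in which every group's flip count is zero after fill is consistent with the prediction. Non-zero counts would need the coverage values inspected to tell ties from strictly fluctuating coverage, which this sketch does not record.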