The Accountability Horizon: An Impossibility Theorem for Governing Human-Agent Collectives
Haileleol Tibebu
Abstract
Existing accountability frameworks for AI systems, legal, ethical, and regulatory, rest on a shared assumption: for any consequential outcome, at least one identifiable person had enough involvement and foresight to bear meaningful responsibility. This paper proves that agentic AI systems violate this assumption not as an engineering limitation but as a mathematical necessity once autonomy exceeds a computable threshold. We introduce Human-Agent Collectives, a formalisation of joint human-AI systems where agents are modelled as state-policy tuples within a shared structural causal model. Autonomy is characterised through a four-dimensional information-theoretic profile (epistemic, executive, evaluative, social); collective behaviour through interaction graphs and joint action spaces. We axiomatise legitimate accountability through four minimal properties: Attributability (responsibility requires causal contribution), Foreseeability Bound (responsibility cannot exceed predictive capacity), Non-Vacuity (at least one agent bears non-trivial responsibility), and Completeness (all responsibility must be fully allocated). Our central result, the Accountability Incompleteness Theorem, proves that for any collective whose compound autonomy exceeds the Accountability Horizon and whose interaction graph contains a human-AI feedback cycle, no framework can satisfy all four properties simultaneously. The impossibility is structural: transparency, audits, and oversight cannot resolve it without reducing autonomy. Below the threshold, legitimate frameworks exist, establishing a sharp phase transition. Experiments on 3,000 synthetic collectives confirm all predictions with zero violations. This is the first impossibility result in AI governance, establishing a formal boundary below which current paradigms remain valid and above which distributed accountability mechanisms become necessary.
AI Impact Assessments (3 models)
Scientific Impact Assessment: "The Accountability Horizon: An Impossibility Theorem for Governing Human-Agent Collectives"
1. Core Contribution
The paper claims to prove an impossibility result for AI governance: when compound autonomy in human-AI collectives exceeds a computable threshold (the "Accountability Horizon") and the interaction graph contains mixed human-AI feedback cycles, no accountability framework can simultaneously satisfy four axioms—Attributability, Foreseeability Bound, Non-Vacuity, and Completeness. The formalization introduces Human-Agent Collectives (HACs) as tuples within structural causal models, with autonomy characterized along four information-theoretic dimensions (epistemic, executive, evaluative, social).
The paper positions this as analogous to Arrow's Impossibility Theorem for social choice or the CAP theorem for distributed systems—a fundamental boundary result that tells governance designers what is structurally impossible rather than merely difficult.
2. Methodological Rigor
The formal architecture is carefully constructed, with clear definitions building from agents and environments through autonomy profiles to the collective SCM. The proof structure follows a legible path: Lemma 1 (epistemic dilution in cycles), Lemma 3 (causal non-additivity), and Lemma 4 (autonomy-accountability bound) lead to the main theorem.
However, several concerns arise:
The theorem's strength is less than advertised. The result is heavily conditioned on specific modeling assumptions—the mixture-model structure (Assumption 1), contraction (Assumption 3), and faithfulness (Assumption 2). While the paper discusses robustness (δ-perturbation, information-geometric generalization), the impossibility holds within a particular formal model, not with the universality of Arrow's theorem. The paper acknowledges this but still draws the analogy prominently.
The axiom system invites scrutiny. The Completeness axiom (responsibility shares must sum to exactly 1) is the load-bearing axiom driving the impossibility. The paper grounds it in legal precedent, but it is arguably the most debatable of the four: many real governance frameworks tolerate residual uncertainty or shared liability structures that do not require exhaustive individual decomposition. The paper's own discussion of relaxation to γ < 1 (which merely shifts the threshold) and of coalition-based approaches (which dissolve the impossibility) suggests the result concerns the limits of individual-locus accountability rather than of accountability per se.
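To make the tension between Completeness and the Foreseeability Bound concrete, the following is a minimal sketch, not the paper's construction. It assumes each agent's foreseeability is summarised by a single cap in [0, 1] (the hypothetical `caps` list) and that a relaxed Completeness target `gamma` replaces 1. Attributability is not modelled. Under those assumptions, a valid allocation exists only when the caps sum to at least `gamma`.

```python
from typing import List, Optional, Sequence


def feasible_allocation(caps: Sequence[float], gamma: float = 1.0) -> Optional[List[float]]:
    """Return shares r_i with 0 <= r_i <= caps[i] and sum(r) == gamma, or None.

    caps  : hypothetical per-agent foreseeability caps (an assumption of this sketch)
    gamma : total responsibility to allocate (1.0 is strict Completeness)
    """
    total = sum(caps)
    # If the caps cannot cover gamma, the two axioms cannot both hold.
    if total <= 0 or total < gamma:
        return None
    # Proportional scaling keeps every share at or below its cap because gamma / total <= 1.
    scale = gamma / total
    return [c * scale for c in caps]


if __name__ == "__main__":
    print(feasible_allocation([0.5, 0.5, 0.5]))             # caps sum to 1.5: feasible
    print(feasible_allocation([0.3, 0.3, 0.3]))             # caps sum below 1: None
    print(feasible_allocation([0.3, 0.3, 0.3], gamma=0.8))  # relaxed target: feasible
```

Lowering `gamma` makes the third case feasible without changing the caps, which is one way to read the remark that relaxing Completeness shifts the threshold without removing it.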
The "sharp phase transition" is an artifact of the model's discreteness. The threshold Λ̂* = 1 - 1/|C_min| follows algebraically from the axiom structure: it is the point where the epistemic budget exactly equals 1. This is mathematically clean, but the sharpness depends on the axioms being stated as hard constraints rather than soft desiderata.
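The threshold formula is simple enough to evaluate directly. The sketch below assumes that |C_min| denotes the number of agents on the shortest mixed human-AI feedback cycle, which is this review's reading of the notation; the function name is hypothetical. One interpretation, again an assumption and not taken from the paper, is that each agent on the cycle retains a foreseeability cap of 1 - Λ, so the total budget |C_min|(1 - Λ) equals 1 exactly at Λ = 1 - 1/|C_min|.

```python
def accountability_horizon(min_cycle_len: int) -> float:
    """Evaluate 1 - 1/|C_min| for a given minimal cycle length.

    min_cycle_len is assumed to be the number of agents on the shortest
    mixed human-AI feedback cycle; a mixed cycle needs at least two agents.
    """
    if min_cycle_len < 2:
        raise ValueError("a mixed human-AI cycle needs at least two agents")
    return 1.0 - 1.0 / min_cycle_len


if __name__ == "__main__":
    for n in (2, 3, 4, 10):
        print(n, accountability_horizon(n))
```

Under this reading the threshold rises towards 1 as the shortest cycle grows, so shorter feedback cycles reach the horizon at lower autonomy levels.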
The computational experiments validate internal consistency, not external validity. The 3,000 synthetic HACs confirm that the implementation matches the analytical predictions—which is expected since both encode the same mathematical structure. No empirical measurement of autonomy profiles from real systems is attempted.
3. Potential Impact
Positive contributions: The formalization of HACs provides useful vocabulary and structure for reasoning about human-AI systems. The identification of interaction topology (especially mixed feedback cycles) as a critical governance-relevant property is genuinely insightful. The Governance Trilemma (Corollary 7) offers a useful strategic framework for practitioners.
For AI governance policy: The paper could influence how regulators think about structural preconditions for accountability, particularly the insight that feedforward architectures are always governable while feedback architectures may not be. The computable threshold provides a concrete design constraint.
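The feedforward versus feedback distinction can be checked mechanically on a small interaction graph. The following is a minimal sketch under stated assumptions: the graph is a directed adjacency dictionary, `humans` names the human nodes, every other node is treated as an AI agent, and a "mixed feedback cycle" is taken to mean a simple directed cycle containing at least one node of each kind. The function name and graph encoding are illustrative, not the paper's.

```python
from typing import Dict, List, Optional, Set


def shortest_mixed_cycle(graph: Dict[str, List[str]], humans: Set[str]) -> Optional[List[str]]:
    """Return the shortest simple directed cycle with both a human and an AI node, or None."""
    best: Optional[List[str]] = None

    def dfs(start: str, node: str, path: List[str], on_path: Set[str]) -> None:
        nonlocal best
        # A cycle closed from this path has len(path) nodes; skip if it cannot improve.
        if best is not None and len(path) >= len(best):
            return
        for nxt in graph.get(node, []):
            if nxt == start:
                kinds = {n in humans for n in path}
                if len(kinds) == 2:  # both human and AI nodes are present
                    best = list(path)
            elif nxt not in on_path and nxt > start:
                # Only visit nodes larger than start so each cycle is explored
                # from its smallest node.
                on_path.add(nxt)
                path.append(nxt)
                dfs(start, nxt, path, on_path)
                path.pop()
                on_path.discard(nxt)

    for s in sorted(graph):
        dfs(s, s, [s], {s})
    return best


if __name__ == "__main__":
    feedback = {"h1": ["a1"], "a1": ["a2"], "a2": ["h1"], "a3": ["a4"], "a4": ["a3"]}
    feedforward = {"h1": ["a1"], "a1": ["a2"], "a2": []}
    print(shortest_mixed_cycle(feedback, {"h1"}))
    print(shortest_mixed_cycle(feedforward, {"h1"}))
```

The exhaustive search is exponential in the worst case, so it suits only small illustrative graphs. A result of `None` corresponds to the case with no mixed feedback cycle, which includes the feedforward case the paper treats as always governable.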
Limitations on impact: The gap between the formal model and real deployed systems is substantial. Estimating autonomy profiles from production systems—acknowledged as future work—is the crucial bridge. Without empirical validation, the framework risks remaining a theoretical exercise. The mixture-model assumption, while covering many current architectures, may not extend cleanly to future systems.
4. Timeliness & Relevance
The paper is well-timed. Agentic AI systems are being rapidly deployed, and governance frameworks (Singapore's, the EU AI Act, WEF Presidio) are actively being developed. The observation that these frameworks implicitly rely on the "Localisability Assumption" is valuable. However, the practical governance community may find the formal apparatus inaccessible, and the constructive alternatives (distributed accountability calculus) are deferred to future work, which limits immediate applicability.
5. Strengths & Limitations
Key Strengths:
Notable Weaknesses:
6. Overall Assessment
This is an ambitious paper that attempts something genuinely valuable: bringing formal rigor to AI governance. The mathematical framework is competently constructed and the core insight about topology-dependence of governability is real. However, the impossibility result's practical significance is tempered by its dependence on individual-locus accountability axioms and the absence of empirical grounding. The paper would benefit from being more measured in its claims about the universality and significance of the result.
Generated Apr 10, 2026
Comparison History (90)
Paper 1 likely has higher scientific impact due to methodological rigor, clear novelty, and direct real-world applicability. It proposes a unified, physically grounded sampling framework bridging diffusion generative models and random structure search, and reports substantial efficiency gains plus out-of-distribution effectiveness—highly timely for materials and molecular discovery with broad downstream impact in chemistry, materials science, and ML. Paper 2 is conceptually ambitious, but its claims (computable autonomy threshold, phase transition, “experiments” on synthetic collectives) are harder to validate and translate into actionable governance mechanisms, making near-term impact more uncertain.
Paper 1 has higher likely scientific impact due to strong methodological rigor and immediate, broad real-world applicability in materials/molecular discovery. It proposes a novel, physically grounded unification of diffusion generative models and random structure search, with clear empirical gains (order-of-magnitude sampling efficiency) and demonstrated out-of-distribution effectiveness—key for adoption in computational chemistry/materials. Paper 2 is conceptually novel and timely, but its impact depends heavily on the validity and acceptance of its formalization/axioms; the “impossibility” claim may be more contestable and less directly actionable experimentally.
Paper 2 introduces a fundamental mathematical impossibility theorem for AI governance, establishing a theoretical limit on accountability in autonomous systems. This interdisciplinary contribution spans AI, law, and ethics, offering a significant paradigm shift. In contrast, Paper 1 presents a solid but more incremental algorithmic improvement for VLM exploration, giving Paper 2 a broader and more enduring scientific impact.
Paper 1 is more likely to yield broad, near-term scientific impact: it introduces a large, realistic benchmark with substantial scale, task diversity, and concrete evaluation methodology that can be adopted by many labs to drive measurable progress in agentic systems and workspace automation. Its applications span software engineering, enterprise tooling, IR/retrieval, and agent evaluation, and it is timely given rapid growth in tool-using agents. Paper 2 is conceptually novel, but its impact depends on acceptance of strong formal assumptions and synthetic validation, with less immediate empirical leverage for the wider ML community.
Paper 2 has higher potential impact because it proposes a foundational impossibility theorem for AI governance with clear, broadly applicable formalism and a phase-transition-style boundary condition—concepts that could reshape legal/ethical/regulatory thinking across fields. Its novelty and breadth (spanning causality, information theory, multi-agent systems, governance) are high and timely given agentic AI deployment. Paper 1 is methodologically solid and useful, but mainly advances evaluation infrastructure within ML/agent benchmarking, likely yielding more incremental, domain-specific impact.
Paper 1 addresses a core challenge in LLM post-training—improving exploration and reasoning diversity—with a concrete, implementable framework (Poly-EPO) backed by empirical results on reasoning benchmarks. This directly advances the rapidly growing field of LLM reasoning and test-time compute scaling, with immediate practical applications. Paper 2 presents an interesting theoretical impossibility result for AI governance, but its impact is limited by reliance on synthetic experiments, strong assumptions that may not map cleanly to real-world governance, and a more niche audience. Paper 1's methodology is more likely to be widely adopted and built upon.
Paper 1 addresses a core challenge in LLM post-training—improving exploration and reasoning diversity—with a practical algorithmic framework (Poly-EPO) demonstrating empirical gains across benchmarks. This directly impacts the rapidly growing field of reasoning model training and test-time compute scaling. Paper 2 presents an interesting theoretical impossibility result for AI governance, but its impact is limited by reliance on synthetic experiments, strong assumptions in its axiomatization, and the gap between formal proofs and real governance practice. Paper 1's methodological contributions are more immediately actionable and broadly applicable to the ML community.
Paper 1 introduces a foundational impossibility theorem for AI governance, fundamentally challenging existing legal and ethical frameworks for autonomous systems. Its mathematical formalization of the 'Accountability Horizon' provides a broad, cross-disciplinary impact spanning computer science, law, and policy. While Paper 2 offers a valuable methodological advancement in medical imaging and Learning to Defer, Paper 1's theoretical boundary establishes a new paradigm for how society regulates and interacts with highly autonomous AI, yielding a higher potential scientific impact.
Paper 1 introduces a fundamental impossibility theorem for AI accountability, establishing a mathematical boundary for AI governance. This profound theoretical contribution has sweeping implications across AI safety, law, ethics, and regulation. Paper 2 offers a valuable but specialized methodological improvement for medical imaging ML, giving Paper 1 a much broader and more transformative potential scientific impact.
Paper 2 is more likely to have higher scientific impact due to immediate real-world applicability and methodological tangibility: it delivers a reusable human-in-the-loop system, a released codebase/UI, and a high-quality dataset enabling benchmarking and downstream research in LLMs, formal methods, and scientific computing. Its contributions (pipeline, dataset, evaluation, semantic-drift taxonomy) are actionable and broadly useful across domains. Paper 1 is conceptually novel, but its impact depends heavily on acceptance of new formal definitions/thresholds and synthetic validation, with less clear near-term uptake beyond governance theory.
Paper 2 establishes a foundational impossibility theorem in AI governance, mathematically proving the limits of current accountability frameworks for highly autonomous systems. This cross-disciplinary result has profound implications for AI ethics, law, and multi-agent systems. While Paper 1 offers a valuable tool and dataset for autoformalization in Lean, Paper 2's theoretical breakthrough provides broader, paradigm-shifting impact across multiple fields dealing with AI regulation, safety, and deployment.
Paper 2 introduces a fundamental mathematical impossibility theorem for AI governance, bridging computer science, ethics, law, and complex systems. Establishing a formal boundary for accountability has profound, paradigm-shifting implications across multiple disciplines. While Paper 1 provides valuable empirical contributions and a new benchmark for AI alignment, Paper 2's theoretical depth and potential to reshape the foundational assumptions of AI regulation give it a much higher ceiling for long-term scientific and societal impact.
Paper 1 addresses specification gaming in reasoning models with empirical rigor, an open-source evaluation suite, and actionable findings directly relevant to AI safety as RL-trained reasoning models proliferate. Its systematic study of what drives specification gaming (RL training, reasoning budget) provides immediately useful insights for the research community. Paper 2, while intellectually ambitious with its impossibility theorem for AI accountability, relies on synthetic experiments and abstract formalizations that may have limited practical uptake. Paper 1's empirical grounding, timeliness given the rapid deployment of reasoning models, and open-source contributions give it broader near-term impact.
Paper 1 offers a highly novel, foundational contribution: a formal impossibility theorem for accountability in human–AI collectives, with clear axioms and a phase-transition-style boundary (the “Accountability Horizon”). If valid, it would reframe AI governance by proving limits of audits/transparency and motivating new distributed accountability regimes—broad impact across AI safety, law, ethics, causality, and multi-agent systems, and highly timely given agentic AI. Paper 2 is plausible and useful for improving LLM reasoning, but is more incremental relative to active lines (meta-reasoning, memory/consolidation) and likely narrower in cross-field impact.
Paper 2 has higher estimated impact due to its conceptual novelty (an impossibility theorem/phase transition for accountability), broad cross-field relevance (AI governance, law, ethics, causality, information theory, multi-agent systems), and strong real-world implications for regulation and oversight of agentic AI. If the theorem and autonomy threshold are correct and general, it would reframe accountability discourse more fundamentally than Paper 1’s (valuable) self-play reinforcement improvements, which are impactful but mainly within ML training methodology and may face faster incremental displacement.
Paper 2 introduces a fundamental impossibility theorem for AI governance, bridging formal mathematics, causal modeling, and AI ethics. While Paper 1 offers a valuable empirical framework for improving LLM reasoning, Paper 2 establishes a theoretical boundary with profound implications across computer science, law, and policy. Impossibility theorems historically reshape their fields by defining structural limits. Because of its cross-disciplinary breadth, high novelty, and potential to shift the entire paradigm of AI regulation and multi-agent system design, Paper 2 has a significantly higher potential for long-lasting scientific impact.
Paper 2 establishes a formal impossibility theorem for AI accountability, a foundational theoretical contribution. While Paper 1 offers a highly practical framework for evaluating content moderation, Paper 2's mathematical proof of an 'Accountability Horizon' has far-reaching implications across AI safety, law, philosophy, and governance. Foundational impossibility results traditionally shape the long-term trajectory of their respective fields, giving Paper 2 a broader and more profound potential scientific impact.
Paper 1 introduces a fundamental impossibility theorem for AI governance, analogous to Arrow's theorem but for AI accountability. Its rigorous mathematical proof that highly autonomous AI systems fundamentally break existing accountability frameworks has profound implications across AI safety, law, ethics, and policy. While Paper 2 offers strong theoretical advancements in Explainable AI, Paper 1's cross-disciplinary breadth, timeliness regarding autonomous agents, and potential to reshape global AI regulatory paradigms give it a significantly higher potential for broad scientific and societal impact.
Paper 1 introduces a fundamental impossibility theorem for AI governance—the first of its kind—establishing a mathematically necessary boundary on accountability in human-AI systems. This has broad implications across AI policy, law, ethics, and regulation, making it highly timely given the rapid deployment of agentic AI. Its conceptual contribution (a sharp phase transition in governability) is likely to reshape discourse across multiple fields. Paper 2, while technically rigorous with machine-checked proofs, addresses a narrower formal verification question about workflow governance that, though valuable, impacts a more specialized audience.
Paper 1 introduces a fundamentally new impossibility theorem for AI governance—the first formal proof that accountability frameworks face inherent mathematical limits as AI autonomy increases. This has profound implications across AI policy, law, ethics, and computer science, establishing a theoretical boundary akin to Arrow's impossibility theorem for social choice. While Paper 2 makes a strong technical contribution to AI-generated text detection with impressive empirical gains, it addresses a narrower problem within an existing research paradigm. Paper 1's breadth of cross-disciplinary impact and foundational nature give it higher potential scientific impact.