Zero knowledge verification for frontier AI training is possible
Pierre Peigné, Ky Nguyen, Paul Wang
Abstract
Frontier AI governance frameworks increasingly use cumulative training compute as the primary criterion for designating high-impact models, but enforcement rests on self-reporting because no technical verification primitive for training exists. Any future international agreement on frontier AI faces the same problem at higher stakes: coordinated regulation of technologies with significant externalities has historically rested on technical verification, without which agreements are declaratory. Recent governance analyses judge zero-knowledge proofs a promising candidate but currently impractical at frontier scale [26, 4]. We argue the impracticality is paradigm-bound rather than fundamental, and propose a verification architecture for frontier dense pre-training combining a pre-committed training specification, inter-node network observations, and on-the-fly Merkle commitments of intermediate computation, verified through a zero-knowledge Virtual Machine (zkVM) with native BF16/FP32 precompiles. The proof checks the actual floating-point computation the GPU performed rather than a fixed-point approximation, and preserves model-architecture confidentiality through a private training specification. The protocol produces three proof types: a genesis proof at initialisation, in-training step proofs across the run, and ex-ante attestations enforcing policy-relevant claims as running invariants, turning the training record into a governance-enforceable artefact. We estimate a deployable proof of concept within approximately 36 months at single-digit-percent training-side overhead, against a six-to-ten-year cycle for verification-grade custom silicon. Thirteen open research and engineering problems are catalogued as a research agenda for external contribution.
AI Impact Assessments
Scientific Impact Assessment
Core Contribution
This paper proposes a zero-knowledge verification architecture for frontier AI pre-training that addresses a critical governance gap: the absence of any technical mechanism to verify claims about how frontier AI models were trained. The key architectural innovation is a shift from the existing ZK-ML paradigm (which operates over finite-field proxies) to a system that verifies *actual* BF16/FP32 floating-point GPU computation through a zkVM equipped with native precompiles. The architecture combines three trust anchors: pre-committed training specifications, inter-node network observations via physical TAPs or attested SmartNICs, and on-the-fly Merkle commitments of intermediate computation.
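To make the third trust anchor concrete, here is a minimal sketch of a Merkle commitment over chunks of intermediate computation, with an inclusion proof that a verifier can check against the committed root. It is an illustration only, not the paper's construction: SHA-256, the 0x00/0x01 domain-separation prefixes, the duplicate-last-node rule for odd levels, and the names `merkle_root`, `merkle_proof`, and `verify_inclusion` are all assumptions made for this example.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _leaf_hashes(leaves):
    # Prefix 0x00 marks leaves, 0x01 marks interior nodes (assumed convention).
    return [_h(b"\x00" + leaf) for leaf in leaves]


def _parent_level(level):
    return [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Return the Merkle root of a non-empty list of byte strings."""
    level = _leaf_hashes(leaves)
    while len(level) > 1:
        if len(level) % 2:              # simplification: duplicate the last node
            level.append(level[-1])
        level = _parent_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Return [(sibling_hash, sibling_is_left), ...] from leaf to root."""
    level = _leaf_hashes(leaves)
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1             # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = _parent_level(level)
        index //= 2
    return proof


def verify_inclusion(leaf, proof, root):
    """Recompute the path from a claimed leaf and compare with the root."""
    node = _h(b"\x00" + leaf)
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = _h(b"\x01" + pair)
    return node == root


# Hypothetical stand-ins for serialized activation chunks.
chunks = [f"activation-chunk-{i}".encode() for i in range(5)]
root = merkle_root(chunks)
proof = merkle_proof(chunks, 3)
ok = verify_inclusion(chunks[3], proof, root)
tampered_ok = verify_inclusion(b"tampered-chunk", proof, root)
```

Under these assumptions the trainer would publish only `root`; a later challenge on chunk 3 is answered with the chunk and its logarithmic-size `proof`, and a substituted chunk fails the check.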
The central technical insight enabling tractability is the "hint-and-verify" model: rather than re-executing training inside the proof system (~10^5× overhead per prior estimates), the zkVM verifies that trainer-supplied intermediate values are consistent with committed Merkle roots and declared operations. This reduces the constraint budget by roughly five orders of magnitude compared to full re-execution. Interactive per-entry sampling of GEMM outputs further compresses verification cost: k = 4,605 sampled entries leave a miss probability of about 10^{-20} when 1% of the entries deviate.
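The sampling figures quoted above can be checked with a short calculation. The sketch below assumes the k entries are drawn independently and uniformly, so that a deviating fraction p is missed with probability (1 - p)^k; that sampling model and the function names are assumptions for illustration, not details taken from the paper. Under the standard bound (1 - p)^k <= exp(-p k), the sample count needed for a target miss probability delta is about ln(1/delta) / p.

```python
import math


def miss_probability(k: int, p: float) -> float:
    """Chance that k independent uniform samples all avoid a deviating fraction p."""
    return (1.0 - p) ** k


def samples_needed(delta: float, p: float) -> float:
    """Approximate k from the bound (1 - p)^k <= exp(-p * k) <= delta."""
    return math.log(1.0 / delta) / p


# Figures quoted in the assessment: delta = 10^-20, p = 1%, k = 4,605.
k_approx = samples_needed(1e-20, 0.01)
p_miss = miss_probability(4605, 0.01)
```

With these inputs `k_approx` comes out slightly above 4,605, which is consistent with the quoted sample count, and the exact expression for k = 4,605 falls below 10^{-20}.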
Methodological Rigor
The paper provides detailed cost analysis at Llama 3.1 405B scale, with per-layer constraint budgets, proving time estimates, and training-side overhead breakdowns. The eight-precompile catalogue is well-specified, with constraint counts grounded in IEEE 754 decomposition. The three-proof protocol (genesis, in-training step, ex-ante attestation) is structurally sound.
However, several important caveats temper confidence. The per-MAC constraint count (~90) is estimated with a 50-150 range, meaning absolute proof times could vary by ~2×. The proving throughput assumption (~10^6 constraints/s/GPU) is acknowledged as approximate and possibly optimistic. No empirical implementation exists—the paper is entirely architectural and analytical. The determinism measurements (1.6-8.2% overhead) are on Llama 7B with 8×H100, not at frontier scale. The formal security analysis in Appendix G is carefully qualified: the authors explicitly state they do not claim a clean 2^{-128} end-to-end soundness bound, and sparse auditing provides "detection-grade" rather than universal verification.
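A rough sensitivity check shows how the per-MAC constraint range feeds into proof time. The sketch assumes proving time scales linearly with total constraints (MACs proven times constraints per MAC, divided by throughput); the linear model, the function name, and the MAC count are hypothetical placeholders, while the 50/90/150 constraint counts and the 10^6 constraints/s/GPU throughput are the figures quoted above.

```python
CONSTRAINTS_PER_MAC_LOW = 50
CONSTRAINTS_PER_MAC_CENTRAL = 90
CONSTRAINTS_PER_MAC_HIGH = 150
THROUGHPUT = 1e6  # constraints per second per GPU, as quoted in the text


def proving_gpu_seconds(macs: int, constraints_per_mac: int,
                        throughput: float = THROUGHPUT) -> float:
    """GPU-seconds to prove `macs` multiply-accumulates under a linear cost model."""
    return macs * constraints_per_mac / throughput


HYPOTHETICAL_MACS = 10**9  # placeholder workload, not a figure from the paper

central = proving_gpu_seconds(HYPOTHETICAL_MACS, CONSTRAINTS_PER_MAC_CENTRAL)
low = proving_gpu_seconds(HYPOTHETICAL_MACS, CONSTRAINTS_PER_MAC_LOW)
high = proving_gpu_seconds(HYPOTHETICAL_MACS, CONSTRAINTS_PER_MAC_HIGH)

low_ratio = low / central    # 50 / 90, about 0.56 of the central estimate
high_ratio = high / central  # 150 / 90, about 1.67 times the central estimate
```

Because the workload cancels in the ratios, the spread depends only on the constraint range: roughly 0.56 to 1.67 times the central estimate, or a factor of three between the two ends.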
The honest cataloguing of thirteen open problems is commendable but also reveals significant remaining gaps: no ZK proof of backpropagation has been demonstrated at any non-trivial scale (OP-3, identified as highest-risk), the wire-to-tensor mapping for NCCL is unresolved (OP-7), and the deterministic attention backward kernel remains expensive (OP-4). The 36-month deployment estimate depends on coordinated progress across all fronts.
Potential Impact
If realized, this architecture would provide a technical foundation for enforceable international AI governance agreements—analogous to what IAEA safeguards provide for nuclear non-proliferation. The paper correctly identifies that compute-threshold enforcement (EU AI Act Article 51, former US EO 14110) currently relies entirely on self-reporting, and that verification primitives historically precede credible international agreements.
The ex-ante attestation primitive is particularly valuable: binding compute budgets, training regimes, and data-content filters as running invariants transforms training records from audit trails into governance-enforceable artifacts. The compute-threshold case analysis (Section B.4.3) shows that detecting under-declaration is structurally cheaper than generic verification, requiring only ~$15-25K rather than millions.
The broader impact extends to AI safety research (verifiable training procedures), commercial AI auditing, and potentially liability frameworks. However, the architecture covers only dense pre-training—not MoE, RL post-training, or multi-datacenter training, which are increasingly load-bearing for frontier capability.
Timeliness & Relevance
The paper is exceptionally timely. With the EU AI Act in force, US policy oscillating on compute thresholds, and international discussions on AI governance accelerating, the absence of verification technology is a binding constraint. The paper explicitly positions itself against the 6-10 year timeline for hardware-based verification (TEE-attested accelerators, HBM monitoring), arguing software-based ZK verification is achievable in ~36 months.
The comparison to arms control verification regimes (IAEA, OPCW, BWC) is apt and strategically important. The BWC's status as a "counterexample"—in force since 1975 with no verification regime—underscores the paper's argument that verification primitives must exist before negotiations begin.
Strengths
1. Paradigm-shifting architecture: Moving from finite-field proxy computation to native floating-point verification is a genuine conceptual advance over prior ZK-ML work.
2. Comprehensive cost analysis: The per-layer breakdown, challenge-mix costing, and overhead tables at realistic scale provide actionable engineering targets.
3. Honest risk assessment: The thirteen open problems with success criteria, especially OP-3's identification as potentially fatal, demonstrate intellectual honesty.
4. Governance integration: The ex-ante attestation design directly maps to existing regulatory frameworks, not just abstract security properties.
5. Defense-in-depth: The three-layer security argument (network anchor + Merkle commitments + interactive sampling) provides complementary coverage.
Limitations
1. No implementation: All claims are architectural estimates without empirical validation. The gap between blueprint and working system is substantial.
2. Backpropagation unproven: OP-3 explicitly states that a negative result would "require redesigning the verification protocol."
3. Scope limitations: Dense pre-training only; MoE and RL post-training are additive extensions in principle but each introduces significant complexity.
4. Trust assumptions: The network anchor (especially Tier 2) relies on SmartNIC firmware integrity; the TAP key custody model has acknowledged gaps.
5. Adversary model gaps: SDC/adversarial distinction (OP-9), intra-node NVLink invisibility, and the sequential-only ZK guarantee limit the security claim.
6. Scalability uncertainty: Extrapolating from 7B measurements to 405B+ introduces significant uncertainty in overhead estimates.
Overall Assessment
This is an ambitious, well-structured architectural proposal that could fundamentally reshape AI governance if its open problems are resolved. The contribution is primarily conceptual and architectural rather than empirical. Its impact depends critically on whether OP-3 (ZK proof of backpropagation) yields a positive result and whether the 36-month timeline proves realistic.
Generated Jun 5, 2026
Comparison History (27)
Paper 2 addresses a critical bottleneck in global AI governance by proposing a technical solution for verifying training compute without compromising model confidentiality. Its interdisciplinary approach, bridging cryptography, AI, and policy, offers profound real-world implications for international regulation and AI safety, giving it broader and more urgent global impact than Paper 1's focus on LLM agent self-evolution.
Paper 2 offers a broadly applicable theoretical unification of widely used methods (goal-conditioning, Decision Transformers, rejection sampling with SFT), proving an exact optimization objective and providing interpretable identities and guarantees. This can immediately influence algorithm design, diagnostics, and safety considerations across RL, imitation learning, and offline RL, with high methodological rigor and timeliness. Paper 1 is novel and potentially important for governance, but depends on substantial engineering feasibility (zkVM + FP verification at frontier scale) and its impact is more contingent on adoption and implementation details.
Paper 1 addresses a fundamental gap in frontier AI governance—the lack of technical verification for training compute claims—proposing a novel zero-knowledge proof architecture with concrete engineering milestones. It has broad impact across AI policy, international governance, cryptography, and hardware design, analogous to verification regimes in nuclear nonproliferation. Paper 2 identifies an interesting vulnerability in LLM alignment (the 'Safety Paradox'), but it is narrower in scope, primarily contributing to the adversarial robustness/alignment literature. Paper 1's potential to enable enforceable international AI agreements gives it substantially greater real-world and cross-disciplinary impact.
Paper 1 targets a high-leverage, under-solved problem—verifiable enforcement of frontier AI training compute—introducing a concrete zkVM-based architecture with floating-point fidelity and a clear research agenda. If realized, it could materially change AI governance, auditing, and international compliance, with broad cross-field impact (cryptography, systems, ML, policy) and strong timeliness given regulatory momentum. Paper 2 is valuable and more immediately applicable, but sits in a crowded space of LLM-assisted optimization; its novelty is incremental and likely to yield narrower, domain-specific gains compared to Paper 1’s potential step-change.
Paper 2 is more novel and broadly impactful: it proposes a new technical primitive (zero-knowledge verification of frontier AI training) with major real-world governance and security applications and cross-field relevance (cryptography, systems, ML, policy). It is timely given current regulation discussions. While Paper 1 shows strong empirical gains for brain-to-image decoding, it is a narrower incremental advance leveraging an existing pretrained encoding model and limited to specific fMRI datasets. Paper 2’s methodological plan is less validated but its potential societal and scientific impact is larger.
Paper 1 addresses a fundamental gap in AI governance—the lack of technical verification for frontier AI training compliance—proposing a novel zero-knowledge proof architecture that could underpin future international AI agreements. Its impact spans AI policy, cryptography, hardware verification, and international governance, with clear real-world applications analogous to nuclear nonproliferation verification. Paper 2 presents an incremental application of LLMs to agent-based epidemiological modeling, which, while useful, represents a more incremental contribution with narrower impact. Paper 1's timeliness given rapid AI governance developments and its potential to enable enforceable international agreements give it substantially higher impact potential.
Paper 2 targets a timely, high-stakes problem—verifiable governance of frontier AI training—introducing a novel zk-based architecture (zkVM with native BF16/FP32, Merkle commitments, multi-stage proofs) with potentially broad applications across ML systems, security, cryptography, and policy enforcement. If realized, it could enable new regulatory and auditing capabilities and reshape incentives around compute reporting. Paper 1 is a solid algorithmic advance for specialized longest-path search, but its impact is narrower and primarily within heuristic search/optimization, with incremental empirical gains rather than a cross-domain platform shift.
Paper 2 has higher potential impact: it proposes a novel, technically specified zero-knowledge verification architecture for frontier-model training with clear governance and security applications, potentially enabling enforceable regulation and auditing across the AI ecosystem. Its approach is timely and broadly relevant across cryptography, distributed systems, ML infrastructure, and policy. While methodological feasibility remains to be proven, it lays out concrete protocol components, proof types, and an explicit research agenda. Paper 1 is a strong benchmark contribution, but its impact is narrower and incremental relative to the rapidly growing benchmark landscape.
Paper 1 tackles a critical and highly timely problem in global AI governance: technical verification of frontier AI training without compromising intellectual property. By combining zero-knowledge proofs with hardware-level computation verification, it provides a foundational solution for international AI regulation. Its cross-disciplinary impact spanning AI safety, cryptography, and public policy is significantly broader and more consequential than Paper 2's domain-specific framework for Reddit-based LLM persona adaptation.
Paper 1 targets a foundational, high-leverage problem—verifiable enforcement of frontier AI training compute—introducing a novel zk-based architecture (zkVM with BF16/FP32, Merkle commitments, multi-stage proofs) with broad implications for AI governance, security, and cryptographic systems design. If realized, it could enable international-scale compliance verification and reshape regulatory practice. Paper 2 is practically useful but incremental: an ensemble of shallow models for prompt-injection/jailbreak detection with limited benchmark evidence (small blind set) and narrower cross-field impact, competing with rapidly evolving larger-model approaches.
Paper 1 addresses a fundamental unsolved problem in AI governance—technical verification of training compute—with a novel zero-knowledge proof architecture that could underpin international AI agreements. Its interdisciplinary impact spans cryptography, AI policy, hardware verification, and international relations. The work is highly timely given rapid frontier AI development and emerging regulatory frameworks. Paper 2, while useful, addresses a narrower engineering problem (skill selection in personal agents) with more incremental contributions. Paper 1's potential to enable enforceable AI governance makes its societal and scientific impact substantially greater.
Paper 1 addresses a fundamental unsolved problem in AI governance—technical verification of training compute claims—with a novel zero-knowledge proof architecture. It has transformative implications for international AI regulation, analogous to nuclear verification regimes. The breadth of impact spans cryptography, AI policy, hardware security, and international relations. Paper 2, while practically valuable, represents an incremental advance in knowledge distillation for autonomous driving VLMs. Paper 1's potential to enable enforceable AI governance frameworks gives it significantly broader and longer-lasting scientific and societal impact.
Paper 1 targets a foundational bottleneck in frontier AI governance: verifiable enforcement of training-compute–based regulation. Its proposed zk-based architecture (floating-point faithful proofs, confidentiality-preserving specs, step/genesis/invariant proofs) is more novel and system-level, with broad cross-field impact (cryptography, distributed systems, ML infrastructure, policy). If feasible, it enables real-world regulatory and auditing mechanisms with high societal relevance and timeliness. Paper 2 is practically useful for attribution, but likely narrower, more vulnerable to adversarial adaptation, and less transformative across domains.
Paper 1 has higher potential impact: it targets a major unsolved, time-critical governance bottleneck (verifiable frontier training) and proposes a technically novel zkVM-based architecture that could enable enforceable regulation and auditing across the AI ecosystem. If feasible, it would influence security, cryptography, distributed systems, hardware, and policy, with substantial real-world applications and broad cross-field relevance. Paper 2 offers useful, incremental evaluation metrics with practical tooling, but its novelty and transformative potential are lower and impact is likely narrower (agent benchmarking/observability).
Paper 1 addresses a critical bottleneck in global AI governance: technical verification of frontier AI training. By proposing a novel zero-knowledge proof architecture that overcomes current scalability barriers, it offers a foundational primitive for international AI regulation and compute monitoring. Its intersection of cryptography, systems engineering, and policy gives it a wider breadth of impact and higher geopolitical relevance compared to Paper 2, which offers a valuable but more incremental contribution to LLM agent safety.
Paper 1 addresses a critical unsolved problem in frontier AI governance—technical verification of training compute—with a novel zero-knowledge proof architecture. It has enormous potential real-world impact on international AI regulation and safety agreements, drawing parallels to nuclear verification regimes. The technical contribution (zkVM with native BF16/FP32 precompiles, three-proof protocol) is highly innovative and interdisciplinary, spanning cryptography, hardware, and policy. Paper 2 proposes a useful educational competency model but has narrower scope, relies only on simulated validation, and addresses a less consequential problem.
Paper 2 has higher potential impact due to its novel, timely proposal for zero-knowledge verification of frontier AI training—an enabling primitive for governance, compliance, and security with broad cross-field relevance (cryptography, distributed systems, ML infrastructure, policy). If realized, it could materially change how AI training is audited and regulated, with major real-world applications. Paper 1 is methodologically rigorous and valuable for the formal methods/LLM community, but its impact is narrower (evaluation/benchmarking) and largely diagnostic rather than enabling a new capability.
Paper 1 addresses a critical, timely bottleneck in global AI governance by proposing a novel technical solution (zkVMs) to verify frontier AI training compute. Its interdisciplinary approach bridges cryptography, systems engineering, and technology policy, offering massive real-world implications for international AI regulation. While Paper 2 provides valuable insights into LLM reasoning unfaithfulness, its impact is largely confined to NLP pipeline engineering, whereas Paper 1 tackles a foundational challenge for the safe scaling and regulation of artificial intelligence globally.
Paper 2 addresses a critical and timely gap in AI governance—technical verification of frontier AI training compliance—with a concrete cryptographic architecture. Its potential real-world impact is enormous: enabling enforceable international AI agreements analogous to nuclear verification regimes. The paper combines novel application of zero-knowledge proofs to GPU-level floating-point computation, offers a clear research roadmap (13 open problems), and has broad cross-disciplinary relevance (cryptography, hardware, policy, ML). Paper 1, while interesting, proposes a largely theoretical motivational framework for conversational AI with limited empirical validation and narrower impact scope.
Paper 2 has higher potential scientific impact because it introduces a novel, timely verification primitive for frontier AI training that could underpin enforceable governance and international agreements—an application with broad cross-field consequences (cryptography, systems, ML, policy). If realized, it changes how training claims are audited and could become infrastructure-level. Paper 1 is a strong engineering contribution to LLM-based AutoML/search with clear applicability, but it is more incremental within a fast-moving area and its impact is likely narrower and less durable than a widely adopted training-verification standard.