Output Type Before Quality: A Standards-Derived XAI Admissibility Rubric for Autonomous-Driving Safety
Abhinaw Priyadershi, Mandar Pitale, Jelena Frtunikj, Maria Spence
Abstract
Safety standards for ML-based autonomous driving specify the kind of evidence an assurance case must contain (directed cause-and-effect chains, quantified interventional effects, named root-cause variables), yet the XAI literature is organised by output type and technique family (saliency maps, feature attribution, counterfactuals, causal graphs, language traces). SHAP, the most-recommended ADS XAI method, returns a ranked feature list that no implementation effort can convert into a directed chain (Fig. 1). We name this mismatch the evidence-type gap. From AMLAS, ISO 26262, ISO 21448, and ISO/PAS 8800 we derive 19 testable evidentiary criteria across 7 lifecycle stages with representative clause-cited derivations and score six XAI method classes structurally. Causal XAI emerges as structurally required to satisfy the derived criteria at three stages: hazard identification (+62% rubric gap), incident investigation (+50%), and data management (+50%); the verdict set is stable across thresholds T in (0%, 50%] and survives a worst-case single-cell flip down to T = 25%. At the remaining four stages, correlational or language-based methods are comparable or sufficient. The rubric identifies structural admissibility (necessary but not sufficient for compliance): an admissible method's specific output content may still be wrong, and validating that fidelity (the edges a fitted SCM produces, the cause a trace names) is the open assurance challenge. A single-VLA proof of concept on 1,996 real-world driving clips (79,840 rows, ten splits) is consistent with each method's observed output type matching its rubric prediction. XAI method selection for ADS safety assurance should be driven by lifecycle-stage evidence demand, not by method popularity.
AI Impact Assessments
Scientific Impact Assessment
Core Contribution
This paper identifies and formalizes what it terms the "evidence-type gap" — a structural mismatch between the evidentiary demands of safety standards (ISO/PAS 8800, ISO 21448, ISO 26262, AMLAS) for autonomous driving systems and the output types produced by popular XAI methods. The central insight is compelling and well-articulated: SHAP produces ranked feature lists, but standards like ISO/PAS 8800 Cl. 6.7.1 demand directed cause-and-effect chains. These are categorically different objects, and no amount of engineering refinement can transform one into the other.
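One way to see why an associational output and a directed chain are different kinds of object is a toy simulation. The sketch below is illustrative only and is not taken from the paper: it builds two hypothetical linear-Gaussian models, one where X causes Y and one where Y causes X, with coefficients (0.8 and noise standard deviation 0.6) chosen as assumptions so that both yield the same observational distribution. The observed correlation is about the same in both, while the effect of intervening on X differs, so a correlation-based summary alone cannot settle the direction of the edge.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def sample_x_causes_y(rng, n, do_x=None):
    """Hypothetical model A: X -> Y. If do_x is given, X is forced to that value."""
    xs = [rng.gauss(0, 1) if do_x is None else do_x for _ in range(n)]
    ys = [0.8 * x + rng.gauss(0, 0.6) for x in xs]
    return xs, ys


def sample_y_causes_x(rng, n, do_x=None):
    """Hypothetical model B: Y -> X. Forcing X cuts the Y -> X edge, so Y is unchanged."""
    ys = [rng.gauss(0, 1) for _ in range(n)]
    xs = [(0.8 * y + rng.gauss(0, 0.6)) if do_x is None else do_x for y in ys]
    return xs, ys


def interventional_effect(sampler, rng, n):
    """Estimate E[Y | do(X=1)] - E[Y | do(X=0)] by simulation."""
    _, y1 = sampler(rng, n, do_x=1.0)
    _, y0 = sampler(rng, n, do_x=0.0)
    return sum(y1) / n - sum(y0) / n


rng = random.Random(0)  # fixed seed so the sketch is reproducible
N = 20000

# Observational (rung-1) view: the two models look alike.
corr_a = pearson(*sample_x_causes_y(rng, N))
corr_b = pearson(*sample_y_causes_x(rng, N))

# Interventional (rung-2) view: the two models disagree.
effect_a = interventional_effect(sample_x_causes_y, rng, N)
effect_b = interventional_effect(sample_y_causes_x, rng, N)

print(f"correlation   A: {corr_a:.2f}  B: {corr_b:.2f}")
print(f"do(X) effect  A: {effect_a:.2f}  B: {effect_b:.2f}")
```

In this toy setting both correlations come out near 0.8, while the interventional effect is near 0.8 in model A and near 0 in model B. The example says nothing about SHAP's internals; it only illustrates the general point that observational association underdetermines causal direction.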
The authors derive 19 testable evidentiary criteria across 7 lifecycle stages, score 6 XAI method classes against them using a Satisfies/Partial/Fails rubric, and conclude that causal XAI (specifically SCMs) is structurally necessary at three stages: hazard identification, incident investigation, and data management. The theoretical grounding in Pearl's causal hierarchy (rung-1 methods cannot answer rung-2 questions) provides a clean, principled basis for the structural impossibility claims.
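The scoring step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' procedure: the numeric mapping (Satisfies = 1, Partial = 0.5, Fails = 0), the per-stage mean, the definition of the gap as the causal score minus the best non-causal score, and every entry in TOY_RUBRIC are hypothetical. The abstract reports a "rubric gap" and a threshold T but this page does not give their formulas.

```python
from enum import Enum


class Verdict(Enum):
    # Assumed numeric mapping; the source only names the three levels.
    SATISFIES = 1.0
    PARTIAL = 0.5
    FAILS = 0.0


S, P, F = Verdict.SATISFIES, Verdict.PARTIAL, Verdict.FAILS

# Hypothetical toy data: stage -> method class -> one verdict per criterion.
# These are NOT the paper's scores.
TOY_RUBRIC = {
    "hazard identification": {
        "causal": [S, S, P],
        "feature attribution": [P, F, F],
        "saliency": [F, F, F],
    },
    "monitoring": {
        "causal": [S, P],
        "feature attribution": [S, P],
        "saliency": [P, P],
    },
}


def stage_score(verdicts):
    """Mean numeric score of one method class over a stage's criteria."""
    return sum(v.value for v in verdicts) / len(verdicts)


def causal_gap(stage_scores, causal="causal"):
    """Causal score minus the best non-causal score at one stage (assumed definition)."""
    best_other = max(
        stage_score(v) for method, v in stage_scores.items() if method != causal
    )
    return stage_score(stage_scores[causal]) - best_other


def stages_requiring_causal(rubric, threshold):
    """Stages whose causal gap strictly exceeds the threshold T."""
    return {stage for stage, scores in rubric.items() if causal_gap(scores) > threshold}


# Sweep a few thresholds in (0, 0.5] to check whether the verdict set is stable.
verdict_sets = {t: stages_requiring_causal(TOY_RUBRIC, t) for t in (0.05, 0.25, 0.5)}
stable = len({frozenset(s) for s in verdict_sets.values()}) == 1
```

With this toy data the causal gap is 2/3 at hazard identification and 0 at monitoring, so the verdict set is the same at every swept threshold. A real use would replace TOY_RUBRIC with clause-derived verdicts and whichever gap definition the paper actually adopts.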
Methodological Rigor
The paper has a clearly layered methodology: standards extraction → criteria formalization → structural scoring → robustness analysis → empirical proof-of-concept. Several aspects deserve scrutiny:
Strengths in rigor:
Weaknesses in rigor:
Potential Impact
The paper addresses a genuinely important practical problem at the intersection of XAI research and safety engineering. Its potential impact operates at several levels:
1. Standards compliance guidance: Safety engineers selecting XAI methods for ADS assurance cases now have a principled framework rather than defaulting to method popularity. This is directly actionable.
2. Research prioritization: The finding that causal XAI is structurally necessary at three lifecycle stages but essentially absent from the ADS XAI literature (per surveys of 84+ papers) identifies a clear research gap that could redirect community effort.
3. Cross-domain generalizability: The authors claim the rubric construction procedure is domain-general, applicable to any standard and method catalogue. If validated, this could influence safety-critical AI beyond autonomous driving (medical devices, aerospace, industrial automation).
4. Conceptual framework: The "output type before quality" principle — that checking whether a method *can* produce the right kind of evidence should precede evaluating how well it does so — is a useful conceptual contribution that reframes XAI evaluation.
However, impact may be limited by the paper's heavy reliance on one particular interpretation of standards language. Safety standards are intentionally method-agnostic, and standards bodies or certification authorities may not agree with the authors' strict Pearlian reading of "causal."
Timeliness & Relevance
This is highly timely. ISO/PAS 8800:2024 was published recently, and the autonomous driving industry is actively grappling with how to build assurance cases for ML-based systems. The explosion of VLA/foundation model deployment in ADS makes the XAI evidence question urgent. The paper arrives at a moment when practitioners need this kind of guidance.
Strengths & Limitations
Key strengths:
Notable weaknesses:
Overall Assessment
This is a conceptually valuable contribution that frames an important problem clearly and provides a structured approach to XAI method selection for ADS safety. The theoretical argument is sound — associational methods genuinely cannot produce interventional evidence. However, the work is early-stage: the rubric needs external validation, the empirical component is limited, and the practical implications of the structural necessity finding are unclear given that the authors' own SCM implementation struggles with basic recovery tasks. The paper is best understood as a well-articulated position paper with a preliminary analytical framework, rather than a fully validated methodology.
Generated Jun 5, 2026
Comparison History (16)
Paper 1 sets a broad research agenda for foundation model agents by framing their deployment challenges as a classical MDP sim-to-real gap. This unified perspective has the potential to influence a vast cross-section of the AI and agentic communities. Paper 2, while highly rigorous and valuable for autonomous driving safety compliance, is more narrowly focused in its application domain.
Paper 1 is likely to have higher impact due to its direct alignment with timely, safety-critical autonomous-driving standards and its actionable rubric that translates normative requirements into testable XAI admissibility criteria across lifecycle stages. This creates immediate real-world utility for assurance cases, compliance, and tool selection, with potentially broad uptake in industry and regulation. Methodologically, it offers a structured standards-derived evaluation plus robustness checks and a proof-of-concept study. Paper 2 is novel and rigorous, but its impact may be more academic and context-dependent, with fewer near-term standardization levers.
Paper 1 offers a more novel, standards-derived framework that directly connects XAI output types to concrete safety-assurance evidence requirements across the autonomous-driving lifecycle, producing testable criteria and robustness checks. Its potential real-world impact is high because it targets regulatory/assurance practice (ISO 26262, ISO 21448, etc.), where admissibility of evidence is a bottleneck for deployment, and it could influence both tooling and certification processes across safety-critical ML domains. Paper 2 is timely and useful as an evaluation benchmark, but negotiation leaderboards may have narrower cross-field and regulatory impact than safety-assurance admissibility criteria.
Paper 2 demonstrates higher potential impact by addressing a critical bottleneck in life-critical systems: the misalignment between XAI outputs and autonomous driving safety standards. While Paper 1 offers a valuable framework for LLM agents, Paper 2 bridges machine learning, systems engineering, and regulatory compliance. By formally deriving a rubric to evaluate XAI admissibility for safety assurance, it provides a foundational framework with immediate, high-stakes real-world applications. Its structural approach to the 'evidence-type gap' will likely heavily influence both future XAI research directions and the practical deployment and regulation of autonomous vehicles.
Paper 2 uncovers a fundamental behavioral limitation of LLMs (convergence to attractor regions) in program evolution, which broadly impacts the highly active fields of LLM-based code generation, evolutionary algorithms, and open-ended exploration. Paper 1, while highly valuable for autonomous driving safety, is more applied and regulatory-focused, mapping existing XAI methods to safety standards rather than revealing a novel underlying scientific phenomenon.
Paper 2 has higher potential impact: it introduces a standards-derived, testable rubric that directly connects XAI outputs to evidence requirements in autonomous-driving safety assurance, addressing a timely and high-stakes deployment bottleneck. The approach is novel in framing “admissibility” via explicit lifecycle-stage criteria grounded in ISO/AMLAS, offering broad applicability across safety-critical ML domains and influencing both research and regulatory/industrial practice. While Paper 1 is technically innovative for LLM agent memory and shows strong benchmark gains, its impact is more confined to agent architectures and may iterate on an active retrieval trend rather than reshape evaluation/selection norms across fields.
Paper 1 is more methodologically rigorous and technically novel: it derives a clause-cited, testable rubric from multiple safety standards, evaluates XAI classes against explicit evidentiary criteria across lifecycle stages, and includes robustness checks plus an empirical proof-of-concept. Its contributions can influence both XAI research directions (favoring causal XAI where structurally required) and safety assurance practice in autonomous driving, with potential spillover to other safety-critical ML domains. Paper 2 is timely and application-relevant but is largely conceptual/framework-based with less technical validation, making its scientific impact more uncertain.
Paper 1 bridges a critical gap between AI safety standards and XAI techniques in autonomous driving, addressing a highly relevant and urgent regulatory challenge. Its development of a standards-derived rubric offers broader implications for AI certification and policy. In contrast, Paper 2 presents an incremental application of existing neural network architectures to a niche domain (maritime trajectory prediction), which, while practically useful, lacks the transformative scientific and regulatory impact of Paper 1.
Paper 1 presents a concrete, actionable method (CMTF) with extensive empirical validation (2448 runs, 102 tasks, 4 LLM backends) that addresses a growing practical problem in LLM agent reliability. Its training-free approach, dramatic efficiency gains (~90% token reduction), and broad applicability to the rapidly expanding LLM agent ecosystem give it high near-term impact. Paper 2 contributes a useful analytical rubric but is more niche (ADS safety + XAI intersection), primarily taxonomic rather than methodological, and its empirical validation is limited to a single proof-of-concept. Paper 1's broader relevance and practical utility suggest greater impact.
Paper 1 addresses a rapidly growing and broadly applicable problem—using coding agents to optimize other agents—with a concrete benchmark and tooling (VeRO/VeRO-Bench) that enables systematic research in a high-demand area. The agent-optimizing-agent paradigm is timely and has wide applicability across AI development. Paper 2 addresses an important but narrower niche (XAI method selection for autonomous driving safety standards), producing a useful rubric but with limited breadth of impact beyond the ADS safety assurance community. Paper 1's infrastructure contribution is likely to catalyze more downstream research.
Paper 1 bridges a critical gap between XAI methods and established safety standards (ISO) in autonomous driving. Its methodological rigor, reliance on testable criteria, and immediate real-world applicability in a safety-critical domain provide a stronger foundation for near-term scientific and industrial impact compared to the more theoretical, conceptual framework proposed in Paper 2.
Paper 2 addresses a concrete, high-stakes regulatory gap in autonomous driving safety by creating a standards-derived admissibility rubric linking XAI methods to specific lifecycle-stage evidence requirements. It offers immediately actionable criteria (19 testable evidentiary criteria across 7 stages), empirical validation, and targets a rapidly growing industry with pressing regulatory needs. Paper 1 proposes an interesting theoretical framework about AI-assisted creativity with a useful taxonomy, but remains largely conceptual without empirical validation. Paper 2's methodological rigor, direct regulatory applicability, and timeliness in the booming AV/AI safety domain give it broader and more immediate impact.
Paper 2 bridges a critical gap between XAI methodologies and regulatory safety standards in autonomous driving. Its framework for aligning XAI outputs with strict compliance requirements has profound implications for AI safety, certification, and deployment in high-stakes domains, offering significantly broader cross-disciplinary impact than Paper 1's domain-specific optimization method.
Paper 2 addresses a fundamental methodological flaw in the rapidly expanding field of AI alignment and reinforcement learning. By mathematically proving a systematic bias in a commonly used metric and releasing a reusable audit harness, it provides a crucial correction that could standardize evaluations across the broader AI community. While Paper 1 offers highly valuable, domain-specific regulatory insights for autonomous driving, Paper 2's theoretical rigor and broad applicability to foundation model training give it a higher potential for widespread scientific impact.
Paper 1 addresses a critical gap between XAI methods and safety standards for autonomous driving, a high-stakes domain with enormous real-world impact. It provides a novel, actionable rubric derived from established safety standards (ISO 26262, SOTIF, AMLAS) that can guide industry practice. The breadth of impact spans AI safety, autonomous vehicles, regulation, and XAI research. Paper 2, while technically sound, addresses a narrower algorithmic problem (bidirectional search for longest paths) with more limited applicability. Paper 1's timeliness—given growing regulatory demands for AI transparency—further amplifies its potential impact.
Paper 2 likely has higher impact: it proposes a general-purpose, code-released framework that extends LLM agents to structured time-series reasoning with tools, memory, and reusable routines, validated across many domains and benchmarks—broad, timely, and readily adoptable. Paper 1 is novel and rigorous in aligning XAI outputs with safety-standards evidence needs, but its impact is more specialized (autonomous-driving assurance) and primarily provides a rubric/analysis rather than a widely reusable technical system.