Tom Beyer, Svea Wisy, Sven Tomforde
The growing complexity of self-adaptive and self-organising systems, fuelled by advances in Artificial Intelligence (AI), has made them increasingly difficult to understand and trust. While Explainable AI aims to provide insight into AI decision-making, a more advanced goal is for systems to explain themselves - an ability referred to as Self-Explainability (SX). This article presents a systematic literature review on SX, analysing existing approaches, including their domains, targets, and evaluation methods. The review develops a unified definition and taxonomy of SX and introduces Levels of Self-Explainability, providing a framework for positioning current and future research. Our results show that most SX approaches remain conceptual, with few practical implementations. Moreover, there is currently no formal or de facto standard for evaluating SX, highlighting a major research gap. This work thus establishes a foundation and roadmap for advancing Self-Explainability in complex systems.
This paper presents the first systematic literature review (SLR) on Self-Explainability (SX) — defined as a system's ability to autonomously generate and output explanations of its behavior at runtime. Starting from 507 initial publications, the authors filter down to 105 relevant papers (24 classified as genuinely SX-related), and deliver four main contributions: (1) a formal definition of SX distinguishing it from XAI and related concepts like self-awareness and self-interpretability; (2) a taxonomy organizing SX methods into classic XAI, DL-based, and innovative explainability approaches; (3) a research agenda identifying gaps (especially in evaluation); and (4) "Levels of Self-Explainability" (0–5), analogous to Levels of Autonomy, providing a maturity framework for the field.
The key conceptual innovation is the distinction between *Explanation of Models* (global) and *Explanation of Behaviour* (local), with SX defined as a subcategory of the latter — specifically requiring autonomous generation at runtime. The three-part definition (grounds → cause → effects) provides useful structure. The Levels of SX framework, while inspired by Levels of Autonomy, is novel in this domain and provides a concrete roadmap.
The SLR follows Kitchenham and Charters' guidelines, which is appropriate for software engineering reviews. The search strategy covers four major databases (ACM DL, IEEE Xplore, ScienceDirect, Springer Nature Link), with clearly documented search strings and reproducible inclusion/exclusion criteria. Two-author independent screening with third-author arbitration is good practice.
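The screening protocol described above (two independent votes, with a third author deciding only on disagreement) can be sketched in a few lines. This is a minimal illustration, not the authors' tooling: the function name `resolve_screening`, the paper identifiers, and the votes are all hypothetical.

```python
from typing import Callable, Dict, Tuple


def resolve_screening(
    votes: Dict[str, Tuple[bool, bool]],
    arbiter: Callable[[str], bool],
) -> Dict[str, bool]:
    """Combine two independent include/exclude votes per paper.

    The arbiter (standing in for the third author) is consulted
    only when the two screeners disagree.
    """
    decisions = {}
    for paper_id, (first, second) in votes.items():
        if first == second:
            decisions[paper_id] = first
        else:
            decisions[paper_id] = arbiter(paper_id)
    return decisions


# Hypothetical votes: (screener 1, screener 2), True means "include".
votes = {"P1": (True, True), "P2": (False, False), "P3": (True, False)}

# Hypothetical arbiter that includes every disputed paper.
decisions = resolve_screening(votes, arbiter=lambda paper_id: True)
included = [pid for pid, keep in decisions.items() if keep]
print(included)
```

Keeping the arbiter as a separate callable makes it easy to log which papers needed arbitration, which is itself a rough indicator of how ambiguous the inclusion criteria are.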
However, there are notable limitations. The search string design may miss relevant work: the reliance on "self-" prefixed terms could exclude papers on runtime explanation generation that don't use this specific terminology. The exclusion of Springer Nature Link's non-CS disciplines and book chapters may overlook relevant interdisciplinary work, particularly from HCI, cognitive science, or philosophy of explanation. The restriction to post-2000 publications is reasonable but stated rather than justified analytically.
The final count of 24 SX papers is small, which limits the statistical robustness of any trend analysis. The assignment of papers to Levels of SX involves subjective judgment, particularly for conceptual papers listed with parenthetical levels. The taxonomy, while useful, emerges inductively without a formal methodology (e.g., no inter-rater reliability metrics for the classification).
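To make the inter-rater reliability point concrete, one widely used metric is Cohen's kappa, which corrects the raw agreement between two raters for agreement expected by chance. The sketch below is illustrative only: the review does not report such a metric, and the category labels and ratings are hypothetical stand-ins loosely named after the three taxonomy branches.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must label the same non-empty set of items.")
    n = len(rater_a)

    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: sum over labels of the product of marginal frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    labels = set(counts_a) | set(counts_b)
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical label; agreement is trivially perfect.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical classifications of six papers by two reviewers.
rater_a = ["classic", "dl", "dl", "innovative", "classic", "dl"]
rater_b = ["classic", "dl", "innovative", "innovative", "classic", "classic"]

kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 2))  # 0.52
```

Here the raters agree on 4 of 6 papers (about 0.67), chance agreement is 11/36, and kappa is therefore 13/25 = 0.52. Reporting a figure like this alongside the taxonomy and level assignments would let readers judge how reproducible the classification is.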
The paper addresses a genuine need: as autonomous systems proliferate, the gap between XAI (explaining AI models) and true self-explanation (systems autonomously explaining their behavior) is increasingly important. The conceptual framework could influence:
The practical impact is currently limited by the field's immaturity — as the authors note, most SX approaches remain conceptual, and none exceed Level 2 in practice. The framework's value will depend on adoption by the community.
The paper is highly timely. The explosion of LLM-based autonomous agents, autonomous vehicles, and smart infrastructure creates an urgent need for systems that can explain themselves. The EU AI Act and similar regulations increasingly require explanations for high-risk AI systems, making SX a practical necessity rather than an academic curiosity. The identification of LLM-based approaches as a promising direction for SX is well aligned with current technological trends.
The concurrent SLR by Straub et al. (2026, cited as [148]) on explainability in self-adaptive systems suggests this is an area attracting systematic attention, making this contribution part of a broader consolidation effort.
This is a solid, well-structured systematic review that provides necessary conceptual infrastructure for an emerging field. Its primary value lies in definitional clarity and the Levels framework, which could become reference points if adopted. However, the contribution is primarily organizational and definitional rather than technically innovative. The field it surveys is so nascent that the review necessarily operates at a high level of abstraction. The paper would have been strengthened by proposing concrete evaluation metrics or demonstrating the framework's utility through a case study.
Generated Jun 9, 2026
ComBench addresses a timely and concrete gap in LLM evaluation for mathematical reasoning, providing a reusable benchmark with clear metrics that the rapidly growing AI/LLM community can immediately adopt. It offers novel insights distinguishing proof reasoning from constructive realization capabilities across frontier models. Paper 1, while valuable as a systematic literature review on self-explainability, primarily synthesizes existing work and identifies research gaps without introducing new methods or tools. Benchmarks tend to have outsized impact by shaping research directions and enabling reproducible comparisons across the field.
Paper 2 addresses a highly critical and timely issue in modern AI: LLM provenance and attribution in black-box settings. By introducing a novel empirical framework (READER) to identify source models, it offers immediate practical applications for security, copyright, and operational transparency. In contrast, Paper 1 is a systematic literature review that, while useful for defining taxonomy, lacks the empirical innovation and immediate real-world utility demonstrated by Paper 2's methodological contributions to AI safety and governance.
Paper 1 provides a systematic literature review establishing a unified definition, taxonomy, and levels framework for Self-Explainability in complex systems—a foundational contribution with broad impact across AI, self-adaptive systems, and explainability research. It identifies major research gaps and provides a roadmap for future work. Paper 2, while technically solid, addresses a narrower engineering optimization problem (speeding up NeurASP), which, despite practical value, has more limited scope and impact potential. Paper 1's conceptual framework is likely to be widely cited and influence multiple research directions.
Paper 1 establishes a foundational taxonomy and roadmap for a highly timely field (Self-Explainability in AI). Systematic reviews that define terminology and outline research directions in broad, fast-growing areas typically achieve higher cross-disciplinary impact and citation counts than specific algorithmic improvements. While Paper 2 offers rigorous, state-of-the-art results for a classic problem (TSP), its impact is largely confined to the niche of neural combinatorial optimization, whereas Paper 1 addresses trust and explainability applicable across numerous AI domains.
Paper 1 addresses the broader and more foundational topic of Self-Explainability in complex adaptive systems, providing a systematic literature review, unified taxonomy, and research roadmap that can influence multiple fields (AI, robotics, distributed systems, etc.). Its breadth of impact and timeliness given the AI trust crisis give it higher potential. Paper 2, while practically useful, addresses a narrower engineering problem (LLM manuscript verification) with an incremental architectural contribution and limited generalizability beyond clinical manuscript preparation.
Paper 1 presents a concrete, novel methodology (synthetic contrastive reasoning traces with CPO for multi-table QA) with strong empirical results showing 9.7-16.3% absolute improvements across multiple models. It addresses a clear gap in reasoning supervision and provides reproducible, quantitative contributions. Paper 2 is a systematic literature review that, while useful for organizing the SX field, primarily synthesizes existing work and proposes taxonomies rather than novel technical contributions. Paper 1's methodological innovation and demonstrated empirical gains suggest broader and more immediate impact on the active LLM reasoning and table QA research communities.
Paper 1 addresses a broader and more timely challenge—Self-Explainability in complex AI systems—which intersects multiple high-impact fields (XAI, self-adaptive systems, trustworthy AI). Its systematic literature review, unified taxonomy, and proposed Levels of Self-Explainability provide a foundational framework that can guide substantial future research across domains. It also identifies critical research gaps. Paper 2, while technically solid in GPU-accelerated SAT solving, targets a narrower community (constraint solving/optimization) and represents an engineering advancement over an existing proof-of-concept rather than a conceptual breakthrough, limiting its breadth of impact.
Paper 1 presents a systematic literature review establishing a unified definition, taxonomy, and levels framework for Self-Explainability in complex systems—a foundational contribution with broad cross-disciplinary impact spanning AI, self-adaptive systems, and trustworthy computing. It identifies major research gaps and provides a roadmap for future work. Paper 2, while methodologically rigorous with its POMDP formalization and empirical evaluation of agent orchestration paradigms, addresses a narrower problem (tool-use in customer service workflows) with more incremental findings. Paper 1's broader scope and framework-setting nature give it higher potential for lasting scientific impact.
Paper 1 is likely to have higher scientific impact due to its concrete theoretical advance (a full variational characterization of EFE-based planning), clear methodological rigor (proofs + explicit correction terms), and actionable algorithmic output (message-passing scheme) validated empirically. This combination can directly influence active inference, probabilistic inference, and planning/control research. Paper 2 is timely and broadly relevant, but as a systematic review/taxonomy it is less methodologically innovative and mainly consolidates existing work; its impact depends on community uptake of proposed standards.
Paper 1 has broader, more timely impact: it tackles trust and understanding of AI-driven self-adaptive systems, provides a unified definition, taxonomy, and “levels” framework, and identifies evaluation standardization as a key gap—likely to shape future research agendas across software engineering, autonomous systems, and XAI. Although largely survey/framework-oriented, such unifying work can catalyze cross-field adoption and guide methodology. Paper 2 is technically novel and rigorous within heuristic bidirectional search, with clear performance gains, but its impact is narrower to search/planning communities.