FIDES: Faithful Inference via Deep Evidence Signals for Retrieval-Memory Conflict in RAG
Zhe Yu, Wenpeng Xing, Tiancheng Zhao, Mohan Li, Changting Lin, Meng Han
Abstract
When retrieved evidence contradicts parametric memory, language models frequently ignore context and default to memorized priors -- a failure that undermines the core purpose of retrieval augmentation. Contrastive decoding amplifies the context-conditioned output to suppress parametric bias, but existing methods rest on an implicit assumption that this bias is uniform across tokens. A single global contrastive weight over-penalizes safe tokens while leaving genuinely conflicted ones insufficiently corrected. We identify token-level conflict concentration: retrieval-memory tension is sharply heterogeneous, concentrated on a small fraction of answer-critical decoding steps. This reframes contrastive decoding from how much contrast to apply to where to apply it. We propose FIDES (Faithful Inference via Deep Evidence Signals), a training-free decoder that reads three internal signals probing retrieval-memory conflict at complementary depths -- output surface, hidden representations, and prediction trajectory -- and fuses them to govern intervention strength at each decoding step. Across three benchmarks and six backbones -- four primary 7B/8B models and two scaling backbones up to 70B -- FIDES achieves the best context fidelity in all 18 settings, outperforming the strongest training-free baseline by +3 to +13 points. On the 70B scale, fidelity reaches 92-94% while F1 surges to 62-63%, demonstrating that token-level selectivity unlocks generation capability that coarse contrastive rules suppress.
AI Impact Assessments
Scientific Impact Assessment: FIDES
1. Core Contribution
FIDES addresses a well-recognized problem in RAG: when retrieved evidence contradicts a model's parametric memory, LLMs often ignore the context and hallucinate from memory. The paper's key conceptual insight is token-level conflict concentration — the observation that retrieval-memory tension is not uniformly distributed across generated tokens but concentrated on a small fraction of answer-critical decoding steps. This reframes contrastive decoding from a "how much contrast" question to a "where to apply contrast" question.
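The reframing can be made concrete with the standard contrastive form used by context-aware decoding (CAD). The following is a sketch in assumed notation (retrieved context c, query x, generated prefix y_{<t}), not the paper's own equations. The first line applies a single global weight α at every step; the second lets the weight vary per decoding step as α_t.

```latex
\begin{align}
p_{\text{global}}(y_t) &= \operatorname{softmax}\!\big[(1+\alpha)\,\operatorname{logit}_\theta(y_t \mid c, x, y_{<t}) - \alpha\,\operatorname{logit}_\theta(y_t \mid x, y_{<t})\big] \\
p_{\text{token}}(y_t) &= \operatorname{softmax}\!\big[(1+\alpha_t)\,\operatorname{logit}_\theta(y_t \mid c, x, y_{<t}) - \alpha_t\,\operatorname{logit}_\theta(y_t \mid x, y_{<t})\big]
\end{align}
```

When α_t is close to 0 on steps where the two distributions agree, the second form reduces to ordinary context-conditioned decoding on those steps and reserves strong contrast for the conflicted ones.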
The method fuses three internal signals at complementary depths: (1) Opposition (JSD between context/no-context output distributions), (2) Shift (hidden-state trajectory divergence across layers), and (3) Noise (internal prediction instability via Logit Lens midpoint-to-final KL). These are combined via fixed inverse-scale calibrated weights to produce a per-token contrastive coefficient α_t. The approach is training-free and requires no per-setting tuning.
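A minimal NumPy sketch of how three such signals might be computed and fused into a per-token coefficient is given below. It is illustrative only: the review does not give the paper's exact formulas, so the cosine-distance form of Shift, the KL direction used for Noise, the `fuse` rule, the `scales` constants, and `alpha_max` are all assumptions introduced here, and every function name is hypothetical.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a 1-D logit vector.
    z = logits - np.max(logits)
    e = np.exp(z)
    return e / e.sum()

def kl(p, q, eps=1e-12):
    # KL(p || q) in nats, with clipping to avoid log(0).
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def jsd(p, q):
    # Jensen-Shannon divergence: symmetric, bounded above by ln 2.
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def opposition(logits_ctx, logits_noctx):
    # Output-surface signal: JSD between with-context and no-context distributions.
    return jsd(softmax(logits_ctx), softmax(logits_noctx))

def shift(hidden_ctx, hidden_noctx):
    # Hidden-state signal. ASSUMPTION: mean cosine distance across layers;
    # both inputs have shape (num_layers, hidden_dim).
    num = np.sum(hidden_ctx * hidden_noctx, axis=1)
    den = np.linalg.norm(hidden_ctx, axis=1) * np.linalg.norm(hidden_noctx, axis=1)
    return float(np.mean(np.clip(1.0 - num / den, 0.0, 2.0)))

def noise(mid_logits, final_logits):
    # Trajectory signal. ASSUMPTION: KL(mid || final) between a logit-lens
    # readout at a middle layer and the final-layer prediction.
    return kl(softmax(mid_logits), softmax(final_logits))

def fuse(signals, scales, alpha_max=1.0):
    # ASSUMPTION: inverse-scale weights w_i = 1 / (n * scale_i), so a signal
    # at its typical magnitude contributes 1/n; the sum is clipped to [0, 1].
    signals = np.asarray(signals, dtype=float)
    scales = np.asarray(scales, dtype=float)
    weights = 1.0 / (len(scales) * scales)
    score = float(np.clip(np.sum(weights * signals), 0.0, 1.0))
    return alpha_max * score

def contrastive_logits(logits_ctx, logits_noctx, alpha_t):
    # CAD-style contrast with a per-token coefficient alpha_t.
    return (1.0 + alpha_t) * logits_ctx - alpha_t * logits_noctx

# Toy decoding step with random stand-ins for model outputs.
rng = np.random.default_rng(0)
vocab, layers, dim = 8, 4, 16
logits_ctx, logits_noctx = rng.normal(size=vocab), rng.normal(size=vocab)
mid_logits = rng.normal(size=vocab)
h_ctx, h_noctx = rng.normal(size=(layers, dim)), rng.normal(size=(layers, dim))

signals = [opposition(logits_ctx, logits_noctx),
           shift(h_ctx, h_noctx),
           noise(mid_logits, logits_ctx)]
alpha_t = fuse(signals, scales=[0.2, 0.5, 0.5])  # hypothetical calibration constants
next_token = int(np.argmax(contrastive_logits(logits_ctx, logits_noctx, alpha_t)))
print(alpha_t, next_token)
```

In this sketch, a step where all three signals are near zero yields α_t near 0 and leaves the context-conditioned logits essentially untouched, which is the selectivity the review describes.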
2. Methodological Rigor
Strengths in evaluation design:
Concerns about rigor:
3. Potential Impact
Practical relevance: RAG faithfulness under knowledge conflict is a critical deployment concern. A training-free method that works across multiple model families (LLaMA, Mistral, Qwen) at varying scales (7B–70B) has immediate practical applicability. The +8–11% overhead over CAD is modest.
Conceptual contribution: The token-level conflict concentration insight is the paper's most transferable contribution. This framing could influence how other researchers think about intervention granularity in decoding-time methods beyond RAG — e.g., in factuality enhancement, safety filtering, or style control.
Limitations on impact: The method is specifically designed for single-document English QA with entity-level conflicts. Multi-document, cross-lingual, and multimodal RAG remain untested. The honest scope statement — FIDES faithfully follows wrong evidence if retrieval errs — is important but limits end-to-end deployment without additional verification layers.
4. Timeliness & Relevance
The paper addresses a timely bottleneck. As RAG becomes standard in production LLM systems, the reliability of context-following behavior is a first-order concern. The training-free constraint is particularly relevant given that many deployments use frozen, API-served models. The scaling results to 70B are valuable as the field moves toward larger models.
The competitive landscape is active (CAD, AdaCAD, DeCoRe, DVD, COIECD, CoCoA, CLEAR), and the paper positions itself well within this space. The consistent gains of +3 to +13 points over the strongest training-free baseline across all 18 settings are convincing.
5. Strengths & Limitations
Key strengths:
Notable weaknesses:
Summary
FIDES makes a solid contribution to the active area of RAG faithfulness through a well-motivated conceptual insight (token-level conflict concentration) and a practical, training-free implementation. The comprehensive evaluation across 18 settings with consistent improvements is the paper's strongest selling point. The main limitations are the narrow evaluation domain (entity-swap QA) and the moderate theoretical depth. This is a well-executed engineering contribution with a useful conceptual framing, likely to influence follow-up work on adaptive decoding strategies.
Generated Jun 5, 2026
Comparison History (16)
FIDES addresses a fundamental and widely-recognized problem in RAG systems (retrieval-memory conflict) with a training-free approach that works across multiple model scales and architectures. Its key insight about token-level conflict concentration is novel and reframes contrastive decoding in a principled way. The training-free nature makes it immediately applicable, and RAG is a broadly adopted paradigm. Paper 2, while solid, addresses a more niche intersection (RLVR for multimodal reasoning) with a more complex framework. FIDES's broader applicability, stronger empirical gains across 18 settings, and foundational insight give it higher impact potential.
Paper 1 addresses a timely, broadly impactful issue at the intersection of AI, psychology, and policy. Its findings—that routine AI interactions incidentally reshape human emotional support preferences—have profound implications for regulation, mental health, and society. The large-scale longitudinal study with OpenAI provides compelling empirical evidence. Its cross-disciplinary relevance (psychology, HCI, policy, ethics) and timeliness amid rapid AI adoption give it exceptionally broad impact potential. Paper 2, while technically rigorous and valuable for the NLP community, addresses a narrower technical problem (RAG faithfulness) with more limited audience and societal implications.
Paper 2 identifies a fundamental and systemic limitation in current Vision Language Models regarding physical reasoning, evaluating over 100 models. By introducing a comprehensive benchmark that exposes a deep-seated flaw (reliance on textual priors over actual visual physics), it is highly likely to steer broad future research directions and inspire new model architectures. While Paper 1 offers a valuable, practical solution to a specific RAG issue, Paper 2's exposure of a core capability deficit in foundational models promises a broader and more transformative impact across the AI community.
Paper 1 addresses a fundamental and widespread problem in Large Language Models (retrieval-memory conflict in RAG) with a novel, training-free approach applicable across various LLM backbones. Its broad applicability in the rapidly growing generative AI field promises wider impact than Paper 2, which, while methodologically sound and highly relevant to healthcare, focuses on a domain-specific application (EHR data), limiting its breadth of impact across diverse fields.
FIDES addresses a fundamental and broadly relevant problem in RAG-based LLMs—retrieval-memory conflict—with a novel, training-free approach offering strong theoretical insight (token-level conflict concentration) and rigorous evaluation across multiple scales and benchmarks. Its breadth of applicability across all LLM-based RAG systems gives it wider cross-field impact. While MapAgent demonstrates impressive industrial deployment, it is more narrowly focused on autonomous driving map generation and represents more of an engineering integration than a fundamental methodological advance.
Paper 1 introduces a broadly applicable, training-free decoding method (FIDES) that targets a central, timely failure mode in RAG—retrieval vs. parametric-memory conflict—using token-level adaptive intervention from multiple internal signals, and demonstrates consistent gains across many benchmarks and model scales up to 70B. This combination of methodological novelty, immediate deployability, and wide relevance to LLM reliability makes its likely impact higher. Paper 2 offers an important diagnostic finding about convergence in LLM-driven program evolution, but it is more domain-specific and primarily descriptive, with less direct, general-purpose intervention.
Paper 2 (FIDES) likely has higher scientific impact: it introduces a novel, broadly applicable, training-free decoding method addressing a central failure mode in RAG (retrieval–parametric memory conflict) with a clear conceptual reframing (token-level conflict concentration) and strong cross-model, multi-benchmark gains. This has immediate real-world applicability for improving faithfulness in deployed LLM systems and can influence decoding/control research broadly. Paper 1 is a useful benchmark highlighting limitations, but benchmarks alone typically have narrower downstream impact unless they drive new methods; its application scope is more evaluative than solution-oriented.
TRACE addresses a fundamental challenge in multimodal time series foundation models—temporal misalignment and modality missingness—which is pervasive across healthcare, affective computing, and many other domains. Its contribution to the rapidly growing field of foundation models for time series, combined with its broad applicability across modalities and domains, gives it wider potential impact. FIDES, while technically strong and addressing an important RAG faithfulness problem, targets a more specific issue (retrieval-memory conflict in LLM decoding) with a training-free inference-time fix that may be superseded as models improve. TRACE's paradigm for conditional estimation under missingness has more foundational, cross-disciplinary relevance.
FIDES addresses a fundamental and widely recognized problem in RAG systems—retrieval-memory conflict—with a novel insight (token-level conflict concentration) that reframes contrastive decoding. It demonstrates strong empirical results across multiple benchmarks and model scales, is training-free (enabling broad adoption), and is highly timely given the explosion of RAG applications. Paper 1 extends ReMax to continuous action spaces with solid theoretical analysis but offers more incremental contributions (comparable to SAC performance) in a narrower RL subfield. Paper 2's broader applicability to LLM deployment gives it higher impact potential.
Paper 2 has higher potential impact due to a more novel framing (typed federated artifacts as the unit of collaboration) that enables new guarantees and operations (schema-aware merging, per-field DP, cross-architecture transfer) in an important, timely setting: federation across heterogeneous, frozen LLMs without sharing data/weights. This could influence privacy-preserving ML systems, federated learning theory/practice, and tool-using LLM infrastructure broadly. Paper 1 is strong and practical for RAG fidelity, but is more incremental within decoding/control methods and narrower in cross-field reach.
Paper 2 addresses a critical and widespread issue in modern LLMs (retrieval-memory conflict in RAG). Its training-free decoding approach has immediate, broad applicability across NLP systems. Furthermore, its methodological rigor is stronger, testing across 18 settings and scaling up to 70B models, giving it a broader and more immediate real-world impact compared to the niche focus on visual spatial planning in Paper 1.
Paper 2 has higher likely impact: it is the first systematic, tool-validated evaluation of NL-to-TLA+ synthesis across 30 LLMs with a released dataset/framework, enabling reproducible benchmarking and follow-on research in formal methods, program synthesis, and LLM reliability. Its negative results and identified hallucination modes are broadly actionable for both academia and industry verification workflows. Paper 1 is a strong, practical decoding innovation for RAG faithfulness, but it is more incremental within an active subarea and is less likely to reshape evaluation standards or cross-field practice than a foundational benchmark study in formal specification generation.
Paper 2 (ALE) likely has higher impact: it introduces a large, industry-validated, long-horizon benchmark with verifiable outcomes spanning 1K+ tasks across 13 clusters, addressing a central bottleneck in AI-to-economy translation. Its breadth enables cross-field influence (agents, evaluation, economics, HCI, software engineering) and timeliness given rapid agent deployment. Methodologically, expert collaboration and a living, continuously expanding task pool increase relevance and longevity. Paper 1 is a strong, novel decoding method for RAG faithfulness, but its impact is narrower and likely incremental relative to broader evaluation infrastructure.
FIDES addresses a widely recognized and practical problem in RAG systems—retrieval-memory conflicts—with a principled, training-free solution that demonstrates strong empirical results across 18 settings and scales to 70B models. The insight about token-level conflict concentration is novel and actionable, with broad applicability to the rapidly growing RAG ecosystem. Paper 1 introduces an interesting diagnostic probe but addresses a narrower problem (detecting implicit reward hacking) with limited scale (single 3B model, one dataset), making it more of a proof-of-concept with less immediate breadth of impact.
Paper 1 exposes a fundamental paradox in LLM alignment, demonstrating that enhanced safety awareness inherently introduces new vulnerabilities. This theoretical and empirical breakthrough challenges core assumptions in AI safety, offering broad implications that could force a critical rethinking of defense mechanisms across the field. In contrast, Paper 2 provides a highly effective but more specialized algorithmic improvement for RAG systems.
Paper 2 is more novel methodologically, proposing a new training-free, token-selective decoding approach (FIDES) using internal model signals to resolve retrieval–memory conflict in RAG, with broad applicability to many LLM systems and tasks. It reports extensive benchmarking across multiple datasets and model scales up to 70B, suggesting strong rigor and immediate relevance to a fast-moving area with wide cross-field impact (NLP, IR, trustworthy AI). Paper 1 is timely and societally important but is primarily an attributional measurement study with narrower methodological innovation and more limited transferability.