Remember the Decision, Not the Description: A Rate-Distortion Framework for Agent Memory
Mingxi Zou, Zhihan Guo, Langzhang Liang, Zhuo Wang, Qifan Wang, Qingsong Wen, Irwin King, Lizhen Qu
Abstract
Long-horizon language agents must operate under limited runtime memory, yet existing memory mechanisms often organize experience around descriptive criteria such as relevance, salience, or summary quality. For an agent, however, memory is valuable not because it faithfully describes the past, but because it preserves the distinctions between histories that must remain separated under a fixed budget to support good decisions. We cast this as a decision-centric rate-distortion problem, measuring memory quality by the loss in achievable decision quality induced by compression. This yields an exact forgetting boundary for what can be safely forgotten, and a memory-distortion frontier characterizing the optimal tradeoff between memory budget and decision quality. Motivated by this decision-centric view of memory, we propose DeMem, an online memory learner that refines its partition only when data certify that a shared state would induce decision conflict, and prove near-minimax regret guarantees. On both controlled synthetic diagnostics and long-horizon conversational benchmarks, DeMem yields consistent gains under the same runtime budget, supporting the principle that memory should preserve the distinctions that matter for decisions, not descriptions.
AI Impact Assessments
Scientific Impact Assessment: "Remember the Decision, Not the Description: A Rate-Distortion Framework for Agent Memory"
1. Core Contribution
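This paper reframes agent memory from a descriptive compression task to a decision-centric compression problem. The key insight is that memory should preserve distinctions between histories that lead to different optimal actions, rather than preserving descriptive fidelity. As the abstract lays out, the formalization rests on three pieces: an exact forgetting boundary for what can be safely forgotten, a memory-distortion frontier characterizing the tradeoff between memory budget and decision quality, and DeMem, an online memory learner with near-minimax regret guarantees.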
The NP-completeness result (Theorem 3) for optimal memory partitioning further motivates the greedy online approach. The conceptual reframing—"remember the decision, not the description"—is both memorable and actionable.
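The idea of refining a partition only when data certify a decision conflict can be illustrated with a small sketch. This is not the paper's test: it assumes rewards in [0, 1], uses a plain Hoeffding confidence radius, and the names `plausible_actions` and `conflict_certified` are hypothetical. Two histories that currently share a memory state are split only when no single action remains plausibly optimal for both.

```python
import math


def plausible_actions(means, counts, delta=0.05):
    """Actions that could still be optimal given Hoeffding intervals.

    Assumption: rewards lie in [0, 1]; means[a] is the empirical mean
    reward of action a and counts[a] its number of samples.
    """
    radius = [math.sqrt(math.log(2 / delta) / (2 * n)) for n in counts]
    best_lower = max(m - r for m, r in zip(means, radius))
    return {
        a
        for a, (m, r) in enumerate(zip(means, radius))
        if m + r >= best_lower
    }


def conflict_certified(stats_a, stats_b, delta=0.05):
    """True when no action is plausibly optimal for both histories.

    Each stats argument is a (means, counts) pair for one history.
    """
    set_a = plausible_actions(stats_a[0], stats_a[1], delta)
    set_b = plausible_actions(stats_b[0], stats_b[1], delta)
    return set_a.isdisjoint(set_b)


# Same empirical means, different sample sizes (illustrative numbers only).
few = conflict_certified(([0.9, 0.1], [5, 5]), ([0.1, 0.9], [5, 5]))
many = conflict_certified(([0.9, 0.1], [500, 500]), ([0.1, 0.9], [500, 500]))
```

With only five samples per action the intervals overlap, so the shared state is kept; with five hundred samples the conflict is certified and a split would be triggered. The design choice in this sketch is conservative: a split consumes memory budget, so it is made only when the evidence rules out a common action.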
2. Methodological Rigor
The theoretical development is thorough and well-structured. The progression from the K-state memory constraint → decision distortion → forgetting boundary → covering/packing bounds → computational hardness → online algorithm → regret guarantees follows a natural logical chain.
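The first links of that chain can be made concrete with a toy example. The sketch below is an illustration under simplifying assumptions, not the paper's formal definitions: each history `h` has a known value table `q[h][a]`, histories occur with probability `p[h]`, a K-state memory is an assignment of histories to cells, and the agent must commit to one action per cell. Decision distortion is then the gap between full-memory value and the best value achievable under the partition. All names and numbers are hypothetical.

```python
from itertools import product


def achievable_value(q, p, assignment, k):
    """Best expected value when one action must be chosen per memory cell."""
    n_actions = len(q[0])
    total = 0.0
    for cell in range(k):
        members = [h for h, c in enumerate(assignment) if c == cell]
        if members:
            total += max(
                sum(p[h] * q[h][a] for h in members) for a in range(n_actions)
            )
    return total


def decision_distortion(q, p, assignment, k):
    """Loss in achievable value relative to remembering every history."""
    full = sum(p[h] * max(q[h]) for h in range(len(q)))
    return full - achievable_value(q, p, assignment, k)


def frontier(q, p, max_k):
    """Minimum distortion for each budget K, by brute force (toy sizes only)."""
    n = len(q)
    return {
        k: min(
            decision_distortion(q, p, a, k)
            for a in product(range(k), repeat=n)
        )
        for k in range(1, max_k + 1)
    }


# Four histories, three actions; histories 0 and 1 share an optimal action.
q = [
    [1.0, 0.0, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
p = [0.25, 0.25, 0.25, 0.25]
curve = frontier(q, p, 4)
```

In this toy, the minimum distortion is about 0.5 at K=1, about 0.25 at K=2, and zero from K=3 onward. Histories 0 and 1 look different descriptively but share an optimal action, so merging them costs nothing; K=3 plays the role of the forgetting boundary here. The brute-force search is only feasible at toy sizes, which is consistent with the hardness result the review mentions.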
Strengths of the formal analysis:
Concerns:
3. Potential Impact
Theoretical impact: The decision-centric rate-distortion framing offers a principled alternative to the prevalent descriptive-similarity paradigm in agent memory. The forgetting boundary and memory-distortion frontier provide vocabulary and formal tools that could influence how the community thinks about memory budgets. The connection to state abstraction in RL (bisimulation, homomorphisms) is well-drawn and could stimulate cross-pollination.
Practical impact: The empirical results are strong. On LoCoMo, DeMem achieves 91.1% (GPT-4o-mini) vs. 88.8% for the next best method (Mnemis). The modularity result (Appendix E.12) showing that decision-aware selection can be dropped into RAG (+8.8%) and EMem-G (+6.2%) as a component is particularly compelling for adoption. Results generalize across GPT-4o-mini, GPT-4.1-mini, and Llama-3.1-70B backbones.
The mismatch analysis (Appendix E.13) is convincing: description similarity has only ρ=0.103 Spearman correlation with evidence compatibility (AUC=0.548), and 85% of description-based retrieval failures trace to evidence miss or dilution—establishing the practical relevance of the theoretical concern.
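For readers unfamiliar with the two statistics quoted above, here is a minimal pure-Python sketch of how a Spearman rank correlation and an AUC can be computed. It is not the paper's evaluation code; the function names are hypothetical. For calibration: a Spearman value near 0 means the two rankings are nearly unrelated, and an AUC near 0.5 means the score separates positives from negatives barely better than chance.

```python
def ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            result[order[t]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def auc(scores, labels):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. Labels are 1 (positive) or 0 (negative).
    """
    pos = [s for s, label in zip(scores, labels) if label == 1]
    neg = [s for s, label in zip(scores, labels) if label == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))
```

In a mismatch analysis of this kind, `scores` would be description-similarity values for candidate memories and `labels` would mark whether each candidate actually contains compatible evidence; how the paper defines those quantities is not specified in this review.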
4. Timeliness & Relevance
This paper addresses a genuine bottleneck. As agents tackle longer horizons (multi-session dialogue, agentic task completion), memory management becomes critical. Recent benchmarks (LoCoMo, LongMemEval, MemoryArena) consistently show that existing memory systems fail at long-term integration. The decision-centric perspective arrives at a time when the community is actively debating how to organize agent memory beyond simple retrieval augmentation.
The paper also connects to the broader "decision-centric AI" movement, providing concrete formal tools where prior work offered programmatic guidance.
5. Strengths & Limitations
Key strengths:
Notable limitations:
Additional Observations
The paper is exceptionally well-organized given its density—50+ pages of appendices with careful documentation of every design choice, ablation, and diagnostic. The alignment table (Appendix D.8) mapping theory to implementation is a model of transparency. The error attribution analysis (Table 14) closing the diagnostic chain from weak proxy → retrieval gap → downstream failure is particularly well-executed.
The work's broadest contribution may be conceptual: establishing that memory quality should be measured by decision preservation rather than information preservation, with formal backing for when this distinction matters.
Generated May 12, 2026
Comparison History (25)
Paper 2 (KISS) has higher potential scientific impact due to its broader real-world applicability and interdisciplinary reach. It addresses a critical democratization challenge—making complex Earth science simulation models accessible to communities most affected by climate risk. The empirical validation across 119 knowledge infrastructures spanning 14 Earth-science domains demonstrates remarkable generalizability. While Paper 1 offers elegant theoretical contributions (rate-distortion framework for agent memory), its impact is more narrowly scoped to the AI/agent memory community. Paper 2's potential to transform how scientific simulation knowledge is shared and operationalized across diverse communities gives it substantially wider societal and scientific impact.
Paper 2 addresses a fundamental theoretical question about the reliability of world models in RL, establishing formal connections between reward hacking and model exploitation with impossibility results and safe horizon bounds. This has broad implications for AI safety, model-based RL, and alignment research—all highly timely topics. While Paper 1 presents a solid contribution with a novel rate-distortion framework for agent memory, Paper 2's theoretical contributions are more foundational, applicable across a wider range of RL settings, and directly relevant to the critical AI safety discourse, giving it higher potential for cross-field impact and citations.
Paper 2 introduces a novel theoretical framework (decision-centric rate-distortion) for agent memory that is more fundamentally innovative, with formal guarantees and broad applicability across all long-horizon agent settings. While Paper 1 (PyRAG) is a solid engineering contribution improving multi-hop RAG through code-based reasoning, it is more incremental within an already crowded RAG optimization space. Paper 2's theoretical grounding—exact forgetting boundaries, memory-distortion frontiers, and near-minimax regret guarantees—provides deeper foundational insights that could reshape how memory is designed across diverse AI agent architectures.
Paper 1 introduces a novel theoretical framework (rate-distortion theory for agent memory) with formal guarantees, addressing a fundamental problem in long-horizon language agents. It provides both theoretical contributions (exact forgetting boundary, memory-distortion frontier, near-minimax regret guarantees) and practical algorithms (DeMem). This principled reconceptualization of memory from descriptive to decision-centric has broad implications across AI, cognitive science, and reinforcement learning. Paper 2, while practical and timely, is more incremental—combining existing components (LLM agents, physics simulators) in a pipeline without deep theoretical novelty.
Paper 1 offers a more foundational, novel framing: decision-centric memory as a rate–distortion problem, yielding principled boundaries/frontiers and an online algorithm with near-minimax regret guarantees. This combination of theory + provable algorithmic contributions + applicability to long-horizon agents suggests broad impact across RL, information theory, and agent systems. Paper 2 is timely and practically valuable for LLM safety robustness, but is closer to an incremental alignment/RL recipe (on-policy recovery) with impact likely concentrated in safety engineering and dependent on empirical benchmarks, with less general theoretical advance.
Paper 1 introduces a fundamental, theoretically grounded framework (rate-distortion) for a critical bottleneck in AI: long-horizon agent memory. Its formulation offers a principled shift from descriptive to decision-centric memory, backed by near-minimax regret guarantees. While Paper 2 provides valuable empirical insights into evaluation biases of LLM-as-a-judge, Paper 1's foundational approach to agent architecture is likely to spur broader methodological innovations and long-term advancements across reinforcement learning and language agents.
Paper 2 has higher estimated impact: it introduces a broadly applicable, principled rate–distortion formulation of agent memory tied directly to decision quality, plus an online algorithm (DeMem) with near-minimax regret guarantees and demonstrated benchmark gains. This combines novelty with methodological rigor and clear real-world relevance to long-horizon LLM agents across many domains. Paper 1 is strong and timely for large-scale multi-agent social simulation and attribution, but its impact is narrower (specific to MAS attribution and nonlinear macro indicators) and depends more on access to million-agent settings/data, limiting breadth of adoption.
Paper 1 introduces a fundamental, theoretically grounded rate-distortion framework for agent memory with proven near-minimax regret guarantees. This decision-centric paradigm represents a significant theoretical and architectural shift with broad applicability across LLMs and reinforcement learning. Paper 2 offers a valuable but more narrowly focused evaluation framework for measuring terminal commitment in embodied agents, making Paper 1's foundational contribution likely to have a wider and deeper scientific impact.
Paper 1 introduces a novel theoretical framework (decision-centric rate-distortion) for agent memory that provides fundamental insights—exact forgetting boundaries, memory-distortion frontiers, and near-minimax regret guarantees. This principled reformulation of memory for language agents has broad applicability across agent architectures and decision-making domains. Paper 2, while valuable as a benchmark, is more incremental—it extends evaluation methodology to workspace tasks. Benchmarks have impact but are more easily superseded, whereas Paper 1's theoretical contributions and the DeMem algorithm offer lasting conceptual and practical advances for the growing field of long-horizon agents.
Paper 1 addresses the highly timely area of long-horizon language agents. By grounding agent memory in rate-distortion theory rather than heuristics, it offers a rigorous, novel, and broadly applicable framework. Its combination of theoretical guarantees and empirical improvements gives it a higher potential for widespread adoption in modern AI compared to Paper 2, which offers a valuable but more niche theoretical advancement in Boolean function representations and formal logic.
Paper 1 offers a more foundational contribution by recasting agent memory as a decision-centric rate-distortion problem, providing theoretical guarantees (exact forgetting boundary, memory-distortion frontier, near-minimax regret bounds) alongside practical algorithms. This principled framework has broader impact potential across memory systems, information theory, and agent design. Paper 2 presents a solid engineering contribution with weighted graph traversal and RL-based optimization, but is more incremental in nature—combining existing techniques (GNNs, RL, vector search) without the same theoretical depth or conceptual novelty that could reshape how the field thinks about agent memory.
Paper 2 is more impactful due to a clearer conceptual reframing (decision-centric memory as rate–distortion), stronger methodological rigor (formal forgetting boundary, memory–distortion frontier, near-minimax regret guarantees), and broader applicability across long-horizon agent design, RL, compression, and systems. Its contributions are likely to generalize beyond specific LLMs and directly influence practical memory architectures under runtime constraints. Paper 1 provides useful diagnostics for calibration pockets, but is explicitly exploratory, partly falsifies its human hypothesis, and its impact is narrower (evaluation/benchmarking) with less theoretical grounding.
Paper 2 offers a foundational, mathematically grounded approach to agent memory by framing it as a rate-distortion problem. Its rigorous theoretical contributions, including near-minimax regret guarantees and a shift from descriptive to decision-centric memory, provide a highly generalizable framework. While Paper 1 presents a highly practical systems-level innovation for LLM agents, Paper 2's deep methodological rigor and potential to influence broader fields like reinforcement learning, cognitive modeling, and general AI give it a higher potential for lasting scientific impact.
Paper 1 offers a novel, formal decision-centric rate–distortion framework for agent memory, derives concrete optimality frontiers, and proposes an online algorithm (DeMem) with near-minimax regret guarantees plus empirical validation. This combination of theoretical rigor and actionable methodology is likely to influence work on long-horizon agents, memory systems, and RL/LLM architectures, with clear real-world applicability under runtime constraints. Paper 2 is timely and potentially broad but is primarily conceptual/interpretive with less methodological or empirical grounding, making its direct scientific/technical impact less predictable.
Paper 2 introduces a strong conceptual shift in agent memory from descriptive to decision-centric criteria, grounded in a rigorous rate-distortion framework with theoretical guarantees. Its application to long-horizon language agents addresses a highly relevant and rapidly growing field. While Paper 1 provides a valuable optimization for an existing interpretability method (TCAV), Paper 2 offers a more fundamental methodological innovation with broader implications for AI agent design and reinforcement learning.
Paper 1 introduces a fundamentally novel theoretical framework (decision-centric rate-distortion) for agent memory that challenges existing paradigms, provides rigorous mathematical foundations including exact forgetting boundaries and near-minimax regret guarantees, and addresses a core challenge in long-horizon language agents. Its breadth of impact spans information theory, reinforcement learning, and LLM agents. Paper 2, while practically useful for urban mobility simulation, is more application-specific and incremental in its dual-LLM-agent approach. Paper 1's theoretical contributions have broader potential to reshape how memory is conceptualized across AI systems.
Paper 2 proposes a fundamental, theoretical framework for agent memory based on rate-distortion theory, offering broad applications across artificial intelligence, reinforcement learning, and language agents. Its rigorous mathematical grounding (minimax regret guarantees) and paradigm shift from descriptive to decision-centric memory provide high potential for broad scientific impact. Paper 1, while innovative in its use of the L3 rule for PPI prediction, addresses a narrower domain within computational biology and bioinformatics.
Paper 2 is likely higher impact: it introduces a principled, general rate-distortion formulation of agent memory centered on decision quality, yielding theoretical objects (forgetting boundary, memory–distortion frontier) and an online algorithm (DeMem) with near-minimax regret guarantees plus empirical validation on agent benchmarks. This combination of novelty, rigor, and broad applicability spans RL, LLM agents, information theory, and systems, and is timely given rapidly growing interest in long-horizon agents under tight context/memory limits. Paper 1 is strong but more domain-specific to molecular optimization workflows.
Paper 1 introduces a rigorous theoretical framework (rate-distortion) to agent memory with mathematical guarantees, offering broad, fundamental impact on reinforcement learning and LLM agent design. In contrast, Paper 2 is heavily tied to a specific engineering protocol (MCP) and focuses on integrating existing technologies, making its theoretical novelty and long-term scientific impact lower.
Paper 1 introduces a fundamental theoretical framework for agent memory using rate-distortion theory, supported by provable regret guarantees. By shifting the paradigm of memory from descriptive to decision-centric, it addresses a core bottleneck in AI agent design with broad, domain-agnostic applicability. Paper 2 presents an impressive applied system for automated research, but lacks the methodological rigor and foundational algorithmic novelty of Paper 1, making Paper 1's long-term scientific impact likely higher across the broader ML and RL communities.