Interaction Locality in Hierarchical Recursive Reasoning

Yosuke Miyanishi, Tetsuro Morimura

May 20, 2026

arXiv:2605.20784v1

cs.AI (primary), cs.LG

#1248 of 2292 · Artificial Intelligence

Tournament Score: 1401 ± 43 (scale 1050 to 1800)

Win Rate: 61%

Rating: 4.8 / 10

Significance 5 · Rigor 6 · Novelty 5.5 · Clarity 5

Abstract

Spatial reasoning requires both location-bound computation and location-invariant structure: agents must make local moves while preserving route, object, or constraint-level plans. We propose interaction locality, a task-geometry-aware framework for measuring whether information flow stays within nearby cells or semantic segments, or crosses them. We instantiate the framework with sparse-autoencoder feature ablations and finite-noise activation patching, with structural Jacobian and attention checks reported in the appendix, and apply it to HRM and TRM, two compact hierarchical and recursive reasoning models, on Maze-Hard, Sudoku Extreme, and ARC-AGI. Across these models, activation patching gives the clearest architectural fingerprint: high-level recurrent states tend to write information within nearby cells or same-segment units, while repeated recursive updates accumulate these local writes into broader solution structure. This pattern holds across maze paths, Sudoku constraints, and ARC-AGI object neighborhoods, with the strongest concentration in TRM. To test whether interaction locality extends beyond toy-yet-challenging grid benchmarks, we also apply it to MTU3D, a large-scale embodied 3D scene-grounding model. In this MTU3D setting, causal spatial locality appears primarily at the transition where visual scene features are handed to the downstream grounding module, rather than uniformly throughout the visual encoder. This contrast suggests that the local-to-global handoff observed in HRM and TRM is tied to explicit recursive reasoning dynamics, while embodied 3D models may concentrate causal spatial structure at module boundaries. Interaction locality turns the intuitive local-execution/global-planning story into a reproducible measurement framework for recursive and embodied spatial reasoning.

AI Impact Assessments

(1 model)

Scientific Impact Assessment: "Interaction Locality in Hierarchical Recursive Reasoning"

1. Core Contribution

The paper proposes interaction locality, a measurement framework for assessing whether internal information flow in spatial reasoning models respects task-defined geometric neighborhoods (e.g., maze corridors, Sudoku boxes, ARC-AGI object regions). The framework is instantiated with three probe types: sparse autoencoder (SAE) feature ablations, finite-noise activation patching (the primary causal tool), and structural Jacobian/attention checks (appendix). It is applied to HRM and TRM—two compact hierarchical/recursive reasoning architectures—across Maze-Hard, Sudoku Extreme, and ARC-AGI, plus MTU3D, a 3D embodied grounding model on ScanNet.
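The task-defined neighborhoods mentioned above can be made concrete with a short sketch. The assessment later cites two definitions used in the paper: cells within distance 1 along the maze path, and cells in the same 3×3 Sudoku box. The Python below is an illustrative reconstruction under those two definitions, not the authors' code. The function names, the row-major 0 to 80 cell numbering, the `radius` parameter, and the assumption that a path visits each cell once are all introduced here for illustration.

```python
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]  # (row, col) on a maze grid


def sudoku_box(cell: int) -> int:
    """Index (0-8) of the 3x3 box holding a cell.

    Assumes cells are numbered 0-80 in row-major order (hypothetical layout).
    """
    row, col = divmod(cell, 9)
    return (row // 3) * 3 + col // 3


def same_box(a: int, b: int) -> bool:
    """True if two Sudoku cells fall in the same 3x3 box."""
    return sudoku_box(a) == sudoku_box(b)


def path_neighbors(path: List[Cell], radius: int = 1) -> Dict[Cell, Set[Cell]]:
    """Map each cell on a maze path to the other cells within `radius` steps along it.

    Assumes a simple path: each cell appears at most once.
    """
    neighbors: Dict[Cell, Set[Cell]] = {}
    for i, cell in enumerate(path):
        lo = max(0, i - radius)
        hi = min(len(path), i + radius + 1)
        neighbors[cell] = set(path[lo:hi]) - {cell}
    return neighbors
```

With these helpers, a Sudoku neighborhood test is `same_box(src, dst)` and a maze neighborhood test is `dst in path_neighbors(path)[src]`. Other definitions, such as shared row or column, would slot in the same way.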

The central finding is that H-level (high-level) recurrent states tend to write information *locally* within task-defined neighborhoods, while cross-cycle propagation accumulates these local writes into broader structure. This revises the naive "H = global, L = local" narrative. The MTU3D extension shows that causal spatial locality concentrates at the visual-to-grounding handoff rather than uniformly through the encoder, suggesting the local-to-global handoff is specifically tied to recursive reasoning dynamics.
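One way to read "writes information locally" is as a comparison of patching effects inside and outside a neighborhood. The sketch below assumes a hypothetical matrix `effect[src][dst]` holding the measured effect at destination cell `dst` when the state at source cell `src` is patched, and summarizes it as two means. This is an assumed summary statistic for illustration only. The paper's exact metric, normalization, and baselines are not given in this assessment.

```python
from typing import Callable, List, Tuple


def locality_summary(
    effect: List[List[float]],
    is_neighbor: Callable[[int, int], bool],
) -> Tuple[float, float]:
    """Mean patching effect inside vs. outside the neighborhood.

    Self-effects (src == dst) are skipped so that they do not inflate the
    inside mean. Returns NaN for a group that has no pairs.
    """
    inside: List[float] = []
    outside: List[float] = []
    for src, row in enumerate(effect):
        for dst, value in enumerate(row):
            if src == dst:
                continue
            (inside if is_neighbor(src, dst) else outside).append(value)
    mean_inside = sum(inside) / len(inside) if inside else float("nan")
    mean_outside = sum(outside) / len(outside) if outside else float("nan")
    return mean_inside, mean_outside


# Toy data (made up): four cells on a line, neighbors are adjacent cells.
toy_effect = [
    [0.00, 0.75, 0.25, 0.00],
    [0.75, 0.00, 0.75, 0.25],
    [0.25, 0.75, 0.00, 0.75],
    [0.00, 0.25, 0.75, 0.00],
]
mean_in, mean_out = locality_summary(toy_effect, lambda s, d: abs(s - d) == 1)
print(mean_in, mean_out)  # the inside mean exceeds the outside mean on this toy data
```

A larger inside mean relative to the outside mean would correspond to the "local write" pattern described above; comparing the two at the H level and the L level is one way to frame the within-H versus within-L contrast.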

2. Methodological Rigor

Strengths in methodology:

The paper carefully distinguishes between structural (Jacobian, attention) and causal (activation patching) evidence, appropriately emphasizing that structural bias does not imply causal locality. The MTU3D dissociation, in which structural attention locality exists without causal recovery locality, is a genuinely informative finding.

Reliability diagnostics (self-drop calibration, noise-scale selection) are thoughtfully designed. The 30% self-drop target and SNR=1 calibrations are explicitly stated.

The triangulation across multiple probes (Table 3) is well-organized, with clear delineation of what each probe can and cannot show.

Confidence intervals are reported throughout with bootstrap methods.
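For readers unfamiliar with bootstrap intervals, the sketch below shows a percentile bootstrap for the mean of per-sample scores. The assessment does not say which bootstrap variant the paper uses, so the percentile method, the resample count, and the function name `bootstrap_ci` are assumptions made here, and the example scores are invented.

```python
import random
from statistics import mean
from typing import Sequence, Tuple


def bootstrap_ci(
    values: Sequence[float],
    n_boot: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # seeded for reproducibility
    n = len(values)
    # Resample with replacement n_boot times and record each resample's mean.
    boot_means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_boot))
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    return boot_means[lo_idx], boot_means[hi_idx]


# Invented per-sample locality scores for 30 samples.
scores = [0.30 + 0.01 * (i % 7) for i in range(30)]
low, high = bootstrap_ci(scores)
print(f"mean={mean(scores):.3f}, 95% CI=({low:.3f}, {high:.3f})")
```

With only 30 to 50 samples, intervals of this kind can be wide, which is why small differences between conditions deserve a check against the interval width.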

Weaknesses:

Sample sizes are modest: n=30–50 for patching experiments, 30 ScanNet scenes for MTU3D. While bootstrap CIs are provided, the small samples limit generalizability claims.

The paper analyzes *released checkpoints* rather than conducting controlled training sweeps. This means the findings describe properties of specific trained models rather than establishing that interaction locality is an intrinsic property of the architectures.

The ARC-AGI comparison is confounded: HRM uses ARC-AGI-2 and TRM uses ARC-AGI-1, acknowledged but not resolved.

The neighborhood definitions are somewhat arbitrary (distance ≤1 along maze path, same 3×3 box for Sudoku). The paper acknowledges this but doesn't systematically explore sensitivity to neighborhood definition, though Section J begins addressing row/column constraints.

The SAE training details (512→2048 dictionary, λ₁=10⁻³) are briefly stated without ablation studies on these hyperparameters.
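For orientation, a commonly used sparse-autoencoder objective with an L1 sparsity penalty is written out below. It assumes that "512→2048" means a 512-dimensional activation encoded into 2048 dictionary features and that λ₁ weights the L1 term. The symbols W_e, b_e, W_d, and b_d are notation introduced here, and the paper's exact formulation may differ.

```latex
\begin{aligned}
f(x) &= \mathrm{ReLU}(W_e x + b_e), & W_e &\in \mathbb{R}^{2048 \times 512},\\
\hat{x} &= W_d\, f(x) + b_d, & W_d &\in \mathbb{R}^{512 \times 2048},\\
\mathcal{L}(x) &= \lVert x - \hat{x} \rVert_2^2 + \lambda_1 \lVert f(x) \rVert_1, & \lambda_1 &= 10^{-3}.
\end{aligned}
```

Under this form, a feature ablation sets one coordinate of f(x) to zero before decoding, so both the dictionary size and λ₁ determine which features exist to be ablated. That dependence is why the missing hyperparameter ablations matter.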

3. Potential Impact

The framework addresses a genuine gap: mechanistic interpretability for spatial reasoning models lacks a unified coordinate system for comparing locality across tasks and architectures. The idea of measuring information flow against task geometry is conceptually clean and could become a useful diagnostic tool.

Practical applications include: (1) diagnosing whether recursive reasoning models develop the intended local-to-global computation, (2) guiding locality-aware training objectives (though none are evaluated here), and (3) providing a common language for comparing spatial reasoning across domains (grids, 3D scenes, etc.).

However, impact is limited by: the niche scope of the models studied (HRM/TRM are compact research models, not widely deployed systems), the absence of training-time experiments showing that locality diagnostics actually improve model design, and the relatively descriptive nature of findings—the framework reveals properties but doesn't yet prescribe improvements.

4. Timeliness & Relevance

The paper is timely in several respects. Mechanistic interpretability is a rapidly growing field, and extending it beyond language models to spatial reasoning is valuable. The studied architectures (HRM, TRM) are very recent (2025), and the integration with embodied 3D models (MTU3D) addresses the growing interest in grounded spatial reasoning. The connection between recursive computation and spatial locality is relevant to ongoing debates about how compact models solve complex reasoning tasks.

However, the benchmarks remain "toy-yet-challenging" (the paper's own characterization), and the MTU3D extension, while promising, is limited to 30 scenes with primarily negative findings (no causal locality inside the encoder).

5. Strengths & Limitations

Key Strengths:

Conceptual clarity: the local-execution/global-planning dichotomy is formalized into a measurable quantity with well-defined baselines.

The MTU3D dissociation (structural bias ≠ causal locality) is the paper's most compelling and surprising finding—it demonstrates the framework's diagnostic value beyond confirming expected patterns.

Thorough appendix with extensive supplementary analyses, heatmaps, and per-sample diagnostics.

Code availability enables reproduction.

Key Limitations:

The framework is descriptive rather than prescriptive: it measures locality but doesn't demonstrate how to use measurements to improve models.

The "within-H ≥ within-L" finding, while consistent across model-task pairs, has varying effect sizes. For HRM/Sudoku, the gap is tiny (0.374 vs. 0.371), raising questions about practical significance.

The paper is dense with measurements but light on mechanistic narrative—it's not always clear *why* H-level writes are local or what computational role this serves.

The cross-architecture comparison is limited by the fundamental differences between HRM (separate modules) and TRM (shared module), making it hard to attribute differences to specific design choices.

Heavy reliance on appendix material for key evidence weakens the main narrative.

Overall Assessment

This is a carefully executed interpretability study that introduces a useful conceptual framework (interaction locality) and provides extensive empirical evidence across multiple tasks and architectures. The core insight—that what appears "global" in architecture labels may be causally local in information flow—is valuable. However, the impact is constrained by the narrow scope of studied models, modest sample sizes, purely diagnostic (rather than prescriptive) nature of findings, and the distance between the current grid-world applications and real-world deployment scenarios. The MTU3D extension is the most forward-looking contribution but yields primarily negative results within the encoder.

Rating: 4.8 / 10

Significance 5 · Rigor 6 · Novelty 5.5 · Clarity 5

Generated May 21, 2026

Comparison History (23)

vs. Insights Generator: Systematic Corpus-Level Trace Diagnostics for LLM Agents

gemini-3.1 · 5/22/2026

Paper 1 addresses a critical and universal bottleneck in current AI research and deployment: debugging complex LLM agents at scale. By automating corpus-level trace diagnostics, it offers a highly practical, scalable tool that directly improves downstream agent performance (as evidenced by a 30.4pp gain). Its immediate real-world applicability and timeliness give it a broader and more immediate scientific impact compared to Paper 2, which focuses on a more niche, albeit rigorous, mechanistic interpretability framework for spatial reasoning models.

vs. TO-Agents: A Multi-Agent AI Pipeline for Preference-Guided Topology Optimization

gpt-5.2 · 5/22/2026

Paper 2 likely has higher scientific impact due to its methodological novelty and breadth: it proposes a general, reproducible measurement framework (interaction locality) for probing information flow in hierarchical/recursive reasoning and extends analysis across multiple benchmarks and an embodied 3D model, increasing cross-field relevance (interpretability, reasoning, robotics/vision). Its rigor (causal interventions via activation patching/feature ablations, auxiliary checks) supports strong, testable claims. Paper 1 is application-rich and timely, but is more of an engineering integration of existing agent/VLM components with limited generalization and modest success rates, likely narrowing impact.

vs. MindLoom: Composing Thought Modes for Frontier-Level Reasoning Data Synthesis

claude-opus-4.6 · 5/22/2026

MindLoom addresses a broadly impactful problem—synthesizing high-quality reasoning training data for LLMs—with a practical, open-sourced framework validated across nine benchmarks, multiple model families, and five STEM disciplines. Its compositional thought-mode engineering offers a novel, scalable methodology with immediate real-world applications in LLM training. Paper 2 introduces an interesting analytical framework (interaction locality) for understanding spatial reasoning in neural networks, but its scope is narrower, focused on interpretability of specific model architectures on grid-based tasks, with less immediate broad applicability and adoption potential.

vs. Personality Engineering with AI Agents: A New Methodology for Negotiation Research

claude-opus-4.6 · 5/21/2026

Paper 1 introduces 'personality engineering,' a novel interdisciplinary methodology bridging AI and negotiation theory with broad applicability across social sciences, business, and AI agent design. It leverages the well-established interpersonal circumplex as a coordinate system, offering both theoretical contributions and practical implications for designing AI negotiation agents—a timely topic given LLM proliferation. Paper 2, while technically rigorous, addresses a narrower question about information flow locality in specialized reasoning models on grid-based tasks, with more limited audience and applicability beyond the interpretability/mechanistic analysis community.

vs. Beyond Rational Illusion: Behaviorally Realistic Strategic Classification

gemini-3.1 · 5/21/2026

Paper 2 introduces a novel interdisciplinary approach by integrating Prospect Theory into strategic classification, addressing a critical flaw in traditional ML models that assume perfect rationality. This bridging of machine learning and behavioral economics offers broader real-world applicability (e.g., in finance, hiring, public policy) compared to Paper 1, which focuses on a highly specific mechanistic interpretability framework for spatial reasoning models. Paper 2's potential to influence both AI fairness/robustness and behavioral science gives it a broader and higher estimated scientific impact.

vs. Playing Devil's Advocate: Off-the-Shelf Persona Vectors Rival Targeted Steering for Sycophancy

gpt-5.2 · 5/21/2026

Paper 2 likely has higher impact: it targets an urgent, widely relevant safety/alignment failure mode (sycophancy) in mainstream LLM deployments, proposes a practical mitigation using off-the-shelf persona vectors (no labeled sycophancy data), and reports strong performance with less accuracy tradeoff than CAA. Its finding that sycophancy is not a single steerable direction but relates to broader persona geometry could influence steering, evaluation, and interpretability work across many models and applications. Paper 1 is novel but narrower, focused on specialized spatial/recursive reasoning benchmarks and analysis tooling.

vs. For How Long Should We Be Punching? Learning Action Duration in Fighting Games

claude-opus-4.6 · 5/21/2026

Paper 2 introduces a novel measurement framework ('interaction locality') with broader applicability across multiple reasoning domains (mazes, Sudoku, ARC-AGI, 3D embodied reasoning). It provides mechanistic interpretability insights into how hierarchical and recursive models process spatial information, bridging toy benchmarks and real-world embodied AI. Paper 1, while addressing a valid problem in RL for fighting games, presents incremental findings (learned timing matches fixed frame skips but doesn't ensure robustness) in a narrow application domain. Paper 2's framework has greater potential for cross-field impact in interpretability, spatial reasoning, and architecture design.

vs. Library Drift: Diagnosing and Fixing a Silent Failure Mode in Self-Evolving LLM Skill Libraries

gpt-5.2 · 5/21/2026

Paper 2 has higher impact potential: it identifies a broadly relevant, practically urgent failure mode in self-improving LLM agent systems (“library drift”), provides a clear causal trigger, introduces actionable diagnostics, and demonstrates a substantial, quantified fix with multiple ablations—strong methodological rigor and immediate real-world applicability. Its governance recipe and instrumentation can generalize across many agent/tooling frameworks, affecting reliability and long-horizon autonomy. Paper 1 offers a useful interpretability measurement framework, but it is narrower in application (spatial/recursive models) and its primary contribution is analytic rather than a widely deployable reliability improvement.

vs. AiraXiv: An AI-Driven Open-Access Platform for Human and AI Scientists

gemini-3.1 · 5/21/2026

Paper 1 offers a novel, methodologically rigorous framework for mechanistic interpretability in AI spatial reasoning. By providing a reproducible measurement framework for understanding how models execute local vs. global planning, it tackles a fundamental scientific question in AI. Paper 2, while proposing a timely infrastructural platform for academic publishing, functions more as a community tool than a fundamental scientific advancement.

vs. Declarative Data Services: Structured Agentic Discovery for Composing Data Systems

claude-opus-4.6 · 5/21/2026

Paper 2 introduces 'interaction locality,' a novel measurement framework with broader applicability across spatial reasoning, interpretability, and embodied AI. It provides rigorous mechanistic analysis using sparse autoencoders, activation patching, and structural Jacobians across multiple benchmarks (Maze-Hard, Sudoku, ARC-AGI, MTU3D). Its insights into local-to-global information flow in neural networks have wide relevance for interpretability research. Paper 1, while addressing a practical problem in data-system composition, is positioned as an early prototype with limited evaluation ('proof of life'), narrower scope, and less generalizable contributions.

vs. Explainable Wastewater Digital Twins: Adaptive Context-Conditioned Structured Simulators with Self-Falsifying Decision Support

gemini-3.1 · 5/21/2026

Paper 2 addresses foundational questions in AI interpretability and spatial reasoning, employing cutting-edge techniques like sparse autoencoders and activation patching. Its insights into hierarchical recursive reasoning models and testing on benchmarks like ARC-AGI offer broad methodological implications for the broader machine learning community. While Paper 1 provides a rigorous, highly valuable application for safety-critical wastewater management, Paper 2's theoretical contributions to understanding neural network reasoning dynamics promise a wider scientific impact across multiple subfields of AI.

vs. \ECUAS{n}: A family of metrics for principled evaluation of uncertainty-augmented systems

claude-opus-4.6 · 5/21/2026

Paper 2 addresses a fundamental and broadly applicable problem in evaluating uncertainty-augmented systems across all of AI/ML, proposing a principled metric family grounded in proper scoring rules. This has wide applicability to any high-stakes decision-making system (medical, autonomous driving, NLP, etc.), affecting how the entire community evaluates uncertainty. Paper 1, while interesting, is more niche—focused on measuring interaction locality in specific hierarchical reasoning architectures on grid-based tasks. Paper 2's theoretical grounding, generality, and relevance to the growing need for reliable AI systems give it broader potential impact.

vs. Causal Evidence for Attention Head Imbalance in Modality Conflict Hallucination

gpt-5.2 · 5/21/2026

Paper 2 likely has higher impact due to direct relevance to a widely observed, high-stakes failure mode in deployed multimodal LLMs (hallucinations under modality conflict), broad applicability across multiple open-source MLLMs, and a concrete, actionable mitigation (MACI) with strong benchmark gains and transfer. Its causal head-level analysis plus intervention provides a clear mechanistic story and an immediately usable inference-time method. Paper 1 offers a valuable measurement framework for locality in recursive/spatial reasoning, but its applications are narrower and more interpretability-focused with less immediate real-world payoff.

vs. VBFDD-Agent for Electric Vehicle Battery Fault Detection and Diagnosis: Descriptive Text Modeling of Battery Digital Signals

gpt-5.2 · 5/21/2026

Paper 2 likely has higher scientific impact due to its novel, general measurement framework (interaction locality) for probing information flow in hierarchical/recursive reasoning, with methodologically rigorous causal analyses (activation patching, feature ablations, checks) across multiple challenging benchmarks and an embodied 3D model. Its contributions are broadly relevant to interpretability, reasoning architectures, and spatial cognition in ML, making it timely and cross-field. Paper 1 is application-valuable for EV battery maintenance and LLM-based decision support, but is more domain-specific and may rely more on engineering integration than a broadly reusable scientific framework.

vs. AgentCo-op: Retrieval-Based Synthesis of Interoperable Multi-Agent Workflows

gemini-3.1 · 5/21/2026

Paper 1 presents a highly practical and timely framework for automating and synthesizing multi-agent workflows, with direct, high-impact applications in real-world scientific discovery such as open-world genomics. Its ability to improve interoperability and reduce costs across diverse domains (science, coding, math) gives it broader immediate utility and transformational potential compared to Paper 2, which focuses more narrowly on theoretical interpretability and architectural analysis of spatial reasoning models.

vs. High Quality Embeddings for Horn Logic Reasoning

gpt-5.2 · 5/21/2026

Paper 2 is likely higher impact due to a more novel, general measurement framework (“interaction locality”) with broad applicability across architectures (HRM/TRM) and domains (mazes, Sudoku, ARC-AGI, plus a large-scale 3D grounding model). It emphasizes causal interpretability methods (activation patching, feature ablations) and produces cross-task, cross-model insights about local-to-global computation, which can influence both model design and evaluation. Paper 1 improves embedding training for Horn-logic-guided search, useful but narrower in scope and closer to incremental optimization within a specialized reasoning pipeline.

vs. Evaluating Temporal Semantic Caching and Workflow Optimization in Agentic Plan-Execute Pipelines

gpt-5.2 · 5/21/2026

Paper 1 introduces a novel, task-geometry-aware interpretability framework (interaction locality) with multiple causal attribution instantiations and applies it across diverse reasoning benchmarks (mazes, Sudoku, ARC-AGI) plus an embodied 3D grounding model, yielding general insights about local-to-global computation in recursive architectures. This is timely for mechanistic interpretability and reasoning-centric model design, with broad relevance across ML subfields. Paper 2 is highly practical and impactful for industrial agent systems, but centers on engineering optimizations (caching/parallelism) whose scientific novelty and cross-field conceptual reach are narrower despite strong real-world applicability.

vs. Mahjax: A GPU-Accelerated Mahjong Simulator for Reinforcement Learning in JAX

gpt-5.2 · 5/21/2026

Paper 2 introduces a new, general measurement framework (“interaction locality”) for probing information flow in hierarchical/recursive reasoning, and validates it across multiple reasoning benchmarks (mazes, Sudoku, ARC-AGI) plus an embodied 3D model. This offers broader cross-field impact (interpretability, reasoning architectures, embodied AI) and timely relevance to understanding model mechanisms with causal tools (patching/ablations). Paper 1 is a strong engineering contribution enabling faster RL in Mahjong, but its novelty is more infrastructural and its impact is narrower to game/RL benchmarking.

vs. Mahjax: A GPU-Accelerated Mahjong Simulator for Reinforcement Learning in JAX

gpt-5.2 · 5/21/2026

Paper 1 offers a novel, generalizable measurement framework (“interaction locality”) for probing causal information flow in hierarchical/recursive reasoning and extends it across multiple challenging reasoning benchmarks plus an embodied 3D model, suggesting broader cross-domain impact and timely relevance to mechanistic interpretability and reasoning architectures. Its methodological toolkit (feature ablations, activation patching, Jacobian/attention checks) supports rigor. Paper 2 is valuable infrastructure with clear practical utility for RL research, but its core contribution is an optimized simulator/benchmark (narrower conceptual novelty and field breadth) compared to Paper 1’s potentially foundational analysis framework.

vs. Position: The Turing-Completeness of Real-World Autoregressive Transformers Relies Heavily on Context Management

gpt-5.2 · 5/21/2026

Paper 2 introduces a new, operational measurement framework (“interaction locality”) with concrete instantiations (feature ablations, activation patching, Jacobian/attention checks) and validates it across multiple challenging benchmarks and an embodied 3D model, suggesting broad applicability and near-term utility for mechanistic interpretability and model design. Paper 1 provides important conceptual clarification about Turing-completeness claims and context management, but it is largely a position/theory reframing with less direct methodological or empirical leverage. Overall, Paper 2 is more actionable, cross-domain, and timely for current interpretability and reasoning research.