Geometry of Human Perceptual Domains Emerges Transiently in LLM Representations
Simardeep Singh, Paras Chopra
Abstract
While large language models (LLMs) are trained purely on textual data, prior work has shown that their internal representations can exhibit rich geometric structure in embedding space. Building on this line of work, we investigate whether such structure is similar to human perceptual organisation across different domains (e.g., color, pitch, emotion, and taste). Specifically, we study the layer-wise emergence of intrinsic geometrical structure corresponding to perceptual modalities within the residual streams of multiple open-weight transformer architectures. Our results reveal three key findings. First, we observe the emergence of layer-wise geometric structure across multiple perceptual domains, despite the absence of any direct perceptual supervision during training. Second, these perceptual domains exhibit distinct emergence profiles, with both geometric structure and its alignment with human baselines following domain- and model-specific trajectories across depth. Third, this emergence follows a consistent representational trajectory: geometry is weak or diffuse in early layers, becomes progressively organised in intermediate layers, and is attenuated in later layers, suggesting that perceptual geometry arises transiently as part of the model's internal transformation pipeline. This provides new insight into how and where human-like perceptual geometry arises in LLMs, offering a principled pathway for mechanistic analysis of internal representations.
AI Impact Assessments
Scientific Impact Assessment
Core Contribution
This paper investigates whether the internal representations of large language models (LLMs) encode geometric structures that align with human perceptual organization across four sensory/affective domains: color, pitch, emotion, and taste. The central finding is that these perceptual geometries emerge transiently across model depth—weak in early layers, peaking in intermediate layers, and attenuating in later layers—despite the models receiving no perceptual supervision. The paper additionally documents that different perceptual domains exhibit distinct emergence profiles (e.g., taste peaks earlier and degrades faster; emotion is more persistent).
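The "weak, then peaking, then attenuating" pattern can be summarised per domain once a per-layer alignment score is available. The sketch below is illustrative only: the function name `emergence_profile`, the returned fields, and the example scores are assumptions made for this example, not quantities or values reported in the paper.

```python
def emergence_profile(scores):
    """Summarise a per-layer alignment curve (index 0 = first layer).

    `scores` is any sequence of per-layer alignment values, for example
    RSA correlations against a human baseline (hypothetical input).
    """
    if len(scores) < 3:
        raise ValueError("need at least three layers to talk about a transient peak")
    last = len(scores) - 1
    peak = max(range(len(scores)), key=lambda i: scores[i])
    return {
        "peak_layer": peak,
        "relative_depth": peak / last,             # 0.0 = first layer, 1.0 = last layer
        "rise": scores[peak] - scores[0],          # gain from the first layer to the peak
        "attenuation": scores[peak] - scores[-1],  # loss from the peak to the last layer
        "transient": 0 < peak < last,              # peak lies strictly inside the stack
    }


# Made-up scores for a five-layer toy model, purely for illustration.
profile = emergence_profile([0.1, 0.3, 0.6, 0.5, 0.2])
print(profile["peak_layer"], profile["transient"])
```

Comparing `relative_depth` and `attenuation` across domains is one simple way to express statements such as "taste peaks earlier and degrades faster" in numbers.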
The work extends prior observations (Marjieh et al., 2023; Abdou et al., 2021; Engels et al., 2025) by moving from output-level or single-point analyses to a systematic layer-wise characterization across multiple models and modalities. This shift from "do LLMs encode perceptual structure?" to "where and when does it emerge across depth?" is a meaningful conceptual advance.
Methodological Rigor
The methodology is straightforward and transparent: extract last-token hidden states at each layer for minimal prompts, compute cosine dissimilarity matrices, project via multidimensional scaling (MDS), and compare to human baselines using representational similarity analysis (RSA) and generalized Procrustes analysis (GPA). The approach is fully intrinsic (no probing classifiers), which is both a strength (it avoids confounds from trained probes) and a limitation (it provides no causal or mechanistic insight).
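The per-layer analysis described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the hidden states are random stand-ins rather than real residual-stream activations, classical (Torgerson) MDS is assumed because the review does not say which MDS variant was used, the rank correlation ignores tie averaging, and the Procrustes step is omitted. All function names are hypothetical.

```python
import numpy as np


def cosine_dissimilarity(H):
    """H: (n_items, d) hidden states at one layer. Returns the (n, n) matrix of 1 - cosine."""
    unit = H / np.linalg.norm(H, axis=1, keepdims=True)
    return 1.0 - unit @ unit.T


def classical_mds(D, k=2):
    """Classical (Torgerson) MDS: embed a dissimilarity matrix in k dimensions."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered squared dissimilarities
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:k]             # indices of the k largest eigenvalues
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))


def _ranks(x):
    # Ordinal ranks; ties are broken by position rather than averaged.
    return np.argsort(np.argsort(x)).astype(float)


def rsa_score(D_model, D_human):
    """Spearman-style correlation between the upper triangles of two dissimilarity matrices."""
    iu = np.triu_indices_from(D_model, k=1)
    return float(np.corrcoef(_ranks(D_model[iu]), _ranks(D_human[iu]))[0, 1])


# Stand-in data: 6 items, 16-dimensional "hidden states" at a single layer.
rng = np.random.default_rng(0)
H_layer = rng.normal(size=(6, 16))
D_layer = cosine_dissimilarity(H_layer)
coords = classical_mds(D_layer, k=2)             # 2-D layout for visual inspection
print(coords.shape, rsa_score(D_layer, D_layer))
```

In a real run, `H_layer` would be replaced by the last-token hidden states of the prompts at each layer, and `rsa_score` would be called once per layer against a human dissimilarity matrix for the domain.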
Strengths in rigor:
Weaknesses in rigor:
Potential Impact
The paper contributes to a growing literature connecting LLM representations to cognitive and perceptual structures. Its potential impact lies in several directions:
1. Mechanistic interpretability: The layer-wise profiling approach could motivate targeted mechanistic studies (e.g., circuit-level analysis of which attention heads or MLPs contribute to perceptual geometry at peak layers).
2. Cognitive science/AI alignment: If LLMs genuinely encode human-like perceptual geometry from text alone, this has implications for theories of grounding and embodiment—suggesting that statistical regularities in language may be sufficient to recover aspects of perceptual structure.
3. Practical applications: Understanding where perceptual structure peaks could inform layer selection for downstream tasks involving sensory or affective reasoning.
However, the impact is somewhat constrained by the descriptive nature of the findings. The paper identifies *that* and *where* perceptual geometry emerges but does not explain *why* or *how*, limiting its mechanistic value.
Timeliness & Relevance
This work is timely. There is significant current interest in (a) understanding the internal representations of LLMs (mechanistic interpretability), (b) geometric structure in neural networks (representation topology), and (c) the grounding problem—whether text-only models can develop perceptual-like representations. The paper sits at the intersection of these active research threads.
The transient emergence finding connects to broader observations about the "layer lifecycle" of information in transformers (lexical → semantic → task-specific), providing a perceptual analogue to this narrative.
Strengths
Limitations
Overall Assessment
This is a competent empirical study that makes an interesting observational contribution to the growing literature on geometric structure in LLM representations. The layer-wise perspective and multi-domain scope are genuine advances. However, the work is primarily descriptive, lacks critical controls, and does not sufficiently interrogate alternative explanations for the observed alignment. The findings are suggestive rather than definitive, and the paper would benefit substantially from null baselines, prompt ablations, and formal statistical testing of its central claims.
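One common form of the null baseline asked for above is a permutation (Mantel-style) test on the dissimilarity matrices: shuffle the item labels of the model matrix and check how often the shuffled alignment reaches the observed one. The sketch below is an assumption about what such a control could look like, not something the paper reports; it uses Pearson correlation for brevity, and the function names and toy matrix are hypothetical.

```python
import numpy as np


def upper(D):
    """Flatten the strict upper triangle of a square matrix."""
    return D[np.triu_indices_from(D, k=1)]


def permutation_pvalue(D_model, D_human, n_perm=1000, seed=0):
    """Return (observed correlation, one-sided permutation p-value)."""
    rng = np.random.default_rng(seed)
    observed = np.corrcoef(upper(D_model), upper(D_human))[0, 1]
    n = D_model.shape[0]
    hits = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        shuffled = D_model[np.ix_(p, p)]         # relabel items; rows and columns move together
        if np.corrcoef(upper(shuffled), upper(D_human))[0, 1] >= observed:
            hits += 1
    # The +1 terms count the observed arrangement as one of the permutations.
    return float(observed), (hits + 1) / (n_perm + 1)


# Toy check: eight items on a line, model matrix identical to the "human" matrix.
positions = np.arange(8, dtype=float)
D_toy = np.abs(positions[:, None] - positions[None, :])
r, p_value = permutation_pvalue(D_toy, D_toy)
print(r, p_value)
```

Running the same test at every layer, with a correction for multiple comparisons, would indicate whether the intermediate-layer peak stands out from chance-level alignment.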
Generated May 28, 2026
Comparison History (16)
Paper 2 likely has higher impact due to stronger novelty and broader cross-disciplinary relevance: it provides empirical evidence that human-like perceptual geometries (color, pitch, emotion, taste) emerge transiently across layers in multiple transformer models, offering a concrete, testable phenomenon for mechanistic interpretability and cognitive/computational neuroscience. Its findings can inform representation analysis, model comparison, and potentially multimodal alignment. Paper 1 is valuable as a unifying taxonomy and design-pattern synthesis, but is primarily conceptual/organizational with less direct new empirical or algorithmic contribution, making its near-term scientific impact comparatively narrower.
Paper 1 investigates fundamental questions about the nature of LLM representations and their connection to human cognition, offering novel insights into mechanistic interpretability. This theoretical depth provides broad, interdisciplinary scientific value across AI and cognitive science. While Paper 2 offers a valuable and rigorous engineering solution for cost-efficiency, Paper 1's discoveries about intrinsic geometric structures have a higher potential for foundational scientific impact and future theoretical breakthroughs.
Paper 2 addresses a critical and timely security vulnerability in LLM agents—persistent cross-interaction sleeper attacks—which has immediate practical implications for AI safety as LLM agents are increasingly deployed. It introduces a novel threat formalization, a comprehensive benchmark (1,896 instances), and evaluates across seven LLMs, providing actionable insights for the safety community. Paper 1, while intellectually interesting in studying perceptual geometry in LLMs, is more observational and narrower in its impact scope, primarily contributing to interpretability research without clear downstream applications.
Paper 2 likely has higher scientific impact due to stronger novelty and broader cross-disciplinary relevance: it connects LLM internal geometry to multiple human perceptual domains and provides layer-wise, mechanistic insights applicable to interpretability, cognitive science, and representation learning. Its findings (transient emergence profiles across models/domains) can inform theory and future analyses beyond NLP benchmarks. Paper 1 is a useful, timely optimization to MLM pretraining (entropy-based masking/self-masking) with practical gains, but it is closer to incremental objective engineering with narrower breadth and potentially faster commoditization.
Paper 2 addresses a fundamental scientific question regarding how human perceptual geometry emerges in LLMs from text alone. Its insights into mechanistic interpretability and AI cognition offer broader interdisciplinary impact across ML theory, cognitive science, and AI alignment. While Paper 1 provides a valuable practical framework for resource-constrained agents, Paper 2's theoretical contributions are likely to inspire a wider range of foundational research and long-term citations.
Paper 2 has higher potential scientific impact due to greater novelty and cross-disciplinary breadth: it links emergent LLM representation geometry to human perceptual organization across multiple modalities, offering mechanistic insight relevant to cognitive science, neuroscience, interpretability, and representation learning. Its findings (transient, layer-wise emergence/attenuation profiles) can generalize across models and inspire new analysis tools. Paper 1 is timely and useful for practitioners (efficiency benchmarking), but is primarily an engineering/evaluation framework with more incremental conceptual contribution and narrower scientific reach.
Paper 2 (Ratchet) addresses a practical bottleneck in self-evolving LLM agents with a concrete, actionable solution showing substantial empirical gains (+32.8pp on MBPP+, transferable to SWE-bench). It offers immediate real-world applicability for improving LLM agent systems, includes rigorous ablations identifying minimal working components, and provides theoretical guarantees. Paper 1 offers interesting cognitive science insights about perceptual geometry in LLMs but is primarily observational/analytical with narrower immediate applications. Ratchet's practical impact on the rapidly growing LLM agent ecosystem gives it broader and more timely significance.
Paper 2 addresses a critical bottleneck in LLM development by proposing a scalable RL framework that improves reasoning without relying on stronger teacher models. Given the current focus on scaling reasoning capabilities (e.g., test-time compute and RL), this methodology offers high practical utility and broad impact across AI development. Paper 1 offers valuable theoretical insights into LLM interpretability and cognitive alignment, but Paper 2's direct application to enhancing model performance makes its potential real-world impact significantly higher.
Paper 2 addresses a fundamental question about how LLMs develop human-like perceptual representations despite lacking sensory input, with broad implications across cognitive science, AI interpretability, and neuroscience. Its findings about transient geometric structure emerging in intermediate layers provide novel mechanistic insights into transformer representations. While Paper 1 makes solid contributions to enzyme-reaction retrieval with practical bioinformatics applications, Paper 2's cross-disciplinary relevance (linguistics, cognitive science, AI alignment, mechanistic interpretability) and its surprising findings about grounded cognition emerging from text-only training give it higher potential for broad scientific impact.
Paper 1 addresses a fundamental scientific question about the relationship between language model representations and human perceptual organization, revealing that perceptual geometry emerges transiently in intermediate layers despite no perceptual training. This offers deep insights into both AI and cognitive science, with broad interdisciplinary impact. Paper 2, while technically solid with strong empirical results, is primarily an engineering contribution to prompt optimization—an area with many competing methods and rapid obsolescence. Paper 1's findings about emergent perceptual structure have longer-lasting scientific significance and wider relevance across fields.
Paper 2 offers fundamental scientific insights into the internal representations of LLMs, bridging artificial intelligence with human cognitive science. By revealing how and where perceptual geometry emerges, it significantly advances mechanistic interpretability. While Paper 1 provides a strong, practical engineering solution for multi-agent systems, Paper 2 has a broader scientific impact by addressing the 'black box' nature of neural networks and exploring foundational questions about how language models encode worldly concepts without direct sensory grounding.
Paper 1 is likely to have higher impact because it introduces a concrete, actionable evaluation problem for RAG (citation laundering) plus a benchmark (FORCEBENCH) with an operational metric (monotonicity violation rate) and a released pipeline—making it immediately usable for improving deployed systems. The work is timely given widespread RAG adoption, has clear real-world applications in reliability/safety, and can influence evaluation practice across NLP/IR. Paper 2 is novel and cross-disciplinary for interpretability/cognitive alignment, but its applications are more indirect and may have slower translational uptake.
Paper 2 provides fundamental scientific insights into how human-like perceptual representations emerge in text-only models, bridging AI interpretability and cognitive science. While Paper 1 offers a valuable benchmark for agent development, Paper 2's findings on the transient nature of perceptual geometry in neural representations offer deeper theoretical implications for understanding LLM cognition and cross-modal learning, giving it broader interdisciplinary impact.
Paper 2 has higher likely impact due to timeliness and direct deployment relevance: it identifies a concrete, under-measured safety failure mode (brittle safety), proposes a general evaluation protocol (context-flips), tests broadly across 12 models with controls, and connects findings to actionable mitigation (state-aware validation) with released benchmarks/probes. This combination of methodological contribution, real-world applicability, and breadth across aligned LMs and safety engineering suggests wider uptake than Paper 1’s primarily interpretability-focused insight, which is novel but more exploratory and less immediately actionable.
Paper 1 offers a more novel and broadly impactful contribution by revealing that human perceptual geometry transiently emerges in LLM representations despite purely textual training. This finding has deep implications for cognitive science, AI interpretability, and our understanding of how language encodes perceptual knowledge—spanning multiple fields. Paper 2, while practically useful for clinical NLP, is more incremental, improving existing RAG-RL methods with a domain-specific reward engineering approach. Paper 1's fundamental insight about the relationship between language models and human perception is likely to inspire more diverse follow-up research.
Paper 1 is more novel and broadly impactful: it probes emergent, human-aligned perceptual geometry across multiple modalities and models, offering a general mechanistic interpretability framework with implications for cognitive science, representation learning, and model evaluation. Its findings (transient layer-wise emergence/attenuation) suggest new hypotheses about transformer computation and could influence analysis tools and downstream alignment work. Paper 2 is methodologically careful but narrower—an audit of a specific decoding/accounting mechanism with limited cross-field reach and more incremental practical implications.