LACO: Adaptive Latent Communication for Collaborative Driving
Tianhao Chen, Yuheng Wu, Dongman Lee
Abstract
Collaborative driving aims to improve safety and efficiency by enabling connected vehicles to coordinate under partial observability. Recent approaches have evolved from sharing visual features for perception to exchanging language-based reasoning through foundation models for behavioral coordination. Though communicating in language provides intuitive information, it introduces two challenges: high latency caused by autoregressive decoding and information loss caused by compressing rich internal representations into discrete tokens. To address these challenges, we analyze latent communication in collaborative driving under inherent limitations of multi-agent settings. Our analysis reveals agent identity confusion, where direct fusion of latent states entangles decision representations across vehicles. Motivated by this, we propose LACO, a training-free LAtent COmmunication paradigm that seamlessly adapts pretrained driving models to collaborative settings. LACO introduces Iterative Latent Deliberation (ILD) for latent reasoning, Cross-Horizon Saliency Attribution (CHSA) for communication-efficient information selection, and Structured Semantic Knowledge Distillation (SSKD) to stabilize ego-centric decision making. Closed-loop experiments in CARLA show that LACO notably reduces communication and inference latency while maintaining strong collaborative driving performance.
AI Impact Assessments
Scientific Impact Assessment: LACO: Adaptive Latent Communication for Collaborative Driving
1. Core Contribution
LACO introduces a paradigm shift in collaborative driving communication by replacing language-based or visual-token-based inter-vehicle communication with selective KV cache exchange from transformer-based Vision-Language-Action (VLA) models. The paper identifies a critical phenomenon—agent identity confusion—where naive full-depth KV cache fusion causes the ego vehicle to over-attend to a collaborator's deep-layer representations, effectively hijacking its control policy. This is a genuine and well-characterized failure mode that had not been previously documented in the collaborative driving literature.
The framework comprises three components: (1) Iterative Latent Deliberation (ILD), which performs latent reasoning through iterative forward passes without autoregressive token generation; (2) Cross-Horizon Saliency Attribution (CHSA), which prunes spatially redundant tokens based on attention-derived saliency scores; and (3) Structured Semantic Knowledge Distillation (SSKD), which restricts communication to shallow transformer layers where representations remain globally informative but not yet entangled with ego-specific control synthesis.
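The selection logic described for CHSA and SSKD can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `select_shared_cache`, its parameters, the rounding-up rule, and the assumption that saliency arrives as one score per token position are all hypothetical and chosen only for illustration.

```python
import math

def select_shared_cache(saliency, num_layers, retention_rate=0.3, depth_fraction=0.1):
    """Pick which layers and token positions of a KV cache to transmit.

    saliency: one non-negative score per token position (hypothetical input,
    e.g. attention mass aggregated over heads).
    """
    if not 0 < retention_rate <= 1 or not 0 < depth_fraction <= 1:
        raise ValueError("rates must lie in (0, 1]")
    # Shallow-layer restriction: keep only the first layers, at least one.
    n_layers = max(1, math.ceil(num_layers * depth_fraction))
    layers = list(range(n_layers))
    # Saliency pruning: keep the highest-scoring tokens, at least one.
    n_tokens = max(1, math.ceil(len(saliency) * retention_rate))
    ranked = sorted(range(len(saliency)), key=lambda i: saliency[i], reverse=True)
    tokens = sorted(ranked[:n_tokens])  # restore positional order
    return layers, tokens

# Example with made-up saliency scores for 8 tokens and a 32-layer model.
scores = [0.05, 0.40, 0.10, 0.02, 0.25, 0.08, 0.07, 0.03]
layers, tokens = select_shared_cache(scores, num_layers=32)
```

Rounding up rather than down is an arbitrary choice here; it only guarantees that at least one layer and one token are always shared.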
The training-free nature is a significant practical advantage—it allows direct deployment on pretrained VLA models without fine-tuning, which is critical given the cost of training large driving models.
2. Methodological Rigor
The motivation study is the paper's strongest methodological contribution. The attention entropy analysis revealing a U-shaped trajectory across layers (global parsing → ego-centric contraction → control synthesis resurgence) provides principled justification for shallow-layer-only fusion. The quantitative characterization of spatial attention sparsity (~30% of tokens capturing most attention mass) similarly grounds the CHSA design.
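For readers unfamiliar with the two diagnostics mentioned here, the following sketch shows one common way to compute them for a single attention distribution. It assumes Shannon entropy in nats and a simple greedy coverage count; the function names and the example weights are hypothetical, and the paper may define or aggregate these quantities differently.

```python
import math

def attention_entropy(weights):
    """Shannon entropy (in nats) of one attention distribution."""
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log(p) for p in probs)

def tokens_for_mass(weights, target_mass=0.9):
    """Smallest fraction of tokens whose attention sums to target_mass."""
    total = sum(weights)
    covered = 0.0
    # Walk tokens from most to least attended until the target is reached.
    for count, w in enumerate(sorted(weights, reverse=True), start=1):
        covered += w / total
        if covered >= target_mass:
            return count / len(weights)
    return 1.0

# Made-up attention weights over 5 tokens.
example = [50, 30, 10, 5, 5]
fraction = tokens_for_mass(example, target_mass=0.85)
```

High entropy indicates attention spread broadly over tokens, while low entropy indicates attention concentrated on a few; computing it per layer is what would expose a U-shaped trajectory of the kind described.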
However, several concerns arise:
The ablation studies are reasonably thorough, covering each component, distillation depth, retention rate, and latent step count. The finding that 10% depth and 30% retention rate are broadly optimal across architectures provides useful practical guidance.
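To give a rough sense of what 10% depth and 30% retention mean for payload size, the arithmetic below compares a pruned KV cache slice against an unpruned full-depth cache. The model dimensions (30 layers, 1000 tokens, 8 KV heads, head dimension 128, 2 bytes per value) are hypothetical, and the comparison is against full KV cache sharing, not the visual-sharing baseline the paper reports.

```python
def kv_cache_bytes(num_layers, num_tokens, kv_heads, head_dim, bytes_per_value=2):
    """Size of a KV cache slice: keys and values for every layer and token."""
    return 2 * num_layers * num_tokens * kv_heads * head_dim * bytes_per_value

# Hypothetical model dimensions, chosen only to make the arithmetic concrete.
full = kv_cache_bytes(num_layers=30, num_tokens=1000, kv_heads=8, head_dim=128)
# 10% of the layers and 30% of the tokens.
shared = kv_cache_bytes(num_layers=3, num_tokens=300, kv_heads=8, head_dim=128)
ratio = shared / full  # 0.1 * 0.3 = 0.03 of the full cache
```

Because the two reductions multiply, the shared slice under these assumptions is about 3% of the full cache, regardless of the head count or head dimension.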
3. Potential Impact
Autonomous driving: If latent communication proves robust in practice, it could fundamentally change V2V communication protocols. The 20× latency reduction over language-based methods and 40-90% bandwidth reduction over visual sharing are compelling for real-time safety-critical applications.
Multi-agent AI systems: The agent identity confusion phenomenon and the shallow-vs-deep analysis have broader implications for any multi-agent system built on transformer architectures. The insight that deep representations become identity-entangled could inform design choices in multi-robot coordination, distributed AI systems, and federated inference.
Practical deployment: The training-free property significantly lowers the barrier to adoption—fleet operators could potentially upgrade existing single-vehicle VLA systems to collaborative ones without retraining.
However, the paper doesn't address several practical concerns: heterogeneous model architectures between vehicles, privacy implications of sharing internal representations, adversarial robustness of KV cache communication, and regulatory considerations.
4. Timeliness & Relevance
The paper is highly timely. VLA models for driving are rapidly maturing (ORION, SimLingo, LMDrive all from 2024-2025), and the question of how to enable collaboration among these models is nascent. Language-based approaches like LangCoop appeared very recently (2025), making LACO's critique of language communication overhead and its latent alternative immediately relevant.
The connection to concurrent work on latent-space communication in LLMs (KV cache sharing, latent reasoning) positions this work at the intersection of two active research fronts.
5. Strengths & Limitations
Key Strengths:
Notable Limitations:
Overall Assessment
LACO presents a well-motivated and practically significant contribution to collaborative autonomous driving. The identification and analysis of agent identity confusion is the paper's most important intellectual contribution, while the complete framework demonstrates strong empirical results with meaningful efficiency gains. The work would benefit from larger-scale experiments, real-world considerations, and formal analysis, but it represents a solid first step toward latent-space V2V communication for driving.
Generated May 22, 2026
Comparison History (18)
TerminalWorld introduces a novel scalable benchmarking methodology that addresses a significant gap in evaluating AI agents on real-world terminal tasks. Its automated data engine processing 80K+ recordings, coverage of 1,280 unique commands across 18 categories, and demonstration that current frontier models achieve only 62.5% pass rate provide high-impact insights for the rapidly growing AI agent community. The benchmark fills a distinct niche (weak correlation with existing benchmarks) and is designed to scale with evolving practices. While LACO presents solid technical contributions to collaborative driving, its scope is narrower, constrained to a specific simulation environment, and builds incrementally on existing paradigms.
Paper 2 has higher potential scientific impact due to its novel, technical contribution (training-free latent communication with ILD/CHSA/SSKD) targeting a high-stakes real-world domain (collaborative autonomous driving), with clear claims on reducing latency and improving coordination validated in closed-loop CARLA experiments. Its ideas could generalize to multi-agent robotics and V2X communication. Paper 1 is timely and valuable for AI education/benchmarking, but its impact is more niche (pedagogy + humanities/social-science QA), with less methodological depth and narrower downstream technological adoption potential.
Paper 1 addresses AI safety and LLM jailbreaking, a critically important field with broad implications for the safe deployment of foundation models. Its novel theoretical framework for understanding refusal suppression provides fundamental insights that could significantly influence future alignment strategies. While Paper 2 offers strong contributions to collaborative autonomous driving, the widespread use of LLMs and the urgent need for robust safety mechanisms give Paper 1 a broader and more immediate scientific impact.
Paper 1 is more methodologically innovative and timely for autonomous systems: it proposes a concrete, training-free latent-communication framework (ILD, CHSA, SSKD) addressing clear bottlenecks (latency, information loss, identity confusion) and validates in closed-loop CARLA, implying near-term deployment relevance for connected AVs and multi-agent robotics. Its ideas may generalize to other multi-agent settings (robot swarms, decentralized inference), broadening impact. Paper 2 is important and applicable to education/policy, but likely less novel methodologically and more context-dependent; rigor is hard to judge from abstract alone.
LACO addresses a timely and high-impact problem at the intersection of autonomous driving, multi-agent systems, and foundation models. It proposes a practical, training-free framework with clear real-world applications in connected autonomous vehicles, validated through closed-loop experiments. The work bridges latent communication, collaborative perception, and efficient inference—topics of broad interest across AI, robotics, and transportation. Paper 1, while theoretically solid, addresses a more niche topic in logic programming/ASP modularity with a narrower audience and fewer immediate real-world applications.
LACO addresses a timely and high-impact problem at the intersection of autonomous driving, multi-agent systems, and foundation models. It introduces novel concepts (agent identity confusion, training-free latent communication) with broader implications for multi-agent AI beyond driving. The work bridges foundation models with collaborative robotics, a rapidly growing field. Paper 1, while solid, applies existing DRL techniques (PPO, MLPs) to a well-studied scheduling problem with incremental improvements over dispatching rules, offering less novelty and narrower impact potential.
Paper 1 addresses a broader, more fundamental challenge in survey research methodology with a comprehensive five-stage framework applicable across many disciplines. It introduces novel methodological contributions (A-TLM, theory-constrained knowledge graphs, subgroup-stratified bias auditing) with rigorous evaluation against established baselines. Its impact spans social sciences, disaster management, and AI methodology. Paper 2, while technically solid, addresses a narrower problem in collaborative autonomous driving with incremental improvements. Paper 1's methodological contributions and cross-disciplinary relevance give it higher potential impact.
Paper 1 presents a massive-scale foundation model trained on data from 5 million participants, addressing a critical bottleneck in personalized healthcare. Its broad applicability across 35 health prediction tasks, combined with real-world clinical validation, suggests a profound impact on health tech and medical AI. While Paper 2 offers valuable methodological improvements for collaborative driving, its evaluation is limited to a simulator (CARLA), making Paper 1's real-world implications, unprecedented scale, and interdisciplinary reach significantly more impactful.
Paper 1 is likely higher impact due to its cross-domain benchmark contribution with explicit baselines, ablations/null controls, frozen evaluation, and provenance tooling—an infrastructure/result type that can shape evaluation practices broadly across scientific AI. It also offers generalizable insights (when coordination helps vs. not) applicable to many agentic systems beyond any single domain. Paper 2 is timely and practically relevant for connected autonomous driving, but its impact is narrower (domain-specific) and the abstract provides fewer details on methodological rigor and generalizability beyond CARLA experiments.
Paper 2 likely has higher impact due to broader cross-field relevance (generalizable evidence-grounded literature mapping + audited LLM hypothesis generation) and immediate real-world utility for accelerating discovery workflows beyond nanomedicine. Its evaluation includes retrospective benchmarks and human assessment, supporting methodological rigor for a decision-support tool. Paper 1 is timely and technically novel for multi-agent autonomous driving, but its impact is narrower (collaborative driving stacks/CARLA) and depends more on deployment constraints and safety validation. Overall, Paper 2’s platform-like applicability suggests larger scientific and practical reach.
Paper 2 addresses critical bottlenecks (latency and information loss) in collaborative autonomous driving, a field with massive real-world safety and efficiency implications. Its proposed training-free latent communication paradigm demonstrates strong methodological rigor through closed-loop experiments. In contrast, Paper 1 relies on a very small case study (a single speech with 51 segments) to evaluate political emotion analysis, limiting its methodological generalizability and breadth of impact compared to the foundational multi-agent communication advancements in Paper 2.
LACO addresses a fundamental challenge in collaborative autonomous driving—efficient multi-agent communication—with a novel training-free framework that combines latent communication, saliency-based information selection, and knowledge distillation. It has broader impact potential across autonomous systems, multi-agent AI, and robotics. Paper 1 is a narrow case study on a single political speech with limited generalizability, comparing existing models without introducing new methods. Paper 2 offers stronger methodological contributions, clearer real-world applications in autonomous driving safety, and greater relevance to the rapidly growing field of multi-agent coordination.
LACO presents a novel, concrete technical framework addressing fundamental challenges in collaborative autonomous driving—a rapidly growing field with massive real-world applications. It introduces three specific technical innovations (ILD, CHSA, SSKD) validated through closed-loop experiments, offering immediate practical impact. Paper 1 presents a vision/prototype for AOP-Wiki data modernization that, while valuable for toxicology, is more incremental (third iteration), narrower in audience, and relies on future implementation rather than demonstrated results. Paper 2's contributions to multi-agent AI communication have broader cross-disciplinary relevance.
Paper 2 addresses a fundamental inefficiency (idle time during tool execution) ubiquitous in LLM-based agentic workflows, offering broad applicability across numerous domains. While Paper 1 presents an innovative approach to collaborative driving, its impact is largely confined to autonomous vehicles. The generalizability of IdleSpec to various complex, long-horizon tasks and its significant performance gains on standard AI agent benchmarks give it a higher potential for widespread scientific and practical impact in the rapidly growing field of foundation model agents.
LACO addresses a fundamental challenge in collaborative autonomous driving—efficient latent communication between connected vehicles—with a novel training-free paradigm that reduces latency and bandwidth while maintaining performance. It tackles core technical problems (agent identity confusion, communication efficiency) with principled solutions validated in closed-loop simulation. Paper 2, while practical, primarily orchestrates existing LLM agents for visualization generation, representing incremental engineering over rapidly commoditizing AI coding capabilities. LACO's contributions to multi-agent communication and autonomous driving have broader and more lasting scientific impact across robotics, multi-agent systems, and transportation.
Paper 1 likely has higher impact: it introduces a broadly applicable test-time scaling framework (skill evolution from rollout traces + verifier-guided dense feedback) that addresses a key bottleneck in long-context, verifiable coding/EDA agents without fine-tuning or weight updates. The method is tightly tied to rigorous pass/fail verification, shows breakthroughs on previously unsolved tasks, and could generalize to many “verifier-in-the-loop” domains beyond hardware. Paper 2 is timely and useful for collaborative driving, but its impact is narrower and may depend more on simulation-to-reality transfer.
Paper 1 addresses a fundamental and broadly applicable problem—systematic diagnosis of LLM agent failures at scale—which is highly relevant given the rapid deployment of LLM agents across industries. It formalizes a new problem (corpus-level trace diagnostics), introduces a principled multi-agent architecture, and demonstrates strong empirical results (30.4pp improvement). The breadth of impact is larger since it applies to any LLM agent system, not just autonomous driving. Paper 2, while technically sound, addresses a narrower domain (collaborative driving) with incremental advances over existing communication paradigms.
LACO addresses a timely and high-impact problem at the intersection of collaborative autonomous driving, foundation models, and multi-agent communication. Its training-free paradigm for adapting pretrained driving models to collaborative settings is highly novel, tackling fundamental challenges (latency, information loss) in V2V communication. The breadth of impact spans autonomous driving, multi-agent systems, and efficient communication. Paper 2, while solid with strong VRPTW results, addresses a more established optimization problem with incremental improvements. LACO's novelty in latent communication and its relevance to the rapidly growing autonomous driving field give it higher potential impact.