Jie Zhao, Xianqi Dai, Jie Feng, Huandong Wang, Yong Li
Dynamic origin-destination (OD) flow generation seeks to synthesize realistic mobility dynamics from temporal context alone, without relying on historical OD observations. A key challenge is to translate semantic temporal signals into temporally coherent OD patterns while preserving the inherent spatial heterogeneity of urban regions. We propose DynaOD, a semantic-driven framework that models temporal dynamics through two complementary perspectives: discrete directional trends that characterize qualitative shifts in urban activity patterns, and continuous temporal evolution that captures how such shifts unfold over time. By jointly encoding these temporal semantics, the framework constructs time-varying region representations that condition pretrained static OD generators in a lightweight and plug-and-play fashion. This modular design further supports scalable deployment and cross-city transferability. Extensive experiments on large-scale real-world datasets show that our method consistently outperforms representative baselines in both predictive accuracy and distributional fidelity. Code is publicly available at https://github.com/csjiezhao/DynaOD.
DynaOD addresses the problem of dynamic origin-destination (OD) flow generation: synthesizing realistic, time-varying mobility matrices from temporal context alone, without requiring historical OD observations. This sets the task apart from OD *prediction* (which extrapolates from past OD data) and static OD *generation* (which produces time-aggregated matrices).
The core novelty lies in a two-stage temporal semantic modeling approach: (1) discrete directional controls inferred by LLMs that capture qualitative trends (increase/stable/decrease) in urban attributes under given temporal contexts, and (2) ShapeNet, a differentiable shape generator that converts these discrete signals into continuous, smooth temporal feature dynamics. These time-varying features then modulate pretrained static OD generators in a plug-and-play fashion. The framework further includes a retrieval-based shape memory (ShapeMem) for cross-city generalization and an LLM distillation strategy for efficient deployment.
Strengths in design: The decomposition of temporal semantics into discrete directional signals and continuous evolution is well-motivated and cleanly executed. The use of LLMs for commonsense-driven directional inference is a creative choice that avoids the need for explicit temporal supervision. The multiplicative modulation scheme (Eq. 2) is simple yet effective, and keeping the OD generator frozen enables modularity.
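To make the discrete-to-continuous idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a fixed logistic ramp as a stand-in for the learned ShapeNet, and it assumes the multiplicative modulation takes the form x_t = x * (1 + amplitude * direction * shape_t); the exact form of Eq. 2 is not reproduced in this review. The names `DIRECTION`, `smooth_shape`, `modulate`, and the `amplitude` parameter are all hypothetical.

```python
import math

# Hypothetical mapping from an LLM-inferred trend label to a signed direction.
DIRECTION = {"increase": 1.0, "stable": 0.0, "decrease": -1.0}


def smooth_shape(num_steps, steepness=6.0):
    """Stand-in for a learned shape generator: a logistic ramp from ~0 to ~1."""
    if num_steps == 1:
        return [1.0]
    shape = []
    for t in range(num_steps):
        u = t / (num_steps - 1)  # normalized position in [0, 1]
        shape.append(1.0 / (1.0 + math.exp(-steepness * (u - 0.5))))
    return shape


def modulate(static_features, trends, num_steps, amplitude=0.2):
    """Return one feature vector per time step.

    Assumed form: x_t = x * (1 + amplitude * direction * shape[t]).
    The static features themselves are never changed, which mirrors the
    idea of leaving the pretrained OD generator frozen.
    """
    shape = smooth_shape(num_steps)
    return [
        [x * (1.0 + amplitude * DIRECTION[tr] * s)
         for x, tr in zip(static_features, trends)]
        for s in shape
    ]


# Toy region attributes and trend labels, invented for illustration only.
features = [100.0, 50.0, 80.0]
trends = ["increase", "stable", "decrease"]
dynamic = modulate(features, trends, num_steps=7)
```

In this sketch a "stable" attribute stays at its static value for every time step, while "increase" and "decrease" attributes drift smoothly away from it. A frozen static generator could then be called once per time step on each row of `dynamic`.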
Experimental concerns: The evaluation is conducted on a single dataset (U.S. county-level mobility from January 2019), which, while large-scale (500 counties), represents a limited geographic and temporal scope. The temporal window is only one month, raising questions about how well the framework handles seasonal variation or longer-term dynamics. The strict unseen-city/unseen-date split protocol is commendable, but the paper does not test on truly different countries or fundamentally different urban morphologies.
The baselines are reasonable but somewhat dated — gravity models, random forests, SVR, and GBRT are classical methods. The deep learning baselines are limited to DGM and the generative models NetGAN and WeDAN. More recent spatio-temporal forecasting models (even if adapted) would strengthen the comparison. The authors do acknowledge that baselines are augmented with temporal features for fair comparison, but this augmentation may not fully represent what modern temporal models could achieve.
Statistical reporting: The paper lacks confidence intervals, variance measures, or significance tests. With only a single dataset split, it's difficult to assess robustness of the reported improvements.
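One way such uncertainty could be reported is a paired bootstrap over held-out cities. The sketch below is an illustration under stated assumptions rather than a procedure from the paper: it assumes per-city metric scores are available for two methods, and the score lists are made-up toy values, not reported results. The function name `paired_bootstrap_ci` is hypothetical.

```python
import random


def paired_bootstrap_ci(scores_a, scores_b, num_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean paired difference (a - b)."""
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    means = []
    for _ in range(num_resamples):
        # Resample cities with replacement, keeping each pair together.
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lo_idx = int((alpha / 2) * num_resamples)
    hi_idx = int((1 - alpha / 2) * num_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Toy per-city scores, invented for illustration (not from the paper).
method_scores = [0.51, 0.47, 0.50, 0.49, 0.52, 0.46]
baseline_scores = [0.38, 0.35, 0.39, 0.36, 0.37, 0.34]
lower, upper = paired_bootstrap_ci(method_scores, baseline_scores)
```

An interval that excludes zero would support the claim that the improvement is not an artifact of one particular split; repeating the evaluation over several random city and date splits would serve a similar purpose.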
The work sits at the intersection of urban computing, generative modeling, and LLM applications. The practical impact is potentially significant.
However, the impact is somewhat constrained by the reliance on census-tract-level U.S. data. Applicability to different spatial granularities, non-U.S. cities, or real-time applications is not demonstrated. The daily temporal granularity is also a limitation acknowledged by the authors — many transportation applications require hourly or sub-hourly OD flows.
The paper is timely in several respects.
The work responds to a real gap: most OD generation work is static, while real-world mobility is inherently dynamic. The formulation of context-conditioned generation (rather than history-driven prediction) is a useful conceptual contribution.
The case study (Figure 6) is illustrative but limited to a single tract over 7 days. More systematic visualization of temporal dynamics across functionally diverse regions would strengthen interpretability claims. The 34.7% CPC improvement is substantial, but the absolute CPC of 0.492 suggests there remains significant room for improvement in the task overall.
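For readers unfamiliar with the metric, CPC (Common Part of Commuters) is commonly defined as twice the overlapping flow divided by the total predicted plus observed flow, so it ranges from 0 (no overlap) to 1 (identical matrices). The sketch below assumes this standard definition; whether the paper applies any variant is not stated in this review, and the matrices are toy values.

```python
def cpc(pred, true):
    """Common Part of Commuters: 2 * sum(min(p, t)) / (sum(p) + sum(t))."""
    flat_pred = [v for row in pred for v in row]
    flat_true = [v for row in true for v in row]
    total = sum(flat_pred) + sum(flat_true)
    if total == 0:
        return 0.0  # assumed convention when both matrices are empty
    overlap = sum(min(p, t) for p, t in zip(flat_pred, flat_true))
    return 2.0 * overlap / total


# Toy 3x3 OD matrices (rows = origins, columns = destinations), invented here.
true_od = [[0, 10, 5], [8, 0, 2], [4, 6, 0]]
pred_od = [[0, 7, 9], [8, 0, 0], [1, 6, 3]]
score = cpc(pred_od, true_od)  # overlap 27, totals 34 + 35, so 54 / 69
```

Read this way, a CPC of 0.492 means that roughly half of the generated flow volume coincides with the observed flow. If the 34.7% figure is a relative gain over the strongest baseline (an assumption, since the review does not say), the implied baseline CPC would be about 0.492 / 1.347 ≈ 0.365.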
The paper is generally well-written and clearly structured, though the methodology section could benefit from more formal specification of the LLM prompting strategy and the ShapeNet architecture details.
Generated Jun 9, 2026
Paper 1 offers a more novel methodological contribution: a discrete-to-continuous temporal semantic modeling framework that conditions pretrained OD generators in a modular, plug-and-play way, enabling scalable deployment and cross-city transfer. Its applications (urban planning, transportation, mobility simulation) are broad and societally relevant, and the approach may generalize to other spatiotemporal generative tasks. Paper 2 is a solid applied study but primarily combines established techniques (instruction tuning, LoRA, NEFTune) for a narrow domain/dataset, making its incremental scientific novelty and cross-field impact comparatively lower.
Paper 1 tackles urban mobility and dynamic origin-destination flow generation, offering a scalable framework with cross-city transferability. Its applications in urban planning and traffic management have broad, high-stakes societal impact. While Paper 2 presents a highly innovative cross-disciplinary approach (adapting autonomous driving models to sports analytics), its primary domain is limited to football. Paper 1's generalizable methodology for spatial-temporal data and its wider implications for smart city infrastructure give it a higher potential for broad scientific and real-world impact.
Paper 2 addresses a fundamental scalability bottleneck in neurosymbolic AI (NeurASP), a growing field at the intersection of neural networks and symbolic reasoning. Achieving orders-of-magnitude speedups enables previously intractable tasks, broadening the applicability of neurosymbolic methods. This has wider cross-field impact (AI, logic programming, explainable AI) and addresses a core limitation that many researchers face. Paper 1, while technically sound, addresses a more niche problem (OD flow generation) with narrower applicability primarily in urban computing/transportation.
Paper 2 addresses the critical challenge of dynamic OD flow generation without historical data, overcoming major data scarcity and privacy barriers in urban mobility. Its cross-city transferability and plug-and-play design offer broader impact across urban planning and transportation modeling compared to Paper 1's efficiency-focused data interpolation approach.
Paper 2 is more likely to have higher scientific impact due to timeliness and broader cross-field relevance: it targets standardization and interoperability for “Agentic AI” via a declarative protocol approach, and demonstrates integration with an industry-backed standard (Google-led UCP), increasing adoption potential. Its contributions (formal specification + working interop) can influence multiagent systems, programming languages, and applied AI engineering. Paper 1 is technically solid and useful for urban computing, but its impact is more domain-specific and incremental (conditioning pretrained OD generators) relative to a standards-aligned, broadly applicable protocol framework.
Paper 1 addresses a fundamental, timely challenge in education—how to teach and assess productive AI reasoning skills—proposing a novel competency model (CoRe-3) with theoretical grounding and testable propositions. Its breadth of impact spans education, AI literacy, assessment design, and cognitive science, affecting millions of students and educators. Paper 2 makes a solid technical contribution to urban mobility modeling but addresses a narrower domain. The timeliness and cross-disciplinary relevance of Paper 1, given the rapid adoption of generative AI in education, gives it substantially higher potential impact.
While Paper 1 offers strong practical applications for urban mobility, Paper 2 (AFSAT) addresses a foundational and broadly applicable problem: Boolean Satisfiability (SAT). SAT solvers are critical tools across computer science, hardware verification, and AI planning. By successfully engineering a GPU-accelerated solver using continuous local search, Fourier transforms, and modern frameworks (JAX), AFSAT introduces significant methodological innovation. This ability to massively parallelize SAT solving gives it a much broader potential scientific impact across multiple disciplines compared to the domain-specific focus of Paper 1.
Paper 1 addresses a critical and highly timely challenge in the rapidly expanding field of autonomous AI agents: robust behavioral evaluation. By proposing a generalizable entropy-based framework, its impact spans across multiple domains of AI research and development. Paper 2, while methodologically rigorous and valuable for urban computing, focuses on a much narrower subfield (spatio-temporal mobility modeling). Thus, Paper 1 has a significantly broader potential scientific impact and relevance to the wider AI community.
Paper 2 addresses a broader and more impactful problem—enabling generalizable reasoning in medical AI agents through self-evolving skill memory. Its contributions span multiple fields (AI, healthcare, agent systems) and tackle fundamental challenges in continual learning without weight updates. The skill-based memory framework with closed-loop governance is more novel and generalizable than Paper 1's domain-specific OD flow generation. Medical AI applications have enormous real-world impact potential, and the framework's backbone-agnostic, transferable design enhances its breadth of influence across the research community.
Paper 2 is more novel and broadly impactful: it introduces a general, lightweight adapter layer that grounds coding agents to operate complex scientific simulators, with self-evolution and demonstrated transfer across multiple major simulators (GEOS, OpenFOAM, LAMMPS). The real-world application potential is high (large productivity gains for domain scientists) and timely given rapid adoption of AI agents in scientific workflows. Paper 1 is solid and practical for urban mobility modeling but is narrower in domain scope and impact breadth than a general agent-to-simulator interface framework.