First-Order Trajectory Matching: Fast Ensemble Predictions of Chaotic, Turbulent, Stochastic Systems

Shreya Jha, Timo Schorlepp, Nicholas Geissler, Jules Berman, Benjamin Peherstorfer

Jun 9, 2026 · arXiv:2606.11138v1

cs.LG · math.NA

#595 of 5669 · cs.LG

Tournament Score: 1497 ± 45

Win Rate: 78%

Rating: 7.8 / 10

Significance 8 · Rigor 7.5 · Novelty 8 · Clarity 8.5

Abstract

We introduce First-Order Trajectory Matching (FTM), a surrogate-modeling method that learns the first-order local transport of probability mass from trajectories of stochastic systems. By matching the symmetric first-order motion of trajectories, FTM learns the probability current velocity, whose flow preserves time marginals to match ensemble averages, while also capturing current-like trajectory quantities such as fluxes, circulations, and barrier-crossing currents. FTM learns the current velocity directly from trajectories, avoiding drift, diffusion, and score estimation. Our stability analysis separates discretization error from sampling variance and shows that the one-step simulation-free FTM loss is stable when temporal resolution and sample size are properly balanced. Across stochastic dynamical systems and PDE examples, we empirically demonstrate that FTM provides trajectory-aware ensemble predictions at low, deterministic-rollout cost.

AI Impact Assessments

(1 model)

Scientific Impact Assessment: First-Order Trajectory Matching (FTM)

1. Core Contribution

FTM introduces a surrogate modeling framework that learns the probability current velocity directly from trajectory data of stochastic systems. The key insight is that by matching the symmetric first-order motion of trajectories (combining forward and backward increments), the method learns a deterministic ODE velocity field that simultaneously (a) preserves time marginals of the stochastic process and (b) captures trajectory-induced transport quantities like fluxes, circulations, and barrier-crossing currents.
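
The role of the symmetric increment can be checked on a toy problem. The sketch below is not the paper's implementation. It assumes a one-dimensional Ornstein-Uhlenbeck process dX = -theta X dt + sigma dW with Gaussian initial condition, for which the current velocity is known in closed form as v(t, x) = x (-theta + sigma^2 / (2 s(t)^2)), where s(t)^2 is the marginal variance. All function names and parameter values are illustrative assumptions. A least-squares fit of the symmetric difference (X_{t+h} - X_{t-h}) / (2h) against X_t should recover the slope of v, not the slope of the drift.

```python
import numpy as np

def ou_marginal_var(t, theta, sigma, s0):
    """Variance of X_t for dX = -theta*X dt + sigma dW with X_0 ~ N(0, s0^2)."""
    decay = np.exp(-2.0 * theta * t)
    return s0**2 * decay + sigma**2 / (2.0 * theta) * (1.0 - decay)

def ou_step(x, dt, theta, sigma, rng):
    """Exact OU transition over an interval dt (no time-stepping error)."""
    mean = x * np.exp(-theta * dt)
    std = np.sqrt(sigma**2 / (2.0 * theta) * (1.0 - np.exp(-2.0 * theta * dt)))
    return mean + std * rng.standard_normal(x.shape)

def symmetric_slope(n_paths=400_000, t=0.5, h=0.01,
                    theta=1.0, sigma=1.0, s0=2.0, seed=0):
    """Fit sym ~ c * X_t, where sym is the symmetric (forward + backward) increment."""
    rng = np.random.default_rng(seed)
    x0 = s0 * rng.standard_normal(n_paths)
    x_prev = ou_step(x0, t - h, theta, sigma, rng)   # X_{t-h}
    x_mid = ou_step(x_prev, h, theta, sigma, rng)    # X_t
    x_next = ou_step(x_mid, h, theta, sigma, rng)    # X_{t+h}
    sym = (x_next - x_prev) / (2.0 * h)
    # Least squares through the origin; valid here because v is linear in x.
    return float(np.sum(sym * x_mid) / np.sum(x_mid**2))

def exact_slope(t=0.5, theta=1.0, sigma=1.0, s0=2.0):
    """Slope of the analytic current velocity v(t, x) = slope * x."""
    return -theta + sigma**2 / (2.0 * ou_marginal_var(t, theta, sigma, s0))

est = symmetric_slope()
ref = exact_slope()
print(f"estimated slope {est:.3f}, current-velocity slope {ref:.3f}, drift slope {-1.0:.3f}")
```

In this toy setting each symmetric increment carries noise with variance of order 1/h, so the fitted slope has variance of order 1/(N h). Shrinking h without increasing the number of paths N makes the estimate noisier, which is the kind of balance between temporal resolution and sample size that the stability analysis is described as addressing.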

The central innovation is the identification of the probability current velocity—a classical object in stochastic mechanics—as the ideal learning target for deterministic surrogates of stochastic systems. This is neither the drift (which causes mean collapse in operator learning) nor an arbitrary marginal-matching flow (which loses trajectory information), but a specific velocity that inherits both global distributional and local transport properties from the underlying SDE.

The method derives a scalable loss function via the Stratonovich integral identity, converting what would be an inaccessible regression problem (since v(t,x) values are unavailable in the data) into a pathwise objective depending only on observed trajectory increments.
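
The conversion described above can be written out as follows. This is a sketch under assumptions that are not taken from the paper: an additive-noise SDE with drift b and constant noise matrix sigma, a marginal density p, and a model velocity v_theta. The symbols b, sigma, p, and the midpoint-style discretization in the last line are illustrative, and the paper's exact loss may differ.

```latex
% Assumed setting: dX_t = b(t, X_t)\,dt + \sigma\,dW_t, with marginal density p(t, x).
% Probability current velocity:
v(t,x) = b(t,x) - \tfrac{1}{2}\,\sigma\sigma^{\top}\nabla_x \log p(t,x)

% Stratonovich identity for a smooth field f (Ito correction plus integration by parts):
\mathbb{E}\int_0^T f(t,X_t)\circ dX_t
  = \mathbb{E}\int_0^T f(t,X_t)\cdot v(t,X_t)\,dt

% Expanding the square and applying the identity with f = v_\theta
% removes the unknown v from the objective (C does not depend on \theta):
\tfrac{1}{2}\,\mathbb{E}\int_0^T \lvert v_\theta(t,X_t) - v(t,X_t)\rvert^2\,dt
  = \mathbb{E}\Big[\tfrac{1}{2}\int_0^T \lvert v_\theta(t,X_t)\rvert^2\,dt
      - \int_0^T v_\theta(t,X_t)\circ dX_t\Big] + C

% One possible one-step discretization of the Stratonovich term on [t, t+h]:
\int_t^{t+h} v_\theta \circ dX
  \approx \tfrac{1}{2}\big(v_\theta(t,X_t) + v_\theta(t+h,X_{t+h})\big)\cdot\big(X_{t+h}-X_t\big)
```

The right-hand side of the third line depends only on the model and on observed trajectory increments, which is why no values of v(t,x), drift, diffusion, or score are needed.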

2. Methodological Rigor

The theoretical framework is well-developed with multiple complementary results:

Proposition 1 provides a variance-bias decomposition for the empirical FTM loss, cleanly separating discretization error (controlled by h) from sampling variance (controlled by N and chunk length τ). This analysis justifies the practical one-step loss when Nh is sufficiently large.

Proposition 2 gives standard Wasserstein-2 bounds on marginal-matching error via Grönwall arguments.
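
For orientation, a bound of this type typically takes the following form. This is the textbook Grönwall argument under assumed conditions (identical initial law, model velocity v_theta that is L-Lipschitz in x), not a restatement of the paper's Proposition 2, whose assumptions and constants may differ.

```latex
% x_s follows the true current velocity v, y_s follows v_\theta, with x_0 = y_0 \sim \rho_0.
\frac{d}{ds}\lvert x_s - y_s\rvert
  \le \lvert v(s,x_s) - v_\theta(s,x_s)\rvert + L\,\lvert x_s - y_s\rvert

% Gronwall's inequality along each pair of trajectories:
\lvert x_t - y_t\rvert
  \le \int_0^t e^{L(t-s)}\,\lvert v(s,x_s) - v_\theta(s,x_s)\rvert\,ds

% The pair (x_t, y_t) couples \rho_t and \rho_t^\theta, so by Minkowski's inequality:
W_2(\rho_t,\rho_t^\theta)
  \le \int_0^t e^{L(t-s)}\,\lVert v(s,\cdot) - v_\theta(s,\cdot)\rVert_{L^2(\rho_s)}\,ds
```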

Proposition 3 is specific to FTM and bounds errors for path-dependent QoIs, showing that learning the current velocity (rather than any marginal-matching velocity) controls Stratonovich-integral-type observables.

The theory is primarily developed for finite-dimensional additive-noise SDEs with uniform boundedness assumptions—the authors honestly acknowledge this does not fully cover their PDE experiments. The proofs use standard techniques but are assembled carefully to support the specific claims about when and why the one-step loss is stable.

The experiments span four systems of increasing complexity: Duffing oscillator (2D), Rayleigh-Bénard convection (9D chaotic), stochastic Burgers (64D PDE), and Navier-Stokes turbulence (64×64 fields). Baselines include operator learning, DICE, marginal-only flows, SDE learning/matching, autoregressive diffusion models (ARDM), conditional flow matching (CFM), and mean-flow distillation. The comparison is thorough and fair, using matched architectures.

3. Potential Impact

Immediate applications: Fast ensemble prediction for stochastic/chaotic/turbulent systems is a pressing need in weather forecasting, climate modeling, fluid dynamics, and uncertainty quantification. FTM's ability to produce ensemble predictions at deterministic-rollout cost (one network function evaluation, or NFE, per time step versus 50-100 for ARDM) while maintaining accuracy is practically significant.

Broader methodological impact: The paper bridges concepts from stochastic thermodynamics (probability currents, Stratonovich identities) with modern surrogate modeling, potentially opening new directions. The framework could influence how the community thinks about what should be learned from stochastic trajectory data—not just conditional means or marginals, but the transport structure.

Limitations on impact: FTM cannot reproduce martingale-dominated statistics (hitting times, temporal autocorrelations). This is a fundamental limitation of the deterministic ODE approach and limits applicability in domains where such quantities matter.
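
A small numerical illustration of this limitation, under assumptions that are not from the paper: for a stationary one-dimensional Ornstein-Uhlenbeck process the analytic current velocity is identically zero, so the exact current-velocity flow leaves every sample where it is. The marginal is preserved, but the lagged autocorrelation is 1 instead of exp(-theta * lag). Variable names and parameter values below are illustrative.

```python
import numpy as np

theta, sigma, lag, n = 1.0, 1.0, 0.5, 200_000
rng = np.random.default_rng(0)

# Stationary law of dX = -theta*X dt + sigma dW is N(0, sigma^2 / (2*theta)).
stat_std = sigma / np.sqrt(2.0 * theta)
x0 = stat_std * rng.standard_normal(n)

# True SDE: exact OU transition over the lag.
decay = np.exp(-theta * lag)
x_lag_sde = x0 * decay + stat_std * np.sqrt(1.0 - decay**2) * rng.standard_normal(n)

# Current-velocity ODE at stationarity: v = -theta*x + (sigma^2/2) * x / stat_std^2 = 0,
# so the deterministic flow is the identity map.
x_lag_ode = x0.copy()

corr_sde = np.corrcoef(x0, x_lag_sde)[0, 1]
corr_ode = np.corrcoef(x0, x_lag_ode)[0, 1]

print(f"marginal std  SDE {x_lag_sde.std():.3f} | ODE {x_lag_ode.std():.3f}")
print(f"lag-{lag} autocorrelation  SDE {corr_sde:.3f} | ODE {corr_ode:.3f}")
```

Both columns agree on the marginal standard deviation, while only the SDE samples decorrelate over the lag.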

4. Timeliness & Relevance

This paper addresses a genuine bottleneck at the intersection of scientific computing and machine learning. The proliferation of neural operator approaches for PDE surrogate modeling has exposed the mean-collapse problem in stochastic settings. Simultaneously, autoregressive generative models (diffusion/flow-based) solve this but at prohibitive inference costs for long rollouts. FTM occupies a valuable middle ground that the field needs.

The connection to probability flow ODEs (used extensively in generative modeling) is timely, but the paper clearly distinguishes its setting—learning physical-time transport from observed trajectories rather than artificial sampling-time transport.

5. Strengths & Limitations

Key Strengths:

Elegant theoretical framing that makes the probability current velocity the natural and unique learning target for deterministic surrogates

The one-step loss is remarkably simple: simulation-free, local in time, avoids drift/diffusion/score estimation

Empirical results are compelling: FTM achieves best or near-best errors with 1 NFE per step versus 20-100 for generative baselines

The stability analysis (Proposition 1, Figure 2) provides practical guidance on when the one-step loss suffices

Clean separation of what FTM can and cannot capture (current-like vs. martingale-dominated quantities)

Notable Weaknesses:

Theory-experiment gap: theoretical guarantees apply to finite-dimensional additive-noise SDEs but experiments include SPDEs where these don't formally hold

The additive-noise assumption (A(t) independent of X(t)) is restrictive for many real applications; multiplicative noise and state-dependent diffusion are not addressed

Scalability to very high-dimensional systems (e.g., 3D turbulence, full-resolution climate models) remains untested

The comparison with distilled models (MeanFlow) suggests distillation struggles in this setting, but the distillation approach may not have been fully optimized

Reproducibility: The paper provides extensive experimental details, architecture specifications, and hyperparameters. The mathematical framework is self-contained with complete proofs.

Summary

FTM represents a well-motivated and cleanly executed contribution that identifies and exploits a fundamental structure—the probability current velocity—for fast ensemble prediction. The combination of theoretical grounding, practical simplicity, and strong empirical performance across multiple benchmarks makes this a significant contribution to scientific machine learning.

Rating: 7.8 / 10

Significance 8 · Rigor 7.5 · Novelty 8 · Clarity 8.5

Generated Jun 10, 2026

Comparison History (18)

Won vs. Beyond representational alignment with brain-guided language models for robust reasoning

Paper 1 introduces a novel and rigorous surrogate-modeling method (FTM) for ensemble predictions of stochastic/chaotic systems with strong theoretical grounding (stability analysis) and broad applicability across scientific computing domains (turbulence, stochastic PDEs). It addresses a fundamental computational bottleneck with clear practical impact. Paper 2, while creative in using brain signals to guide LLMs, faces scalability limitations (reliance on fMRI data), modest improvements, and builds on a less robust premise—the gains from brain-guided steering may not generalize beyond narrow benchmarks. Paper 1's methodological contribution is more foundational and broadly impactful.

claude-opus-4-6 · Jun 11, 2026

Won vs. K-Forcing: Joint Next-K-Token Decoding via Push-Forward Language Modeling

Paper 2 (FTM) has higher potential scientific impact due to broader cross-domain applicability (stochastic dynamics, turbulence, PDEs), a conceptually novel surrogate-learning target (probability current velocity) that bypasses drift/diffusion/score estimation, and stronger methodological rigor via stability analysis separating discretization and sampling errors. Its applications span physics, climate/weather, engineering, and UQ, making it timely for fast ensemble prediction needs. Paper 1 is impactful for LLM serving efficiency but is more domain-specific, closer to existing distillation/parallel decoding lines, and its benefits trade off with quality degradation.

gpt-5.2 · Jun 10, 2026

Won vs. TRACE: A Unified Rollout Budget Allocation Framework for Efficient Agentic Reinforcement Learning

Paper 2 introduces a fundamentally new surrogate-modeling method (FTM) with broad applicability across stochastic dynamical systems, turbulence, and chaotic systems. It addresses a foundational problem in computational physics and applied mathematics—efficiently predicting ensemble statistics of complex stochastic systems—with strong theoretical grounding (stability analysis) and wide cross-disciplinary relevance (climate, fluid dynamics, molecular dynamics). Paper 1, while useful, presents an incremental optimization framework for agentic RL with moderate empirical gains (2.8 points) on specific benchmarks, targeting a narrower problem within LLM training methodology.

claude-opus-4-6 · Jun 10, 2026

Won vs. SPACR: Single-Pass Adaptive Training of Uncertainty-Aware Conformal Regressors

Paper 2 is likely higher impact: it proposes a broadly applicable surrogate-modeling framework for stochastic/chaotic dynamical systems, with implications across physics, climate, turbulence, PDEs, and uncertainty quantification. Learning probability current velocity directly from trajectories (without estimating drift/diffusion/score) is a distinctive innovation, and the inclusion of stability analysis strengthens methodological rigor. Its potential real-world applications (fast ensemble prediction and current/flux estimation) are substantial and timely for scientific computing. Paper 1 is valuable for ML uncertainty, but its impact is narrower to conformal regression tooling.

gpt-5.2 · Jun 10, 2026

Won vs. XtrAIn: Training-Guided Occlusion for Feature Attribution

Paper 2 is likely higher impact: it proposes a broadly applicable surrogate modeling framework for stochastic/chaotic/turbulent dynamical systems with direct relevance to physics, climate, fluid dynamics, and uncertainty quantification. The method avoids estimating drift/diffusion/score, offers stability analysis, and targets ensemble statistics and current-like quantities—capabilities valuable across many scientific domains and timely for fast simulation and surrogate modeling. Paper 1 is a novel XAI contribution with practical utility, but its impact is narrower (primarily ML interpretability) and may face adoption barriers due to training-trajectory dependence and compute overhead.

gpt-5.2 · Jun 10, 2026

Won vs. EEVEE: Towards Test-time Prompt Learning in the Real World for Self-Improving Agents

Paper 2 (FTM) has higher likely scientific impact due to stronger methodological novelty and broader cross-field relevance: it offers a principled surrogate modeling framework for stochastic/chaotic dynamics that avoids drift/diffusion/score estimation, includes stability analysis, and targets widely important problems (turbulence, stochastic PDEs) with clear real-world applications in physics, climate, engineering, and UQ. Paper 1 is timely and practically useful for LLM agents, but test-time prompt routing/co-evolution is more incremental within a fast-moving, benchmark-driven area and may be superseded by model-architecture advances.

gpt-5.2 · Jun 10, 2026

Lost vs. Beyond Uniform Token-Level Trust Region in LLM Reinforcement Learning

Paper 2 likely has higher impact due to timeliness and broad applicability: it targets RL for LLMs, a rapidly moving area with immediate industry and research uptake. CPPO addresses a widely used method (PPO-style trust regions) with a concrete, implementable modification that can transfer across models, tasks, and RLVR setups, potentially influencing many follow-on works. Paper 1 is methodologically strong and novel for stochastic dynamics surrogates, but its impact is more specialized (chaotic/turbulent/PDE systems) and may diffuse slower across fields than an LLM-RL optimization improvement.

gpt-5.2 · Jun 10, 2026

Won vs. N-GRPO: Embedding-Level Neighbor Mixing for Enhanced Policy Optimization

Paper 1 introduces a fundamentally new surrogate-modeling framework (FTM) for ensemble prediction of chaotic and stochastic systems, with broad applicability across scientific computing, climate modeling, turbulence, and stochastic PDEs. It provides rigorous stability analysis and avoids costly drift/diffusion/score estimation. Paper 2 offers an incremental improvement to LLM training (embedding-level neighbor mixing for GRPO), which is narrower in scope and more of an engineering contribution within an existing framework. Paper 1's methodological novelty, theoretical depth, and cross-disciplinary relevance give it substantially higher potential impact.

claude-opus-4-6 · Jun 10, 2026

Won vs. Convergence of Two-Timescale Markovian Stochastic Approximations with Applications in Reinforcement Learning

Paper 1 introduces a novel surrogate modeling technique that significantly accelerates ensemble predictions for chaotic and stochastic systems, including PDEs. Its ability to capture complex physical quantities at low computational cost offers broad, cross-disciplinary impact in fields like fluid dynamics, climate modeling, and physics. While Paper 2 provides crucial theoretical convergence guarantees for reinforcement learning algorithms, Paper 1's methodological innovation and direct applicability to simulating large-scale, real-world physical systems give it a higher potential for widespread scientific impact.

gemini-3.1-pro-preview · Jun 10, 2026

Lost vs. Tight Sample Complexity of Transformers

Paper 2 provides fundamental, tight theoretical bounds on the sample complexity and VC dimension of Transformers, including chain-of-thought reasoning. Given the dominant role of Transformers in modern AI, these foundational theoretical results are highly timely and likely to have a massive, broad impact on machine learning theory and the understanding of large language models, surpassing the more specialized, though innovative, surrogate modeling approach presented in Paper 1.

gemini-3.1-pro-preview · Jun 10, 2026
