Ana Larrañaga, Urban Fasel, Steven L. Brunton
Identifying the governing equations of complex dynamical systems remains a fundamental challenge across science and engineering. While early approaches relied on empirical data and heuristics, modern data-driven methods offer greater flexibility and fewer assumptions. However, data acquisition in real-world settings is often expensive. This work addresses this challenge by introducing an active learning strategy for dynamics discovery in the ultra-low data limit. Rather than sampling randomly, our method iteratively prioritizes regions that are most informative for model identification. This approach builds on Sparse Identification of Nonlinear Dynamics (SINDy), and utilizes an ensemble extension, E-SINDy, to estimate epistemic uncertainty and guide the sampling for both ordinary and partial differential equations (ODEs/PDEs). For ODEs, an exhaustive analysis is conducted on the Lorenz system across varying data budgets and noise levels. For PDEs, two systems with contrasting dynamical characteristics are examined: the Burgers' equation, where a sharp shock front creates a distinction between informative and uninformative regions, and the Kuramoto-Sivashinsky equation, which presents a more spatially complex sampling landscape. Across all scenarios, the proposed method accurately identifies the governing dynamics with significantly fewer data samples than random sampling.
This paper introduces an active learning framework for discovering governing equations of dynamical systems using minimal data, built upon the Sparse Identification of Nonlinear Dynamics (SINDy) and its ensemble extension (E-SINDy). The key idea is to use ensemble disagreement (epistemic uncertainty) as an acquisition function to iteratively select the most informative samples: initial conditions for ODEs and spatiotemporal points for PDEs. The paper claims to operate in an "ultra-low-data limit," recovering the Lorenz system with ~50-100 data points and Burgers' equation with ~24 points.
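The acquisition step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a bagged ensemble of sequentially thresholded least-squares fits and scores each candidate state by the variance of the predicted derivatives across ensemble members. The function names (`stlsq`, `library`, `fit_ensemble`, `disagreement`, `select_next`), the degree-2 two-state library, and all default parameters are hypothetical choices made for the example.

```python
import numpy as np

def stlsq(theta, dxdt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares, the sparse regression commonly used in SINDy."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        for k in range(dxdt.shape[1]):
            big = ~small[:, k]
            if big.any():
                # Refit only the terms that survived thresholding.
                xi[big, k] = np.linalg.lstsq(theta[:, big], dxdt[:, k], rcond=None)[0]
    return xi

def library(x):
    """Hypothetical polynomial library up to degree 2 for a two-state system."""
    x1, x2 = x[:, 0], x[:, 1]
    return np.column_stack([np.ones(len(x)), x1, x2, x1**2, x1 * x2, x2**2])

def fit_ensemble(x, dxdt, n_models=20, rng=None):
    """Fit one sparse model per bootstrap resample of the data rows."""
    rng = np.random.default_rng(rng)
    theta = library(x)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, len(x), size=len(x))
        models.append(stlsq(theta[idx], dxdt[idx]))
    return np.stack(models)  # shape: (n_models, n_terms, n_states)

def disagreement(models, candidates):
    """Variance of predicted derivatives across the ensemble, summed over states."""
    theta_c = library(candidates)
    preds = np.einsum("ct,mts->mcs", theta_c, models)
    return preds.var(axis=0).sum(axis=1)

def select_next(models, candidates):
    """Index of the candidate on which the ensemble members disagree most."""
    return int(np.argmax(disagreement(models, candidates)))
```

In an active learning loop, the selected candidate would be measured, appended to the data set, and the ensemble refit until a stopping rule is met. The paper's own convergence criterion is not reproduced here.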
The contribution is essentially the integration of query-by-committee active learning with E-SINDy, along with a practical convergence criterion and two acquisition functions (ensemble-based and D-optimal) for PDEs. While neither active learning nor E-SINDy is new, their systematic combination and evaluation in the ultra-low-data regime for equation discovery is a useful contribution.
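For readers unfamiliar with the D-optimal alternative, a greedy version can be sketched as below. This is an assumption about the general technique rather than a description of the paper's procedure: it scores each candidate library row by the increase in the log-determinant of the information matrix, using the matrix determinant lemma. The names `d_optimal_gain` and `select_d_optimal` and the small `ridge` term (added so the matrix stays invertible with very few samples) are hypothetical.

```python
import numpy as np

def d_optimal_gain(theta_current, theta_candidates, ridge=1e-8):
    """Log-determinant gain of adding each candidate row to the design.

    By the matrix determinant lemma,
    log det(M + t t^T) - log det(M) = log(1 + t^T M^{-1} t),
    where M is the (ridge-regularized) information matrix of the current design.
    """
    n_terms = theta_current.shape[1]
    info = theta_current.T @ theta_current + ridge * np.eye(n_terms)
    z = np.linalg.solve(info, theta_candidates.T)            # (n_terms, n_candidates)
    leverage = np.einsum("ct,tc->c", theta_candidates, z)    # t^T M^{-1} t per candidate
    return np.log1p(leverage)

def select_d_optimal(theta_current, theta_candidates, ridge=1e-8):
    """Index of the candidate row that most increases det of the information matrix."""
    return int(np.argmax(d_optimal_gain(theta_current, theta_candidates, ridge)))
```

Unlike the ensemble-based score, this criterion depends only on the library evaluated at the sample locations and not on the measured derivatives, which is one reason the two acquisition functions can favor different regions.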
The practical value is clear for applications where data collection is expensive, such as experimental fluid mechanics, materials science, and climate modeling. The framework is computationally lightweight compared to Bayesian alternatives, which is important for scalability.
However, the impact may be limited by several factors:
1. The systems studied are relatively low-dimensional and well understood. Scaling to higher-dimensional systems remains unaddressed.
2. The approach assumes the correct library of candidate functions is known a priori, which is a significant assumption in practice.
3. The insight that sampling away from attractors is informative (Fig. 7) is interesting but may not generalize to systems where off-attractor dynamics are physically inaccessible or unmeasurable.
The finding that active learning concentrates samples away from the attractor (for ODEs) and near sharp gradients (for PDEs) provides useful intuition for experimental design, even independent of the algorithmic framework.
The paper addresses a genuinely important problem. Data efficiency in scientific machine learning is increasingly relevant as the field moves toward experimental validation. The SINDy community is active and growing, and tools for reducing data requirements are needed. The combination of active learning with interpretable model discovery is timely, particularly given the push toward trustworthy AI in scientific applications.
However, the paper arrives after Fasel et al. (2022) already introduced E-SINDy with active learning concepts. The present work extends this more systematically to the ultra-low-data regime and to PDEs, but the conceptual novelty over the prior work is incremental.
Strengths:
1. Clear problem formulation: The paper precisely defines the ultra-low-data regime and provides concrete data budgets for each benchmark.
2. Practical algorithm: Both Algorithms 1 and 2 are clearly stated and reproducible.
3. Insightful analysis: The observation about sampling away from attractors (Fig. 3, Hopf oscillator; Fig. 7, Lorenz) provides actionable guidance for experimentalists.
4. Comprehensive evaluation: Multiple metrics (ℓ₀, ℓ₂, relative residual), noise levels, and repetitions provide a thorough picture.
5. Well-chosen PDE benchmarks: Burgers and KS test complementary aspects of the method.
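The three metrics named in item 4 can be illustrated with the sketch below. The exact definitions used in the paper are not given in this review, so the formulas here are assumptions based on common usage: a support (ℓ₀) error counting library terms whose zero/nonzero status is wrong, a relative ℓ₂ coefficient error, and a relative residual of the fitted model on the derivative data. All function names are hypothetical.

```python
import numpy as np

def support_error(xi_hat, xi_true, tol=1e-8):
    """Assumed l0 metric: number of coefficients whose zero/nonzero pattern is wrong."""
    return int(np.sum((np.abs(xi_hat) > tol) != (np.abs(xi_true) > tol)))

def coefficient_error(xi_hat, xi_true):
    """Assumed l2 metric: relative error of the identified coefficients."""
    return float(np.linalg.norm(xi_hat - xi_true) / np.linalg.norm(xi_true))

def relative_residual(theta, dxdt, xi_hat):
    """Assumed residual metric: ||dxdt - theta @ xi_hat|| / ||dxdt||."""
    return float(np.linalg.norm(dxdt - theta @ xi_hat) / np.linalg.norm(dxdt))
```

The support error measures whether the right terms were selected, while the coefficient error and residual measure how accurately their values were recovered, so a model can score well on one and poorly on another.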
Weaknesses:
1. Limited novelty: The core components (E-SINDy, QbC, D-optimal design) are all established; the contribution is primarily in their integration and systematic evaluation.
2. No real data experiments: All benchmarks use synthetic data from known equations, leaving practical applicability uncertain.
3. Scalability concerns: Only systems with at most 3 state variables (ODE) or 1 field variable (PDE) are tested. Higher-dimensional systems with larger libraries would stress the method differently.
4. Missing theoretical analysis: No sample complexity bounds or convergence guarantees are provided. The paper is entirely empirical.
5. Limited comparison: The only baselines are random sampling and D-optimal design; other active learning strategies and recent SINDy variants (Weak-SINDy, Bayesian SINDy) are discussed but not benchmarked.
6. Noise model is simple: Only additive Gaussian noise is considered; multiplicative noise, model misspecification, or missing library terms are not addressed.
This is a well-executed engineering contribution that provides practical tools and useful insights for the SINDy community, particularly regarding data-efficient equation discovery. The experimental analysis is thorough within its scope. However, the conceptual novelty is moderate, since the work primarily combines existing techniques, and the lack of theoretical guarantees, real-world validation, and broader comparisons limits the potential for high impact. The paper would benefit significantly from application to at least one real experimental dataset and from theoretical analysis of when and why the approach should outperform alternatives.
Generated Jun 11, 2026
Paper 2 has higher potential impact due to broader cross-disciplinary relevance (scientific machine learning, system identification, experimental design across physics/engineering/biology), clear real-world applicability in expensive-data regimes, and timely alignment with active learning and sparse discovery. Its methodological contribution (uncertainty-driven sampling with E-SINDy) is directly actionable for ODE/PDE discovery and validated on canonical systems. Paper 1 is novel and theoretically rigorous for distributed optimization, but its impact is more specialized to asynchronous SGD settings and may be narrower in application scope.
Paper 1 introduces a fundamentally new conceptual framework—loss shift as distinct from distribution shift—in transfer learning, formalized through Bayes quotients. This identifies a previously unrecognized failure mode with broad theoretical implications across machine learning. The novelty of reframing representation sufficiency in terms of loss refinement, with exact quantitative characterizations, opens new research directions. Paper 2, while practically useful, is a more incremental contribution combining existing methods (SINDy, ensemble methods, active learning) in a well-studied problem space. Paper 1's conceptual originality and breadth of theoretical impact give it the edge.
Paper 1 addresses a broadly impactful problem—discovering governing equations from minimal data—relevant across science and engineering. Its active learning strategy for SINDy in ultra-low data regimes has clear practical applications in domains where data acquisition is expensive. The methodology is rigorous, tested on multiple ODE/PDE systems with varying complexity and noise. Paper 2 provides useful interpretability insights for a specific architecture (Block AttnRes) but has narrower scope, addressing a niche architectural question with findings (routing mass ≠ causal importance) that, while valuable, impact a smaller research community.
Paper 2 introduces a novel benchmark and dataset at the highly active intersection of LLMs and chemistry (AI4Science). By providing the first standardized evaluation and a large specialized corpus (SupraPMC) for supramolecular chemistry, it fills a critical gap that will likely drive widespread community engagement and model development. While Paper 1 offers a valuable algorithmic advancement for dynamical systems, foundational benchmarks and datasets like those in Paper 2 typically generate broader interdisciplinary impact, higher citation rates, and accelerate real-world applications in molecular design.
HAMNO introduces a novel neural operator architecture with hierarchical multi-scale processing and adaptive gating, combined with a physics-informed extension using both strong and weak-form constraints. This addresses fundamental challenges in learning PDE solutions (multi-scale, long-range, stability) with broad applicability. While Paper 2 presents a useful active learning strategy for SINDy in low-data regimes, it is more incremental, building on established methods. HAMNO's architectural innovations, comprehensive evaluation across multiple equations, and publicly available implementation suggest broader impact across computational science and ML-for-science communities.
Paper 2 introduces a highly novel approach by adapting LLM prompt engineering techniques (Chain of Operators) to neural operators. This enables zero-shot generalization to out-of-distribution tasks without retraining, addressing a major bottleneck in scientific ML. While Paper 1 provides a valuable active learning method for data-efficient SINDy, Paper 2's connection between foundation model methodologies and operator learning offers broader transformative potential across computational physics and AI.
Paper 1 presents a fundamentally novel connection between Kuramoto synchronization dynamics and transformer attention, opening a new paradigm for implementing neural network computations on physical substrates. It bridges physics, dynamical systems, and deep learning with strong theoretical guarantees and empirical validation. The potential impact spans neuromorphic computing, energy-efficient AI hardware, and theoretical ML. Paper 2 makes a solid incremental contribution combining active learning with SINDy, but builds more directly on existing frameworks with narrower scope. Paper 1's interdisciplinary novelty and hardware implications give it broader transformative potential.
Paper 2 addresses a critical bottleneck in LLM interpretability by reviving and optimizing Independent Component Analysis (ICA) as a highly efficient alternative to expensive Sparse Autoencoders. Given the massive scale and urgent need for AI safety and alignment tools, a method that provides cheaper interpretable directions for modern LLMs has immense and immediate real-world utility. While Paper 1 offers valuable advances in computational physics and active learning, Paper 2's potential breadth of impact across the rapidly expanding AI landscape makes it more timely and scientifically impactful.
Paper 1 addresses a critical component (routing) in Mixture-of-Experts (MoE) architectures, which are central to scaling state-of-the-art large language models. By providing a mathematically grounded, theoretically proven, and empirically validated method at an 11B parameter scale, it offers massive immediate utility and impact within the rapidly expanding AI field. While Paper 2 offers valuable advances in scientific machine learning, the widespread adoption and foundational importance of MoE models give Paper 1 a higher potential for broad, transformative impact across AI research and industry.
Paper 2 addresses the broadly impactful problem of multimodal learning with missing modalities, which is pervasive across bioscience, clinical settings, and beyond. Its framework (LWR) offers a principled alternative to imputation-based methods with direct applications to cancer classification and survival prediction—high-stakes real-world tasks. Paper 1, while methodologically sound, focuses on a more niche improvement (active learning for SINDy in low-data regimes) within the dynamics discovery community. Paper 2's broader applicability across fields (multi-omics, clinical AI, multimodal ML) and immediate translational potential give it higher estimated impact.