Siyu Lou, Hao Xu, Wenguan Wang, Lu Lu, Hao Sun, Yang Liu, Linfeng Zhang, Dongxiao Zhang
Differential equations play a critical role in scientific discovery because they provide a mathematical framework to describe the behaviour of physical phenomena. As a promising alternative to traditional first principles, data-driven differential equation discovery has attracted increasing attention for its ability to infer governing laws directly from experimental or simulated data, especially when the underlying physics is unclear. However, the field has expanded rapidly along diverse methodological directions, particularly with the emergence of AI-based approaches, and still lacks a clear organizing perspective. In this Review, we propose a problem-oriented perspective on data-driven differential equation discovery. We first introduce a two-dimensional phase diagram of equation discoverability, where discovery problems are organized according to structural complexity and coefficient complexity. This phase diagram shows how the field has moved from the discovery of sparse equations with simple coefficients toward more complex governing laws with richer structures and more flexible parameterizations. It also clarifies why different methodological families succeed or fail in different problem settings. We then present the representation-evaluation-optimization (REO) framework as a fundamental abstraction of the discovery process. By identifying the core problems of equation discovery that persist across algorithmic variations, REO shifts the discussion from individual algorithms to the fundamental principles that determine discoverability. We connect these perspectives to applications across physics and adjacent sciences, and argue that the next challenge is not merely recovering equations, but using them to revise existing theories, distil mechanisms and form new scientific concepts.
This paper is a review/perspective article that proposes two organizing frameworks for the rapidly expanding field of data-driven differential equation discovery: (1) a two-dimensional "phase diagram of equation discoverability" that maps discovery problems along axes of structural complexity and coefficient complexity, and (2) a representation–evaluation–optimization (REO) framework that abstracts the discovery process into three fundamental components. The paper does not introduce new algorithms, datasets, or experimental results. Its contribution is entirely conceptual and organizational.
The phase diagram is genuinely useful. By positioning methods along structural complexity (closed-library → expandable-library → open-form) and coefficient complexity (constant → equation-expressible → equation-inexpressible), the paper provides an intuitive map that clarifies why certain methods succeed or fail in different regimes. The observation that the upper-right corner (open-form structure with inexpressible coefficients) remains largely unexplored is a concrete and actionable insight for the community.
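As a rough illustration of the map described above, the two axes can be encoded as ordered enumerations. This is only a sketch: the class names, the integer ordering, the 3x3 grid, and the `is_frontier` helper are assumptions made for illustration, not notation taken from the paper.

```python
from enum import IntEnum
from itertools import product


class Structure(IntEnum):
    """Structural-complexity axis, in the order listed in the review."""
    CLOSED_LIBRARY = 0
    EXPANDABLE_LIBRARY = 1
    OPEN_FORM = 2


class Coefficient(IntEnum):
    """Coefficient-complexity axis, in the order listed in the review."""
    CONSTANT = 0
    EQUATION_EXPRESSIBLE = 1
    EQUATION_INEXPRESSIBLE = 2


# All nine cells of the (hypothetical) 3x3 grid.
CELLS = list(product(Structure, Coefficient))


def is_frontier(structure, coefficient):
    """True only for the most complex cell on both axes (assumed 'upper-right')."""
    return structure is max(Structure) and coefficient is max(Coefficient)


frontier = [cell for cell in CELLS if is_frontier(*cell)]
print(frontier)
```

Locating a discovery problem then amounts to choosing one value on each axis; the single frontier cell corresponds to the open-form, inexpressible-coefficient regime the review flags as largely unexplored.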
The REO framework, while reasonable, is less novel. Decomposing any inference pipeline into "how you represent candidates," "how you evaluate them," and "how you search" is a fairly natural abstraction that many readers would already implicitly understand. The paper acknowledges this generality but does not push the framework far enough to generate surprising insights. For instance, it does not formally characterize how representation choices constrain optimization landscapes, nor does it derive theoretical limits on discoverability.
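To make the three-part decomposition concrete, here is a minimal sketch for the simplest regime (closed library, constant coefficients). It is an assumption-laden toy and not the paper's implementation: the library contents, the function names, the threshold, and the noise-free logistic-growth data are all chosen for illustration, and the derivative is taken directly from the known right-hand side rather than estimated from data. The optimizer is sequentially thresholded least squares, in the spirit of the sparse-regression family the review covers.

```python
import numpy as np

# Representation: a closed library of candidate terms for du/dt = f(u).
LIBRARY = {
    "1": lambda u: np.ones_like(u),
    "u": lambda u: u,
    "u^2": lambda u: u**2,
    "u^3": lambda u: u**3,
}


def build_library(u):
    """Stack each candidate term as one column of the design matrix."""
    return np.column_stack([term(u) for term in LIBRARY.values()])


# Evaluation: mean squared residual of a candidate coefficient vector.
def evaluate(theta, dudt, coeffs):
    return float(np.mean((theta @ coeffs - dudt) ** 2))


# Optimization: sequentially thresholded least squares.
def stlsq(theta, dudt, threshold=0.1, n_iter=10):
    coeffs = np.linalg.lstsq(theta, dudt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(coeffs) < threshold
        coeffs[small] = 0.0          # prune terms with small coefficients
        keep = ~small
        if not keep.any():
            break
        # Refit only the surviving terms.
        coeffs[keep] = np.linalg.lstsq(theta[:, keep], dudt, rcond=None)[0]
    return coeffs


# Synthetic, noise-free samples of logistic growth du/dt = u - u^2 (assumed example).
u = np.linspace(0.05, 0.95, 50)
dudt = u - u**2

theta = build_library(u)
coeffs = stlsq(theta, dudt)
terms = {name: round(float(c), 3) for name, c in zip(LIBRARY, coeffs) if c != 0.0}
print(terms, evaluate(theta, dudt, coeffs))
```

Swapping any one of the three pieces (for example an expression tree for the representation, or a genetic or reinforcement-learning search for the optimizer) leaves the other two roles intact, which is the sense in which the decomposition is natural but also fairly generic.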
As a review paper, rigor is assessed differently than for an empirical contribution. The literature coverage is comprehensive, spanning ~164 references across sparse regression (SINDy family), neural-network-based methods (DeepMoD, PINN-SR, PDE-Net), evolutionary approaches (DLGA, EPDE), reinforcement learning methods (DSR, DISCOVER), and emerging LLM-based approaches. The supplementary material provides useful tables of benchmark ODEs/PDEs, open-source software, and detailed "box" descriptions of representative methods (PDE-FIND, DSR/DISCOVER, ODEFormer).
However, several gaps weaken the rigor:
The paper addresses a genuine need: the field of equation discovery has fragmented across algorithmic families, and practitioners struggle to navigate the landscape. The phase diagram provides an accessible entry point, and the REO framework offers common vocabulary. This could:
1. Guide method selection: Researchers facing a specific discovery problem can locate it on the phase diagram and identify appropriate method families.
2. Identify research gaps: The unexplored upper-right regime (open-form + inexpressible coefficients) is clearly delineated as a frontier.
3. Standardize evaluation: The paper's call for standardized benchmarks, transparent noise protocols, and multi-dimensional evaluation (accuracy, conciseness, physical consistency, solvability) could help the community converge on shared evaluation practice.
4. Bridge communities: By connecting applications across fluid dynamics, biology, geoscience, chemistry, and traffic modeling under a common framework, the paper may facilitate cross-pollination.
The proposal of "solvability" as a new evaluation dimension is a small but meaningful contribution: discovered equations that cannot be numerically solved have limited scientific utility, yet this criterion is rarely discussed.
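One possible reading of a solvability check, sketched under stated assumptions: integrate a candidate scalar ODE forward with a fixed-step fourth-order Runge-Kutta scheme and reject it if the trajectory becomes non-finite or exceeds a bound. The function names, step size, and bound are hypothetical, and the paper may define solvability differently.

```python
import math


def rk4_step(f, t, x, h):
    """One classical fourth-order Runge-Kutta step for dx/dt = f(t, x)."""
    k1 = f(t, x)
    k2 = f(t + h / 2, x + h / 2 * k1)
    k3 = f(t + h / 2, x + h / 2 * k2)
    k4 = f(t + h, x + h * k3)
    return x + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def is_solvable(f, x0, t_end, h=1e-3, bound=1e6):
    """Return False if the numerical solution blows up before t_end."""
    t, x = 0.0, x0
    for _ in range(int(round(t_end / h))):
        try:
            x = rk4_step(f, t, x, h)
        except OverflowError:
            return False
        t += h
        if not math.isfinite(x) or abs(x) > bound:
            return False
    return True


def decay(t, x):
    return -x        # well-behaved for all t


def blowup(t, x):
    return x ** 2    # exact solution 1 / (1 - t) for x0 = 1, singular at t = 1


print(is_solvable(decay, 1.0, 2.0))
print(is_solvable(blowup, 1.0, 2.0))
```

A check like this only probes one initial condition and one integrator, so it is at best a necessary screen; a candidate equation that passes it may still be ill-posed elsewhere.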
The timing is appropriate. The field is experiencing rapid expansion driven by LLM-based approaches (LLM4ED, LLM-SR, EqGPT, Scientific Generative Agent), transformer-based methods (ODEFormer), and reinforcement learning frameworks. These developments have diversified the methodological landscape to the point where an organizing perspective is valuable. The paper's coverage of these very recent methods (many from 2024-2025) keeps the review current.
The paper also aligns with broader trends in AI for science, where interpretability and mechanistic understanding are increasingly valued over pure prediction accuracy.
This is a well-organized, timely review that provides useful conceptual frameworks for navigating the equation discovery literature. Its primary value lies in the phase diagram, which offers a genuinely new lens for organizing the field. The paper will likely serve as a useful reference and teaching tool, though its impact would be greater with more formal theoretical grounding.
Generated Jun 9, 2026
Paper 2 addresses the rapidly growing field of AI for Science, offering a unifying framework for discovering differential equations from data. While Paper 1 provides a rigorous theoretical foundation for machine learning, Paper 2 boasts exceptional breadth of impact across physics, biology, and engineering. Its highly timely focus on data-driven scientific discovery and high potential for real-world applications give it a higher estimated scientific impact, as framework papers in this cross-disciplinary area typically shape diverse future research and garner extensive citations.
Paper 1 presents a novel, primary theoretical framework addressing a fundamental gap in multimodal learning. It offers a practical diagnostic tool with immediate applicability across diverse scientific domains like astrophysics and biomedicine. While Paper 2 is a highly valuable review that synthesizes existing literature on data-driven equation discovery, Paper 1 introduces original methodology and mathematical theory. This primary innovation in a rapidly expanding field like multimodal AI gives it a higher potential for driving direct methodological advances and widespread real-world application.
Paper 1 introduces a new ST-guided alignment method (STAMP) plus a large curated dataset (HumanST-1k, 1.8M paired histology–transcriptomics samples), enabling molecularly supervised foundation models with clear downstream clinical utility in precision oncology. This combination of methodological innovation, resource creation, and translational relevance suggests strong, sustained impact. Paper 2 is a high-level review offering useful conceptual frameworks, likely influential for organizing a field, but it does not present new primary methods or datasets and thus may have comparatively less direct scientific/technological impact.
Paper 2 provides a foundational framework and organizing perspective for a highly impactful, cross-disciplinary field (data-driven discovery of physical laws). Review papers that unify rapidly expanding methodologies often become highly cited landmarks. While Paper 1 offers valuable insights into a specific optimization algorithm, its scope is narrower and confined to machine learning, whereas Paper 2 spans physics, AI, and adjacent sciences with profound implications for scientific discovery.
Paper 1 has higher likely scientific impact: it synthesizes a rapidly growing cross-disciplinary area (data-driven PDE/ODE discovery), introduces unifying conceptual frameworks (discoverability phase diagram; REO abstraction), and targets broad foundational challenges relevant to physics, engineering, and scientific ML. As a Review, it can shape research agendas and methodology across many domains, making it timely and wide-reaching. Paper 2 is methodologically solid and practically useful within sports analytics, but its novelty and breadth are narrower and its applications are more domain-specific.
Paper 1 is a foundational review that proposes a unifying framework (REO) and a phase diagram for the rapidly expanding field of data-driven equation discovery. By synthesizing diverse AI-based methodologies and outlining future challenges in revising scientific theories, it has the potential to broadly shape the research agenda across physics and adjacent sciences. While Paper 2 offers a strong technical advancement in neural operators, Paper 1's conceptual synthesis and broad interdisciplinary relevance give it a higher potential for widespread scientific impact.
Paper 2 likely has higher scientific impact because it provides a unifying, problem-oriented framework (discoverability phase diagram + REO abstraction) for a fast-growing area spanning many physical and life sciences. This can shape research directions, standardize thinking across methods, and influence broad application domains (physics, engineering, chemistry, biology). Paper 1 is timely and practically useful for efficient LLM inference, but its impact is narrower (systems/ML inference optimization) and more incremental relative to ongoing work on dynamic routing/pruning. Overall breadth and cross-field relevance favor Paper 2.
Paper 1 is a comprehensive review proposing novel organizing frameworks (phase diagram of discoverability, REO framework) for the rapidly growing field of data-driven equation discovery. It spans multiple scientific domains, offers conceptual clarity to a fragmented field, and charts future directions connecting equation discovery to theory revision and scientific concept formation. Its breadth of impact across physics and adjacent sciences, combined with its timely synthesis of AI-driven scientific discovery methods, gives it substantially higher potential impact than Paper 2, which addresses the narrower (though practically useful) problem of tensor program optimization with incremental performance improvements over existing auto-schedulers.
Paper 2 is a broad, problem-organizing Review that introduces unifying frameworks (discoverability phase diagram; REO abstraction) for data-driven differential equation discovery across many physical systems. Its potential impact is wide across physics, engineering, and scientific ML, shaping how researchers frame problems, compare methods, and identify open challenges—often leading to high citation and cross-field uptake. Paper 1 is a solid, timely algorithmic improvement for LLM RL stability, but it is narrower in scope and likely incremental within a fast-moving subarea where methods can be quickly superseded.
Paper 2 is a comprehensive review that proposes a unifying framework for a rapidly expanding, high-impact field (data-driven physics). Its breadth across multiple scientific domains and its ability to shape future research directions give it a higher potential scientific impact than Paper 1, which presents a specialized, albeit novel, architectural improvement for specific dynamical systems.