Kyungeun Kim, Amanuel Anteneh, Israel Klich, Olivier Pfister, J. M. Schwarz
Responses to perturbations are key to understanding physical systems. The ability to contrast such responses by comparing how a system reacts under slightly different conditions provides a mechanism for learning. Here, we introduce Perturbative Contrastive Physical Learning (PCPL), a general framework in which learning emerges from measurable contrasts between physical states produced by controlled changes to inputs, boundary conditions, parameters, or interpreter functions. PCPL unifies and extends prior approaches: Equilibrium Propagation is rooted in contrasts between free and nudged equilibria in energy-based systems, while Frequency Propagation corresponds to contrasts extracted from sinusoidally driven, frequency-demodulated responses. We show that contrast-driven updates can reflect either local sensitivities or global inverse-problem structure, yet do not require centralized gradient computation. Instead, effective learning geometry emerges implicitly from the system's own physical response, allowing learning behavior to arise without an external processor or explicit backpropagation. We demonstrate PCPL in two platforms: (i) spring networks that update bond stiffness using measured displacements and forces, and (ii) continuous-variable photonic circuits trained via x quadrature measurements and finite-difference estimates of the Jacobian. Both platforms successfully learn classification tasks. We further show that a continuous-variable photonic circuit can be trained to implement analog multiplication, illustrating a step toward more autonomous physical learning systems.
The paper introduces Perturbative Contrastive Physical Learning (PCPL), a framework that formalizes learning in physical systems as arising from measurable contrasts between nearby physical states produced by controlled perturbations. The key conceptual move is elevating "contrast" to the central primitive of physical learning, rather than gradients, energy functions, or equilibrium conditions. The framework distinguishes two modes: Mode A (self-referenced contrast, analogous to linear response probing) and Mode B (target-referenced contrast, performing local inversion via pseudoinverse/Gauss-Newton geometry). The authors argue this unifies Equilibrium Propagation, Coupled Learning, Frequency Propagation, and related schemes under a single conceptual umbrella.
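The two contrast modes summarized above can be sketched on a toy system. This is a minimal illustration, not the authors' implementation: `toy_response`, `inputs`, and the function names are hypothetical, the output is a single scalar, and the Jacobian is estimated by finite differences. For a scalar output the Jacobian is a row vector J, so its pseudoinverse reduces to J^T / (J J^T).

```python
def finite_difference_jacobian(response, params, eps=1e-6):
    """Estimate d(response)/d(params) by perturbing one parameter at a time.

    Each entry is a self-referenced contrast (perturbed minus unperturbed
    response) divided by the perturbation size.
    """
    base = response(params)
    jac = []
    for i in range(len(params)):
        probed = list(params)
        probed[i] += eps
        jac.append((response(probed) - base) / eps)
    return jac


def target_referenced_step(response, params, target, eps=1e-6):
    """One pseudoinverse (Gauss-Newton style) update for a scalar output."""
    jac = finite_difference_jacobian(response, params, eps)
    error = target - response(params)  # target-referenced contrast
    norm_sq = sum(j * j for j in jac)
    if norm_sq == 0.0:
        return list(params)  # no measurable sensitivity, leave parameters as-is
    # pinv(J) * error for a row-vector Jacobian J
    return [p + j * error / norm_sq for p, j in zip(params, jac)]


# Hypothetical toy system: the measured output is linear in the parameters.
inputs = [1.0, 2.0, -1.0]


def toy_response(params):
    return sum(p * x for p, x in zip(params, inputs))


params = [0.0, 0.0, 0.0]
updated = target_referenced_step(toy_response, params, target=3.0)
```

Because the toy response is linear in its parameters, a single step lands on the target; a nonlinear physical response would generally need repeated steps.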
Two physical platforms are demonstrated: spring networks classifying the Iris dataset via bond stiffness updates, and continuous-variable (CV) photonic circuits using x-quadrature measurements. Additionally, a linear optical multiplier is presented as a step toward autonomous physical learners where gradient computation migrates into the physical substrate.
The theoretical framework is presented with reasonable formalism, distinguishing the two modes and connecting them to Jacobian-based update rules. However, several concerns emerge:
Spring network experiments: The Iris classification task is extremely simple (150 samples, 4 features, 3 classes), and the classification strategies (Cases 1-3) involve increasing amounts of hand-engineered decision boundaries (adaptive thresholds, hierarchical rules). The claim of 100% test accuracy for Case 2 is appropriately caveated as a "finite-sample geometric effect," but it remains unclear how much learning the spring network actually performs versus how much is accomplished by the post-hoc classification rule design. The 50/50 train-test split with specific partitions showing perfect accuracy is not convincing without cross-validation.
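The cross-validation requested above could look like the following sketch, which builds stratified folds by hand using only the standard library. The helper `stratified_k_fold` and the choice of five folds are assumptions made for illustration; the only dataset fact used is that Iris contains 50 samples in each of its 3 classes.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs that preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal each class's shuffled indices round-robin into the k folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Iris-shaped label vector: 50 samples per class, 3 classes.
labels = [0] * 50 + [1] * 50 + [2] * 50
splits = list(stratified_k_fold(labels, k=5))
```

Reporting the mean and spread of test accuracy over such folds, rather than a few hand-picked partitions, would help separate what the network learns from finite-sample effects.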
Photonic circuit experiments: These are simulations of Gaussian-state evolution, not physical experiments. The linear optical circuit essentially implements a linear classifier (⟨x̂₀⟩ = W·x), with nonlinearity entering only through the update dynamics. The 97.7% accuracy on Iris is reasonable but unremarkable for what is effectively a well-optimized linear classifier. The ablation study (Table II) showing identical performance across all four configurations (CV vs. classical) is presented as validation but actually undercuts the motivation: if the CV components offer zero advantage, the physical implementation adds complexity without benefit for this task.
The pseudoinverse update (Mode B) is well characterized, and the comparison with gradient descent (Appendix A), which shows robustness to learning rate and ill-conditioning, is a useful practical observation, although pseudoinverse methods are already well established in machine learning.
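The learning-rate and conditioning point can be illustrated with a toy problem. The sketch below is not the Appendix A experiment: it assumes a hypothetical two-parameter linear system with a diagonal Jacobian whose entries differ by a factor of 100, so the pseudoinverse is simply the elementwise inverse.

```python
# Hypothetical ill-conditioned linear system: y_i = scales[i] * w_i.
scales = [1.0, 100.0]
targets = [1.0, 1.0]


def loss(w):
    """Half the squared error between targets and outputs."""
    return 0.5 * sum((t - s * wi) ** 2 for wi, s, t in zip(w, scales, targets))


def gradient_step(w, lr):
    """Plain gradient descent on the squared error."""
    return [wi + lr * s * (t - s * wi) for wi, s, t in zip(w, scales, targets)]


def pinv_step(w):
    """Pseudoinverse update; for a diagonal Jacobian this divides by each scale."""
    return [wi + (t - s * wi) / s for wi, s, t in zip(w, scales, targets)]


def run_gradient_descent(lr, steps=100):
    w = [0.0, 0.0]
    for _ in range(steps):
        w = gradient_step(w, lr)
    return w


w_small = run_gradient_descent(lr=1e-4)  # stable, but the weak direction barely moves
w_large = run_gradient_descent(lr=1e-3)  # unstable in the stiff direction
w_pinv = pinv_step([0.0, 0.0])           # one step, no learning rate to tune

print(loss(w_small), loss(w_large), loss(w_pinv))
```

Gradient descent is stable here only when the learning rate is below 2 divided by the largest squared scale, which forces slow progress along the weakly coupled direction. The pseudoinverse step rescales each direction by its own sensitivity, which is the standard reason it is less sensitive to conditioning.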
The paper's ambition to create a unifying framework for physical learning is laudable and addresses a genuine need as the field of physical neural networks grows. However, the actual demonstrations are quite limited.
The most promising direction is the connection to CV quantum photonics, where the formalism could potentially extend to quantum learning settings. The paper correctly identifies this as future work. The framework could influence how researchers think about designing physical learning systems, but the practical impact depends on scaling beyond toy problems and moving to experimental demonstrations.
Physical learning is a timely and growing area, with recent Nature publications (Momeni et al., 2025; Wright et al., 2022) and active theoretical development. The desire to unify various physical learning approaches (EP, Coupled Learning, Frequency Propagation) under one umbrella is well-motivated. However, other recent works have also attempted such unification. The photonic angle connects to the active area of photonic computing, though the Gaussian-state restriction limits the quantum advantage narrative.
The paper's writing is generally clear but verbose, with significant space devoted to notation and formalism that could be condensed. The appendices are useful, but the ablation in Table II (showing identical results across all configurations) inadvertently weakens the case for the CV photonic implementation. The DSR squeezing multiplier (Appendix C) shows inconsistent performance, highlighting practical challenges that are somewhat glossed over in the main text.
Generated Jun 9, 2026
Paper 1 addresses a highly practical and timely problem in LLM efficiency—causal attention approximation with formal guarantees—and demonstrates concrete speedups over the widely-used FlashAttention 2 across multiple real-world bottlenecks (prefill, KV cache, decoding). Given the massive scale of current LLM deployment, improvements here have immediate and broad impact. Paper 2 presents a theoretically interesting unification of physical learning frameworks, but its demonstrations remain limited to small-scale toy tasks (spring networks, simple classification), and the path to practical impact in physical computing hardware is still long-term and uncertain.
Paper 2 is likely higher impact: it offers a clear, broadly applicable theoretical framework (phase diagram) that unifies and explains when two dominant multimodal paradigms succeed or fail, plus a practical dataset-diagnosis procedure validated across multiple domains with released code. This combination of timeliness (multimodal surge), methodological rigor (derivations under a defined model), and immediate practitioner utility suggests wide adoption across ML and scientific applications. Paper 1 is innovative and potentially transformative for physical learning hardware, but its impact may be narrower and longer-term, hinging on experimental scalability and platform adoption.
While Paper 1 offers a valuable efficiency improvement for LLM reinforcement learning, Paper 2 proposes a foundational paradigm shift by unifying physical learning mechanisms. By enabling gradient-free learning directly in physical systems (like photonics and mechanical networks), Paper 2 bridges physics and machine learning, promising profound long-term impacts on the development of neuromorphic hardware and analog computing.
Paper 1 is more scientifically novel and broadly impactful: it proposes a unifying framework (PCPL) connecting multiple physical-learning paradigms and demonstrates learning in distinct physical substrates (mechanical networks, photonic circuits), suggesting new routes for autonomous, hardware-native learning with implications across physics, materials, photonics, and neuromorphic/analog computing. Paper 2 is timely and practically valuable for LLM post-training, but its core contribution (token-mapping to enable OPD across tokenizers) is more incremental and likely narrower in cross-field scientific reach despite strong applied relevance.
Paper 1 is a comprehensive review article proposing a unifying framework (REO) and phase diagram for data-driven differential equation discovery—a rapidly growing field at the intersection of AI and scientific discovery. Its broad scope, organizational contribution, and relevance to multiple scientific domains give it high citation potential and wide influence. Paper 2, while novel in introducing PCPL as a framework for physical learning, addresses a more niche topic (physical computing/neuromorphic systems) with demonstrations limited to relatively simple tasks. The review's ability to shape an entire field's research directions gives it greater estimated impact.
Paper 2 introduces a broad, general framework for physical learning that bridges machine learning and physical hardware (analog computing, photonics). Its ability to enable learning without explicit backpropagation offers significant real-world impact for future AI hardware. Paper 1, while methodologically rigorous and valuable for generative modeling theory, is more narrowly focused on the specific problem of size extrapolation in score-based models.
Paper 2 likely has higher impact: it proposes a general, experimentally grounded learning framework (PCPL) that unifies prior physics-based learning methods and enables training without centralized gradients/backprop, aligning with major trends in energy-efficient/neuromorphic and physical AI. It demonstrates on two distinct hardware platforms (mechanical networks, photonics) with concrete tasks, suggesting near-term real-world applicability and cross-field relevance (physics, ML, photonics, robotics). Paper 1 is a compelling agenda for PDE solvers but reads more conceptual and may require substantial follow-up for validation and adoption.
PCPL introduces a unifying framework for physical learning that spans multiple physical platforms (mechanical and photonic), connecting and generalizing prior approaches like Equilibrium Propagation and Frequency Propagation. It addresses the timely and broadly impactful challenge of training physical systems without backpropagation, with implications for neuromorphic computing, analog AI hardware, and autonomous learning systems. Paper 2 contributes a novel use of HRR for disentanglement with solid theoretical grounding, but addresses a narrower problem within representation learning. PCPL's cross-disciplinary reach (physics, ML, photonics, materials) and hardware relevance give it higher potential impact.
Paper 2 has higher impact potential: it proposes a broad, unifying framework (PCPL) for learning via physical perturbation contrasts, extending multiple prior paradigms and enabling learning without explicit backprop/centralized gradients. It demonstrates feasibility across distinct hardware platforms (mechanical spring networks and photonic circuits) and tasks (classification, analog multiplication), suggesting real-world relevance for neuromorphic/physical AI and inverse problems. Paper 1 is useful and timely for privacy accounting practice, but is primarily a parameter-mapping/standardization contribution with narrower cross-field reach and less conceptual novelty.
Paper 1 offers a novel, unifying framework (PCPL) that generalizes prior physical learning schemes and demonstrates it on two distinct physical platforms (mechanical spring networks and photonic circuits), including both ML tasks and an analog computation primitive. This combination of conceptual innovation, methodological development, and cross-domain applicability (physics, neuromorphic/analog computing, photonics, learning theory) suggests broader and longer-term impact. Paper 2 addresses an important applied issue (calibration in probabilistic price forecasting) but appears more problem-framing than method-advancing, with narrower cross-field reach.