Gianluca Scarpellini, Ron Shprints, Peter Holderrieth, Juno Nam, Pranav Murugan, Rafael Gómez-Bombarelli, Tommi Jaakkola, Maruan Al-Shedivat
All-atom generative modeling of 3D biomolecular complexes has emerged as the dominant paradigm for predicting the structure of proteins and protein-ligand systems. Generating structures at the atomic level of fidelity, however, typically requires expensive iterative diffusion rollouts, making both conventional deployment and inference-time search techniques computationally costly. In this paper, we introduce the Denoiser Cofolding All-Atom Flowmap (DeCAF) framework for distilling state-of-the-art all-atom cofolding models into all-atom flow maps that produce high-quality samples in only a few inference steps. We build DeCAF on a denoiser-based formulation of flow maps with endpoint losses that naturally support SE(3) rigid alignment, which we show is critical for training accurate models. We further derive a simple change of variables that lets DeCAF operate in the σ-space noise schedule of EDM-style architectures, enabling direct distillation from pretrained cofolding diffusion models. Equipped with DeCAF's flowmap lookahead, we introduce a purpose-built inference-time framework that improves sampling through reward-guided search. Empirically, DeCAF-Boltz achieves statistically significant improvements over Boltz-1x in both accuracy (RMSD) and physical validity scores of protein-ligand poses at strict NFE budgets on the challenging Runs N' Poses benchmark, while also showing a better Pareto frontier across all inference compute budgets on PoseBusters. Distilling the state-of-the-art Pearl cofolding model, DeCAF-Pearl outperforms diffusion-based cofolding models and matches its teacher on success rate while using 5x fewer NFEs. We release our code at https://github.com/genesistherapeutics/decaf.
DeCAF introduces a framework for distilling pretrained all-atom biomolecular cofolding diffusion models (specifically Boltz-1 and Pearl) into flow maps that generate high-quality protein-ligand structures in dramatically fewer inference steps (5-20× reduction in neural function evaluations). The paper makes three intertwined contributions:
First, a σ-space reparameterization of flow maps that eliminates the numerical instability arising from the chain rule through EDM noise schedules (avoiding the problematic ∂σ/∂t factor). This is paired with a denoiser-based parameterization that enables SE(3) rigid alignment via the Kabsch algorithm on predicted endpoints. An ablation (Table 3) shows this choice is critical for training stability: the velocity and consistency distillation parameterizations fail catastrophically.
Second, DeCAF-SEARCH, an inference-time search framework that exploits the flow map's lookahead capability to evaluate terminal rewards (physical validity) on clean-space predictions rather than noisy intermediate states, unifying FK-steering, SMC resampling, and MCTS-style exploration under one umbrella.
Third, empirical validation on two challenging benchmarks (Runs N' Poses and PoseBusters) showing that DeCAF-Boltz matches full-budget Boltz-1x (600 NFE) with 20× fewer evaluations, and DeCAF-Pearl matches Pearl's success rate with 5× fewer NFEs.
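The lookahead idea in the second contribution can be illustrated with a minimal sketch of one SMC-style resampling step. This is not DeCAF-SEARCH itself: the function name `resample_by_lookahead`, the `lookahead` and `reward` callables, and the `temperature` parameter are all hypothetical stand-ins. The only point it shows is that each noisy particle is scored through its predicted clean endpoint rather than through the noisy state.

```python
import math
import random


def resample_by_lookahead(particles, lookahead, reward, temperature=1.0, rng=None):
    """One multinomial resampling step scored on lookahead predictions.

    particles:   list of noisy intermediate states (any Python objects)
    lookahead:   hypothetical callable mapping a noisy state to a clean-space
                 prediction (in DeCAF this role is played by the flow map)
    reward:      hypothetical callable scoring a clean-space prediction
    temperature: assumed softmax temperature; lower means greedier selection
    """
    rng = rng or random.Random(0)  # fixed seed by default, for reproducibility
    # Score every particle on its predicted clean endpoint, not the noisy state.
    rewards = [reward(lookahead(p)) for p in particles]
    # Softmax weights; subtracting the max keeps the exponentials bounded by 1.
    best = max(rewards)
    weights = [math.exp((r - best) / temperature) for r in rewards]
    # Draw a new population of the same size, with replacement.
    return rng.choices(particles, weights=weights, k=len(particles))


# Toy 1-D usage: the "flow map" halves the state, the reward prefers endpoints near 1.
toy_particles = [-4.0, 0.0, 2.0, 6.0]
toy_lookahead = lambda x: 0.5 * x
toy_reward = lambda clean: -(clean - 1.0) ** 2
survivors = resample_by_lookahead(toy_particles, toy_lookahead, toy_reward, temperature=0.01)
```

With a low temperature the population collapses onto the particle whose lookahead endpoint scores best; a higher temperature keeps more diversity, which is the usual trade-off in particle-based steering.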
The technical derivations are sound. The σ-space reformulation (Eq. 6-7) is a clean change of variables, and the connection between the two-time denoiser and the mean-flow/Eulerian objective (Eq. 8-10) is mathematically well-grounded. The endpoint loss with SE(3) alignment (Eq. 11) is a natural extension of standard practices in biomolecular diffusion models.
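For readers unfamiliar with alignment-based endpoint losses, the following is a minimal NumPy sketch of a Kabsch-aligned mean squared error between two point clouds. It is an illustration under stated assumptions, not the paper's Eq. 11: the function names are hypothetical, all atoms are weighted equally, and the reference is aligned onto the prediction (the direction used in the paper is not stated in this review).

```python
import numpy as np


def kabsch_align(mobile, fixed):
    """Rigidly superimpose `mobile` onto `fixed` (both arrays of shape (N, 3))."""
    mobile_centered = mobile - mobile.mean(axis=0)
    fixed_mean = fixed.mean(axis=0)
    fixed_centered = fixed - fixed_mean
    # 3x3 cross-covariance between the two centered point sets.
    H = mobile_centered.T @ fixed_centered
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so the result is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate the centered points, then move them to the centroid of `fixed`.
    return mobile_centered @ R.T + fixed_mean


def aligned_endpoint_mse(pred_endpoint, ref_endpoint):
    """Mean squared distance after aligning the reference onto the prediction."""
    aligned_ref = kabsch_align(ref_endpoint, pred_endpoint)
    return float(np.mean(np.sum((pred_endpoint - aligned_ref) ** 2, axis=-1)))
```

Because the loss is computed after optimal superposition, two structures that differ only by a rotation and translation score (numerically) zero, which is the invariance an endpoint loss on 3D coordinates is meant to have.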
The experimental design is thorough.
One limitation is that the training compute (100 epochs on 64 H200 GPUs) is substantial, though this is a one-time cost. The paper does not report training time explicitly, making cost-benefit analysis difficult.
The practical impact is significant across several dimensions:
Drug discovery workflows: A 5-20× reduction in inference cost for cofolding directly enables virtual screening of larger ligand libraries against protein targets—a key bottleneck in computational drug discovery. The paper explicitly notes this (Section 4.1), and the numbers are compelling enough to change deployment practices.
Synthetic data generation: Faster cofolding enables generating orders of magnitude more protein-ligand complexes for training downstream scoring and affinity models, addressing a critical data bottleneck.
Inference-time search: The DeCAF-SEARCH framework provides a principled way to combine flow map lookahead with reward-guided search, which could generalize beyond cofolding to other structured prediction tasks. The observation that different search strategies (FK, MC-GRAD, MCTS) are optimal at different compute budgets (Figure 4) provides actionable guidance.
Broader methodological impact: The σ-space reparameterization for EDM-style architectures is general and could facilitate flow map distillation in other EDM-based domains. The denoiser parametrization insight (that velocity-based losses are incompatible with SE(3) alignment because subtracting translation loses a degree of freedom) is a non-obvious but important technical contribution.
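The ∂σ/∂t factor behind the σ-space reparameterization can be seen from a generic chain-rule identity. The notation below is assumed for illustration only (a flow map X indexed by times s and t, the same map X̃ indexed by noise levels, and a differentiable schedule σ(t)); it is not a reproduction of the paper's Eq. 6-7.

```latex
% Assumed notation: X_{s,t} is a flow map indexed by times (s, t);
% \tilde{X}_{\sigma_s,\sigma_t} is the same map indexed by noise levels.
X_{s,t}(x) = \tilde{X}_{\sigma(s),\,\sigma(t)}(x)
\quad\Longrightarrow\quad
\frac{\partial}{\partial t} X_{s,t}(x)
  = \frac{d\sigma}{dt}(t)\,
    \left.\frac{\partial}{\partial \sigma_t}
    \tilde{X}_{\sigma_s,\sigma_t}(x)\right|_{\sigma_s=\sigma(s),\;\sigma_t=\sigma(t)}
```

An objective that differentiates in t inherits the schedule derivative dσ/dt as a multiplicative factor, whereas one that differentiates in σ_t directly keeps only the second factor. This is the sense in which working in σ-space sidesteps the term the review identifies as numerically problematic.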
This paper addresses an acute bottleneck. The all-atom cofolding paradigm (AF3, Boltz, Chai, Pearl) has become the dominant approach for biomolecular structure prediction, but inference costs of O(200) NFEs per sample remain prohibitive for production-scale applications. The recent explosion of inference-time scaling methods (FK steering, MCTS) compounds this cost. DeCAF directly targets this pain point at precisely the right moment—when the community is transitioning from "can we predict structures?" to "can we predict them efficiently enough for real-world deployment?"
The concurrent work DCFold (closed-source) validates the importance of this problem, but DeCAF's open-source release and applicability to multiple teacher models (Boltz-1, Pearl) give it broader reach.
DeCAF represents a well-executed and timely contribution that solves a genuine computational bottleneck in biomolecular structure prediction. The technical innovations (σ-space reparameterization, denoiser-based SE(3)-compatible flow maps) are clean and well-validated, and the empirical results are strong, statistically rigorous, and practically meaningful. The framework's generality across teacher models and the open-source release amplify its potential impact.
Generated Jun 9, 2026
Paper 1 has higher potential scientific impact. It advances all-atom biomolecular complex generation by distilling expensive diffusion cofolding into few-step flow maps, adding SE(3)-aware training and an EDM-noise change-of-variables, plus reward-guided search. The method is both novel and rigorous, shows strong empirical gains on challenging structural benchmarks, and targets high-value real-world applications (drug discovery, protein–ligand docking) where compute cost is a major bottleneck. While Paper 2 is timely and useful for LLM serving efficiency, its impact is narrower and primarily incremental for deployment speed/quality tradeoffs.
Paper 2 addresses a critical bottleneck in computational biology and drug discovery by significantly reducing the computational cost of all-atom generative modeling for protein-ligand complexes. Achieving state-of-the-art accuracy with 5x fewer inference steps enables scalable deployment and accelerates biological research. While Paper 1 offers a valuable efficiency improvement for LLM agents, Paper 2's direct application to accelerating biomolecular structure prediction promises broader and more immediate real-world scientific impact across life sciences and medicine.
Paper 2 introduces a foundational framework for training physical systems without centralized backpropagation, bridging physics and AI. This offers transformative potential for developing efficient, autonomous neuromorphic hardware. While Paper 1 provides a highly valuable and practical algorithmic acceleration for molecular modeling, Paper 2's theoretical innovation and potential to fundamentally change how physical AI systems are trained give it a broader, longer-term scientific impact across multiple disciplines.
Paper 1 addresses a critical computational bottleneck in biomolecular generative modeling. By achieving a 5x speedup in protein-ligand cofolding without sacrificing accuracy, it has immediate, high-value applications in drug discovery and structural biology. While Paper 2 offers strong theoretical insights for continual learning, Paper 1's empirical breakthrough in a highly impactful applied field gives it a more tangible and immediate scientific and real-world impact.
Paper 1 likely has higher scientific impact: it advances all-atom biomolecular generative modeling by distilling diffusion cofolding into few-step flow maps, reducing inference cost while maintaining (and sometimes improving) accuracy/validity. This directly enables broader deployment and more powerful inference-time search in drug discovery and structural biology—high-value real-world applications with cross-field relevance (ML, chemistry, biophysics). The methodological contributions (SE(3)-aware endpoint losses, σ-space change of variables, reward-guided sampling) are substantial and timely given the centrality of diffusion-based structure modeling. Paper 2 is impactful for LLM efficiency, but is more incremental within KV-compression literature.
Paper 1 (DeCAF) addresses a critical computational bottleneck in protein-ligand structure prediction—a high-impact application in drug discovery. It demonstrates practical improvements over state-of-the-art models (Boltz-1, Pearl) with 5x fewer inference steps, combining flow map distillation with inference-time search. The immediate applicability to drug design and structural biology gives it broader real-world impact. Paper 2 offers interesting theoretical insights on sign lock-in during training and sub-bit compression, but addresses a narrower problem with less immediate practical significance compared to accelerating biomolecular structure prediction.
Paper 2 is likely to have higher scientific impact because it proposes a general, information-theoretic definition of “open-endedness” (bit-equivalent) and connects it to provable growth conditions and algorithms. This is a foundational contribution that can influence multiple areas (RL, exploration, lifelong learning, AI safety/AGI discussions) and is timely given current interest in open-ended agents. Paper 1 is strong and practically valuable for biomolecular modeling efficiency, but it is a more specialized, incremental/distillation-focused advance within an already fast-moving application domain.
Paper 2 addresses a critical bottleneck in computational structural biology—the high cost of diffusion-based all-atom biomolecular structure prediction—with a principled distillation framework (DeCAF) that achieves comparable or better accuracy in far fewer inference steps. This has immediate, high-impact applications in drug discovery and protein engineering, fields with enormous practical significance. While Paper 1 presents a creative frequency-domain approach to cross-scale knowledge transfer, its contributions are more incremental within the well-explored transfer learning space. Paper 2's novel SE(3)-aware flow map distillation and inference-time search framework represent deeper methodological contributions with broader real-world impact.
While Paper 1 offers impressive computational speedups for AI model merging, Paper 2 tackles the critical bottleneck of inference speed in 3D biomolecular complex generation. Accelerating all-atom cofolding by 5x while maintaining or improving accuracy directly impacts real-world drug discovery and computational biology. The ability to perform rapid inference-time search in structural biology has profound implications for biotechnology and medicine, giving it a slightly higher potential for broader scientific and societal impact.
Paper 2 (DeCAF) addresses a fundamental computational bottleneck in biomolecular structure prediction—expensive diffusion rollouts—by introducing a novel flow map distillation framework that achieves comparable or better accuracy with 5x fewer inference steps. This has immediate, high-impact applications in drug discovery and protein engineering. The methodological contributions (SE(3)-equivariant flow maps, reward-guided inference-time search) are broadly applicable across generative modeling. Paper 1 addresses an important but narrower problem (LLM judge bias) with a useful but more incremental contribution. Paper 2's combination of methodological novelty, practical speedups, and applicability to drug discovery gives it broader and deeper scientific impact.