Andi Gu, J. Pablo Bonilla Ataides, Mikhail D. Lukin, Susanne F. Yelin
Quantum error correction (QEC) is essential for scalable quantum computing. However, it requires classical decoders that are fast and accurate enough to keep pace with quantum hardware. While quantum low-density parity-check codes have recently emerged as a promising route to efficient fault tolerance, current decoding algorithms do not allow one to realize the full potential of these codes in practical settings. Here, we introduce a convolutional neural network decoder that exploits the geometric structure of QEC codes, and use it to probe a novel "waterfall" regime of error suppression, demonstrating that the logical error rates required for large-scale fault-tolerant algorithms are attainable with modest code sizes at current physical error rates, and with latencies within the real-time budgets of several leading hardware platforms. For example, for the Gross code, the decoder achieves logical error rates up to ~17× below existing decoders, reaching logical error rates of ~10^{-10}, with 3-5 orders of magnitude higher throughput. This decoder also produces well-calibrated confidence estimates that can significantly reduce the time overhead of repeat-until-success protocols. Taken together, these results suggest that the space-time costs associated with fault-tolerant quantum computation may be significantly lower than previously anticipated.
This paper introduces "Cascade," a convolutional neural network decoder for quantum error correction (QEC) that exploits three geometric properties of stabilizer codes: locality, translation equivariance, and anisotropy. The decoder is applied to both surface codes and bivariate bicycle (BB) quantum LDPC codes under circuit-level noise. The central finding is twofold: (1) the decoder achieves logical error rates dramatically lower than existing practical decoders (up to ~17× below the best prior results on the [[144,12,12]] Gross code), and (2) it reveals a "waterfall" regime of error suppression in quantum codes, analogous to a well-known phenomenon in classical LDPC codes, in which error rates drop far more steeply than the standard distance-based scaling predicts. For the Gross code, the logical error rate decomposes into ~p^{10.8} (waterfall) and ~p^{6.4} (distance-limited floor) regimes, compared to ~p^{5.4} for BP+OSD, which never accesses the steep regime.
The experimental methodology is thorough. The paper evaluates performance across multiple code families (surface codes at distances 7–19, three BB codes), multiple noise levels spanning seven orders of magnitude in logical error rate, and multiple decoder baselines (BP+OSD, Relay, Tesseract, MWPM variants). Circuit-level depolarizing noise is simulated via Stim, the standard tool in the field. The architectural ablation study (Extended Data Fig. 2) convincingly demonstrates that all three geometric inductive biases contribute to performance, with convolution outperforming both local and global attention at fixed compute budget. The capacity scaling study (Fig. 4) reveals a sharp transition where models below H≈64 underperform even simple MWPM, while larger models approach optimal performance—providing insight into why existing decoders fail.
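To make the translation-equivariance bias concrete, here is a minimal sketch on a toy periodic 1D lattice. The syndrome bits and filter weights are hypothetical, and this is not the Cascade architecture; it only checks that a local convolutional filter commutes with lattice shifts, which is the property the ablation credits.

```python
def circular_conv(x, kernel):
    """Circular cross-correlation: y[i] = sum_k kernel[k] * x[(i + k) mod n]."""
    n = len(x)
    return [sum(w * x[(i + k) % n] for k, w in enumerate(kernel)) for i in range(n)]


def shift(x, s):
    """Cyclically shift x by s positions to the right."""
    n = len(x)
    return [x[(i - s) % n] for i in range(n)]


# Hypothetical syndrome bits on a periodic lattice (illustrative only).
syndrome = [0, 1, 0, 0, 1, 1, 0, 0]
# Hypothetical local filter; integer weights keep the comparison exact.
kernel = [2, -1, 3]

shift_then_filter = circular_conv(shift(syndrome, 3), kernel)
filter_then_shift = shift(circular_conv(syndrome, kernel), 3)
equivariant = shift_then_filter == filter_then_shift
print(equivariant)
```

Each output site depends only on three neighbouring inputs (locality), and the same weights are reused at every site, so shifting the input shifts the output by the same amount.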
One potential concern is the absence of formal guarantees: unlike matching decoders that provably correct all errors of weight up to ⌊(d-1)/2⌋, the neural decoder offers no such guarantee. The authors address this by demonstrating no error floor down to P_L ≈ 2×10^{-11}, which is strong empirical evidence but not a proof. The claim that the waterfall is a property of the codes rather than an artifact of the decoder is supported by the two-term decomposition P_L ≈ Σ_w N(w)p^w, but the paper does not independently enumerate the failure modes to verify the extracted exponents; doing so would strengthen the argument considerably.
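The two-regime picture can be illustrated with a small sketch of a two-term power-law model. The exponents 10.8 and 6.4 are the ones quoted in this review; the prefactors and the crossover point `P_STAR` are hypothetical values chosen only for illustration, not numbers from the paper.

```python
A_EXP, B_EXP = 10.8, 6.4   # exponents quoted in the review for the Gross code
P_STAR = 1e-3              # hypothetical crossover point (assumption)
B = 1.0                    # hypothetical prefactor of the floor term (assumption)
A = B / P_STAR ** (A_EXP - B_EXP)  # makes both terms equal at P_STAR


def logical_error_rate(p):
    """Two-term model P_L(p) = A * p**A_EXP + B * p**B_EXP."""
    return A * p ** A_EXP + B * p ** B_EXP


def local_exponent(p):
    """Slope d(log P_L) / d(log p), i.e. the slope on a log-log plot."""
    steep = A * p ** A_EXP
    floor = B * p ** B_EXP
    return (A_EXP * steep + B_EXP * floor) / (steep + floor)


for p in (1e-4, 1e-3, 1e-2):
    print(f"p={p:.0e}  P_L={logical_error_rate(p):.3e}  slope={local_exponent(p):.2f}")
```

In this model the local slope approaches 6.4 well below the crossover and 10.8 well above it, and equals the midpoint 8.6 exactly at the crossover. A fit restricted to one side of the crossover would therefore see only one of the two exponents.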
Training at a single high noise level with generalization across seven orders of magnitude is remarkable and practically important, though the mechanism (implicit noise-level inference from syndrome structure) remains speculative. The calibration analysis (Fig. 5c) is convincing, with reliability diagrams showing near-perfect calibration across noise levels far from training.
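For readers unfamiliar with reliability diagrams, the following sketch shows one common way to summarise calibration: the expected calibration error (ECE). The paper's own metric is not specified in this review, so treating ECE as the summary is an assumption, and the sample confidences below are hypothetical.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin shots by confidence and average |accuracy - mean confidence|,
    weighting each bin by the fraction of shots it contains."""
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[idx].append((c, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(ok for _, ok in members) / len(members)
        ece += len(members) / total * abs(accuracy - mean_conf)
    return ece


# Hypothetical calibrated decoder: 75% confidence, correct 3 times out of 4.
calibrated = expected_calibration_error([0.75] * 4, [1, 1, 1, 0])
# Hypothetical overconfident decoder: 95% confidence, correct 2 times out of 4.
overconfident = expected_calibration_error([0.95] * 4, [1, 1, 0, 0])
print(calibrated, overconfident)
```

A perfectly calibrated decoder scores zero; the overconfident example scores the gap between its stated confidence (0.95) and its actual accuracy (0.5).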
The implications are substantial across several dimensions:
Resource estimation: If the waterfall effect is genuine and general, current resource estimates for fault-tolerant quantum computation (e.g., Gidney & Ekerå's RSA factoring estimates) are overly conservative. The paper demonstrates a ~40% reduction in physical qubit count for surface codes at a target logical error rate of 10^{-9} (d=15 vs. d=19). For quantum LDPC codes, the implications are even more dramatic: the decoder reaches P_L ~10^{-10} with only 144 data qubits encoding 12 logical qubits.
Practical deployment: The decoder achieves amortized latencies of ~0.4–40 μs on NVIDIA H200 GPUs, placing it within the decoding budget for trapped-ion and neutral-atom platforms. The FPGA roofline analysis suggests sub-microsecond latency may be achievable for superconducting platforms, though this remains a projection.
Confidence-aware decoding: The well-calibrated probability estimates enable post-selection that reduces time overhead of repeat-until-success protocols by ~20× compared to cluster-based methods, directly impacting magic state distillation costs.
Code design: The finding that distance alone is an incomplete predictor of practical performance could redirect code design toward optimizing the weight distribution of failure modes, not just minimum distance.
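The qubit savings in the resource-estimation point above can be checked with a short calculation. This sketch assumes the rotated surface code layout with d² data qubits and d² − 1 measurement qubits per patch; the paper may count qubits differently, so the figure is an approximation of the quoted ~40%.

```python
def rotated_surface_code_qubits(d):
    """Data plus measurement qubits for one rotated surface code patch
    (assumed layout): d*d data qubits and d*d - 1 measurement qubits."""
    return 2 * d * d - 1


n15 = rotated_surface_code_qubits(15)  # 449
n19 = rotated_surface_code_qubits(19)  # 721
reduction = 1 - n15 / n19              # fraction of qubits saved by using d=15

# Gross code [[144,12,12]]: 144 data qubits; assuming one check qubit per
# stabilizer generator (144 checks), the patch uses 288 qubits in total.
gross_total = 144 + 144
gross_per_logical = gross_total / 12

print(n15, n19, f"{reduction:.1%}", gross_per_logical)
```

Under these assumptions the reduction comes out just under 38%, consistent with the rounded figure in the review, and the Gross code uses 24 physical qubits per logical qubit.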
This work is exceptionally timely. Multiple hardware platforms have recently reached ~0.1% entangling error rates—precisely the regime where the waterfall effect becomes relevant. Quantum LDPC codes are a hot topic following IBM's demonstration of BB codes and theoretical breakthroughs in asymptotically good codes. The decoding bottleneck for these codes is widely recognized as the key obstacle to their practical deployment; this paper directly addresses it.
1. Unified framework: The same architecture achieves near-optimal results on fundamentally different code families (surface codes and BB codes) that historically required different decoding strategies.
2. Practical focus: The paper addresses both accuracy and speed, with concrete latency measurements and hardware deployment analysis.
3. Discovery of waterfall regime: Identifying and characterizing this phenomenon in quantum codes is a conceptual contribution beyond the specific decoder, changing how the community should think about error suppression scaling.
4. Generalization: Training at one noise level and generalizing across seven orders of magnitude is practically essential and technically impressive.
5. Calibration: Well-calibrated confidence estimates enable confidence-aware protocols with quantified benefits.
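How calibrated confidences feed a repeat-until-success protocol can be sketched as follows. The shot data and the threshold are hypothetical, and the expected-attempts formula assumes independent attempts; it is not the paper's protocol.

```python
def post_select(confidences, correct, threshold):
    """Keep shots whose confidence meets the threshold and report
    (acceptance rate, residual error rate, expected attempts per accepted shot)."""
    kept = [ok for c, ok in zip(confidences, correct) if c >= threshold]
    acceptance = len(kept) / len(confidences)
    error_rate = 1 - sum(kept) / len(kept) if kept else float("nan")
    # With independent attempts, the number of tries until acceptance is geometric.
    expected_attempts = 1 / acceptance if acceptance > 0 else float("inf")
    return acceptance, error_rate, expected_attempts


# Hypothetical decoder outputs: (confidence, 1 if the decoding was correct).
confidences = [0.99, 0.98, 0.60, 0.97, 0.55, 0.99, 0.70, 0.96]
correct = [1, 1, 0, 1, 1, 1, 0, 1]

baseline_error = 1 - sum(correct) / len(correct)
acceptance, error_rate, attempts = post_select(confidences, correct, threshold=0.9)
print(baseline_error, acceptance, error_rate, attempts)
```

In this toy data set, discarding low-confidence shots removes both failures at the cost of 1.6 attempts per accepted shot. The better calibrated the confidences, the more reliably a threshold of this kind trades acceptance rate against residual error.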
1. No formal correctness guarantees: The empirical absence of error floors is encouraging but not a proof; adversarial or rare failure modes could emerge at even lower error rates.
2. GPU latency insufficient for superconducting qubits: The ~1 μs budget requires FPGA/ASIC implementation that is projected but not demonstrated.
3. Limited code families tested: Only surface codes and BB codes are evaluated; claims of generality to other qLDPC families (lifted-product, Kasai codes) remain untested.
4. Independent verification of waterfall: The two-regime decomposition is inferred from fitting, but independent enumeration of failure modes would strengthen causality claims.
5. Training cost: While inference is fast, training requires up to 200 GPU-hours for the largest models, with separate models needed per code configuration.
6. Comparison fairness: Reference decoders run single-threaded on CPU whereas Cascade runs on GPU; the paper acknowledges this, but throughput comparisons should be interpreted carefully.
This is a high-impact paper that simultaneously advances decoder performance, reveals new physics of error suppression in quantum codes, and provides a practical path toward real-time decoding for quantum LDPC codes. The waterfall discovery alone could reshape resource estimation for fault-tolerant quantum computation. The combination of accuracy, speed, calibration, and generalization makes this one of the most practically consequential contributions to quantum error correction decoding in recent years.
Generated Apr 10, 2026
Paper 2 addresses a fundamental theoretical limitation in quantum algorithm design by extending QSP/QSVT to arbitrary non-Hermitian, non-diagonalizable matrices. This fills a significant gap in the quantum algorithms framework, with broad implications across quantum computing - affecting algorithm design for linear algebra, differential equations, and many other applications. While Paper 1 makes important practical contributions to quantum error correction decoding with impressive performance gains, Paper 2's foundational extension of the QSVT framework has broader transformative potential across the entire field of quantum algorithms.
Paper 2 likely has higher scientific impact due to greater conceptual novelty and breadth: it extends the influential QSP/QSVT framework to arbitrary (including non-Hermitian, non-diagonalizable) matrices via n-regular block encodings, potentially unlocking new classes of quantum algorithms across simulation, linear algebra, control, and beyond. The result is broadly reusable theory with clear methodological rigor (definitions, equivalences, resource bounds, Jordan-form handling). Paper 1 is highly application-relevant for near-term fault tolerance, but its impact is more domain-specific and depends strongly on training/generalization and hardware-specific deployment.
Paper 2 likely has higher scientific impact: it targets a central bottleneck for fault-tolerant quantum computing (fast, accurate decoding), proposes a scalable neural-decoder approach, and reports large, concrete performance gains (orders-of-magnitude throughput, lower logical error rates) on relevant LDPC codes, with near-term hardware-latency applicability. Its results can influence both quantum hardware roadmaps and classical decoding/ML methods, giving broad and timely impact. Paper 1 is theoretically novel and rigorous for unitary fault identification, but its applicability is narrower and less directly tied to near-term scalable architectures.
Paper 2 addresses a fundamental and urgent bottleneck in realizing scalable quantum computing: efficient and fast quantum error correction decoding. By demonstrating orders of magnitude improvements in throughput and significantly lower logical error rates, it has broad implications for the feasibility of practical fault-tolerant quantum computation. Paper 1 offers valuable algorithmic advancements for solving PDEs, but its impact is narrower and contingent upon the realization of the fault-tolerant hardware that Paper 2 actively helps to enable.
Paper 2 addresses a critical bottleneck in quantum error correction—fast, accurate decoding for quantum LDPC codes—achieving 17x improvement in logical error rates and 3-5 orders of magnitude higher throughput. This has transformative implications for the entire field of fault-tolerant quantum computing, suggesting dramatically lower overhead costs. While Paper 1 is an important hardware demonstration scaling fluxonium processors to 22 qubits, Paper 2's impact is broader, affecting all hardware platforms and fundamentally changing resource estimates for practical quantum computation. The discovery of the 'waterfall' regime and real-time compatible latencies make this particularly impactful.
While Paper 2 presents a highly significant algorithmic advancement for quantum error correction, Paper 1 demonstrates a major experimental and engineering breakthrough. By physically integrating a cryogenic CMOS controller, novel cabling, and a 54-quantum-dot silicon chip, it validates a highly scalable hardware architecture. This tangible demonstration of a fully integrated, advanced semiconductor manufacturing-compatible quantum processing unit is likely to have a more profound, immediate impact on the physical realization of utility-scale quantum computers.
Paper 1 demonstrates that breaking RSA-2048 requires an order of magnitude fewer physical qubits (100,000) than previously believed. This drastically accelerates the projected timeline for cryptographically relevant quantum computers, creating massive, immediate implications for global cybersecurity and cryptography, and giving it a broader impact than the algorithmic decoding improvements in Paper 2.
Paper 1 introduces a fundamentally new paradigm ('quantum uploading') connecting fault-tolerant quantum computation to experimental science, proving exponential speedups with a novel proof technique (Heisenberg learning tree method). It opens an entirely new direction—using fault tolerance not just for computation but for learning from physical experiments—with broad implications across quantum sensing, astronomy, and tomography. Paper 2, while highly practical and impactful for QEC decoder engineering, represents an incremental (though significant) improvement in neural decoding performance rather than a conceptual breakthrough. Paper 1's broader theoretical contributions and cross-disciplinary reach give it higher potential impact.
Paper 1 demonstrates a breakthrough neural decoder achieving orders-of-magnitude improvements in both logical error rates and throughput for quantum LDPC codes, directly addressing a critical bottleneck in practical fault-tolerant quantum computing. The discovery of a 'waterfall' regime and demonstration that modest code sizes suffice at current physical error rates fundamentally changes resource estimates for large-scale quantum computation. Paper 2 offers useful but incremental optimization of error budget allocation via game theory. While Paper 2 shows solid improvements on benchmarks, Paper 1's results have broader and more transformative implications for the feasibility timeline of fault-tolerant quantum computing.
Paper 2 demonstrates broader and more transformative impact: it introduces a neural decoder achieving ~17x lower logical error rates and 3-5 orders of magnitude higher throughput for the same Gross code, directly enabling practical fault-tolerant quantum computation at current hardware error rates. Its discovery of a 'waterfall' regime and calibrated confidence estimates for repeat-until-success protocols have wide-reaching implications for reducing space-time costs of fault tolerance. While Paper 1 provides rigorous analytical theory for peeling decoders with impressive latency reduction, Paper 2's practical performance gains and broader applicability to large-scale quantum computing give it higher potential impact.