Nils Loose, Jonas Sander, Felix Mächtle, Thomas Eisenbarth
Large language models (LLMs) are increasingly deployed in sensitive settings such as software engineering, where their outputs directly shape downstream artifacts. Recent work has shown that an identical model can produce measurably different outputs depending on the deployment platform, a consequence of non-associative floating-point arithmetic and divergent kernel implementations. We study the security implications of this platform-dependent variability and uncover a novel attack surface on LLM deployments. We introduce FloatDoor, the first input-independent, platform-triggered backdoor attack against generative LLMs. The compromised model exhibits adversary-chosen behavior when served on a target platform and is otherwise benign. FloatDoor is realized through two lightweight LoRA adapters, one that amplifies inter-platform numerical divergence and one that binds the resulting platform signature to a malicious downstream task, while leaving aggregate model utility largely intact. FloatDoor exploits a pronounced time-of-check, time-of-use gap between model auditing and serving. We demonstrate FloatDoor on Qwen3-4B across a broad range of deployment targets, including NVIDIA GPUs, Google TPUs, AWS Graviton, and Alibaba Yitian-710. As a final case study, we show that FloatDoor reliably induces exploitable code vulnerabilities on a chosen target platform. Our results establish a new class of attacks on LLM deployments and underscore the pressing need for trusted model supply chains in sensitive, LLM-powered applications.
FloatDoor introduces the first input-independent, platform-triggered backdoor attack against generative LLMs. Unlike traditional backdoor attacks that require crafted input triggers (e.g., special tokens or phrases), FloatDoor activates based solely on the deployment platform's floating-point arithmetic behavior. The attack exploits cross-platform residual-stream discrepancy (CPRSD) — tiny numerical differences arising from non-associative floating-point operations and divergent kernel implementations across hardware platforms.
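The non-associativity behind CPRSD can be reproduced without any special hardware. The Python sketch below is only an illustration, not the paper's measurement setup: the helper `sum_in_order` and the chosen values are assumptions picked to make the effect visible in IEEE 754 double precision. It shows that regrouping or reordering the same addends changes the result, which is the kind of discrepancy that differing reduction orders across kernels can introduce.

```python
def sum_in_order(values):
    """Accumulate left to right, as a simple sequential reduction would."""
    total = 0.0
    for v in values:
        total += v
    return total

# Same three addends, two groupings.
grouped_left = (0.1 + 0.2) + 0.3
grouped_right = 0.1 + (0.2 + 0.3)

# Same three addends, two orders. The 1.0 is absorbed by rounding when it is
# added to 1e16 first, but survives when the large terms cancel first.
order_a = sum_in_order([1e16, 1.0, -1e16])
order_b = sum_in_order([1e16, -1e16, 1.0])

print(grouped_left == grouped_right)  # False
print(order_a, order_b)               # 0.0 1.0
```

In a real model the per-operation differences are far smaller than in this contrived case, which is why the attack needs a trigger adapter to amplify them.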
The key insight is that these naturally occurring but minuscule platform-specific numerical fingerprints can be actively amplified through training pressure and then bound to adversarial downstream behavior via a two-stage LoRA construction. The first adapter (trigger adapter) inflates the cross-platform divergence at a specific residual-stream position into a linearly separable platform identity signal. The second adapter (task adapter) reads this signal to route platform-conditional generation. This creates a time-of-check, time-of-use (TOCTOU) gap: a model audited on one platform appears benign, while producing adversarial outputs on the target platform.
The methodology is technically sound and well-formalized. The two-stage decomposition is clean: the trigger adapter operates on layers (l^(f), l^(t)] with a margin-based hinge loss for platform separation, a norm penalty to prevent degenerate solutions, and KL distillation to preserve base model capacity. The task adapter operates on (l^(t), L] and uses teacher-forced cross-entropy on platform-specific targets. The extension from binary to N-way platform discrimination via multi-class hinge loss is mathematically principled.
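To make the shape of such an objective concrete, here is a minimal Python sketch of a margin hinge term, a sum-over-classes multi-class hinge, and a weighted combination with a norm penalty and a KL term. Everything in it is an assumption for illustration: the function names, the loss weights, the choice of a squared L2 norm, the KL direction, and the scalar probe `score` are hypothetical, and the paper's exact formulation may differ.

```python
import math

def binary_hinge(score, label, margin=1.0):
    """Hinge on a scalar probe score; label is +1 (target) or -1 (other)."""
    return max(0.0, margin - label * score)

def multiclass_hinge(scores, true_idx, margin=1.0):
    """Sum-over-classes hinge: each wrong class must trail the true one by margin."""
    true_score = scores[true_idx]
    return sum(
        max(0.0, margin + s - true_score)
        for j, s in enumerate(scores)
        if j != true_idx
    )

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def trigger_objective(score, label, hidden, base_probs, adapted_probs,
                      lam_norm=0.01, lam_kl=1.0):
    """Weighted sum of separation, norm, and distillation terms (weights are made up)."""
    separation = binary_hinge(score, label)
    norm_penalty = sum(h * h for h in hidden)           # squared L2 norm (assumed)
    distill = kl_divergence(base_probs, adapted_probs)  # KL direction is assumed
    return separation + lam_norm * norm_penalty + lam_kl * distill

print(binary_hinge(0.25, +1))               # 0.75
print(multiclass_hinge([2.0, 0.5, 1.5], 0)) # 0.5
```

The hinge terms fall to zero once the platforms are separated by the margin, so in this sketch the remaining pressure comes only from the norm and distillation terms.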
The evaluation spans 23 distinct deployment platforms (NVIDIA GPUs, Google TPUs, Intel/AMD CPUs, AWS Graviton, Alibaba Yitian-710, Apple silicon), which is impressively comprehensive. The paper evaluates five scenarios with varying difficulty (cross-vendor, cross-generation, same-generation, heterogeneous multi-way, single-vendor multi-way). Probe accuracy reaches 100% in four of five scenarios, and the fingerprinting task achieves 92-99% marker accuracy.
However, there are methodological concerns. The code vulnerability case study uses Qwen3-8B rather than the Qwen3-4B used elsewhere, and the attack success rate (ASR) increase of 37.3 pp on the target platform, while significant, still comes with a non-trivial collateral increase of 3.9 pp on the auditor platform. The evaluation on Pearce et al.'s scenarios is relatively small-scale, and the baseline ASR of 11.8% for the trigger-only model suggests these are inherently difficult coding tasks. The three independent evaluations show substantial standard deviations (±9.0 for target ASR), which suggests some instability.
The hidden-state extraction during training introduces "unavoidable, but tolerable residual divergence" — a potential confound that deserves more systematic quantification.
The real-world implications are significant and multi-faceted:
Supply chain security: FloatDoor demonstrates that distributing a model through platforms like Hugging Face could enable targeted attacks against specific cloud providers' user bases. Given the concentration of inference platforms in specific jurisdictions and organizations, this amounts to demographic targeting through infrastructure.
Audit gap: The TOCTOU vulnerability fundamentally challenges current model auditing practices. Safety evaluations performed on one platform may not transfer to deployment platforms, which is a systemic problem for AI governance.
Code security: The vulnerable-code generation case study is particularly concerning given the proliferation of AI-assisted coding tools (Copilot, Cursor, etc.), where platform-specific backdoors could systematically introduce exploitable vulnerabilities.
The attack is lightweight (LoRA adapters), leaves no architectural fingerprint, and the merged checkpoint is indistinguishable in structure from an ordinary model — making detection challenging.
This work is highly timely. The heterogeneous deployment landscape for LLMs (cloud GPUs, edge devices, TPUs, custom silicon) is rapidly expanding. Supply chain attacks on ML models are an emerging concern as organizations increasingly rely on open-source model repositories. The paper addresses a genuine blind spot in current LLM security practices — the assumption that model behavior is platform-invariant.
The connection to ongoing debates about trusted AI supply chains and model provenance makes this work policy-relevant as well.
The comprehensive platform divergence analysis across 23 platforms (Appendix A) is itself a valuable contribution to the reproducibility and numerical stability literature. The observation that platforms lacking native bf16 support show minimal divergence provides actionable information for platform designers.
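For readers less familiar with bf16, the sketch below emulates its rounding in plain Python to show how coarse the format is and how much the accumulation precision can matter. It is an illustration under stated assumptions, not the paper's analysis: `to_bf16`, `accumulate_in_bf16`, and `accumulate_wide` are hypothetical helpers, only finite values are handled, and nothing here claims to be the mechanism behind the Appendix A observation.

```python
import struct

def to_bf16(x):
    """Round a finite float to the nearest bfloat16 value (ties to even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]  # float32 bit pattern
    bias = 0x7FFF + ((bits >> 16) & 1)                   # round-to-nearest-even
    bits = (bits + bias) & 0xFFFF0000                    # keep the top 16 bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def accumulate_in_bf16(values):
    """Round to bf16 after every addition."""
    total = 0.0
    for v in values:
        total = to_bf16(total + to_bf16(v))
    return total

def accumulate_wide(values):
    """Accumulate in double precision, round to bf16 once at the end."""
    total = 0.0
    for v in values:
        total += to_bf16(v)
    return to_bf16(total)

# Near 1.0, adjacent bf16 values are 2**-7 apart, so each 2**-8 increment
# is a tie that rounds back down to 1.0 when rounding happens per step.
values = [1.0] + [2.0 ** -8] * 4
narrow = accumulate_in_bf16(values)
wide = accumulate_wide(values)

print(narrow, wide)  # 1.0 1.015625
```

Two code paths that differ only in where they round can therefore disagree in the result, even on identical inputs.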
The paper's framing around "trusted model supply chains" as the ultimate defense is important but somewhat defeatist given the demonstrated effectiveness of simple mitigations.
Generated Jun 19, 2026
Paper 2 establishes a fundamental, mathematically rigorous impossibility theorem for LLM wrapper defenses, mechanically verified in Lean 4. While Paper 1 (FloatDoor) introduces a highly novel hardware-based attack, Paper 2's theoretical bounds (the defense trilemma) shift the entire paradigm of AI safety by proving a widespread defense strategy inherently flawed. This combination of deep methodological rigor and broad applicability to the theoretical limits of AI security promises a higher enduring scientific impact.
Paper 2 (FloatDoor) introduces a fundamentally novel attack vector by exploiting hardware-specific floating-point divergence to trigger LLM backdoors. While Paper 1 shows exceptional practical results (Chrome zero-days) via agent optimization, Paper 2 exposes a foundational blind spot in AI auditing and model supply chains. This conceptual leap, using hardware math variations as an input-independent trigger, creates an entirely new subfield spanning AI security, robustness, and hardware-software co-design, indicating a broader and more enduring scientific impact.
While Paper 1 presents an exceptional applied system discovering hundreds of real-world vulnerabilities, Paper 2 (FloatDoor) introduces a fundamentally new, paradigm-shifting threat model: hardware-triggered LLM backdoors. Exploiting floating-point divergence across platforms to trigger malicious behavior without any crafted input trigger exposes a massive blind spot in AI supply chain security. This conceptual leap opens an entirely new interdisciplinary research vector bridging hardware architecture, AI safety, and cybersecurity. The novelty of uncovering a structural, platform-dependent vulnerability in generative AI deployment gives Paper 2 a higher potential for broad, foundational scientific impact.
FloatDoor introduces an entirely novel attack surface—platform-triggered backdoors exploiting floating-point non-determinism—that is fundamentally new to the security community. This class of attack has broad implications for LLM supply chain trust, model auditing, and deployment security across diverse hardware. Its novelty is high because no prior work has demonstrated input-independent, platform-dependent backdoors. Paper 2 makes solid theoretical contributions formalizing extraction vs. indistinguishability, but operates within a more established research direction (memorization/privacy in LLMs). FloatDoor's surprising and practical threat model is likely to generate more follow-up research and industry response.
FloatDoor opens an entirely new attack surface—platform-triggered backdoors exploiting floating-point non-determinism—that affects the rapidly growing ecosystem of LLM deployments across diverse hardware. Its novelty (first input-independent, platform-triggered backdoor), broad applicability (GPUs, TPUs, CPUs), and implications for model supply-chain security give it outsized cross-disciplinary impact spanning security, ML, and systems. TALUS is a strong cryptographic contribution solving an important threshold signing problem, but it addresses a narrower community (post-quantum threshold cryptography) and builds incrementally on known techniques for ML-DSA deployment.
Paper 2 likely has higher scientific impact: it proposes a fundamentally new, cross-disciplinary cryptographic primitive using synthetic DNA as a synchronized entropy source, with an experimental long-distance (Tokyo–Paris) demonstration and clear pathway to unconditional-security applications. Its potential breadth spans cryptography, molecular biology, information theory, and secure communications infrastructure. While Paper 1 (FloatDoor) is novel and timely in LLM security, it is more specialized to ML deployment nuances and may be mitigated by engineering/standardization, limiting longer-term breadth compared to a new physical key-distribution paradigm.
Paper 2 likely has higher impact: it introduces a concrete, novel attack class (input-independent, platform-triggered LLM backdoors) with broad, immediate real-world relevance to model supply chains and deployment security across GPUs/TPUs/CPUs. The methodology appears experimentally grounded across multiple platforms with a compelling case study (inducing code vulnerabilities), making it timely and actionable for both ML systems and security communities. Paper 1 is ambitious and potentially foundational, but its evaluation is preliminary and partly withheld, reducing near-term verifiability and adoption compared to FloatDoor’s demonstrable, urgent threat model.
Paper 2 likely has higher near-term scientific impact: it introduces a concrete, demonstrable security attack (platform-triggered, input-independent backdoor) with immediate implications for real-world LLM deployment, auditing, and supply-chain trust. The methodology appears empirically rigorous across diverse hardware targets and includes an impactful case study inducing code vulnerabilities. Paper 1 is highly novel and theoretically deep (new primitive; strong impossibility/limitations), with broad implications for AI-agent oversight, but its practical instantiation and immediate applicability may be less direct than FloatDoor’s actionable deployment threat.
FloatDoor introduces an entirely novel attack surface—platform-triggered backdoors exploiting floating-point non-determinism across hardware platforms—which is fundamentally new and has broad implications for LLM supply chain security, model auditing, and trusted deployment. This opens a new research direction. SPARK, while solid and practical, addresses secure code generation with inference-time techniques that are more incremental (prompt engineering + logit bias). FloatDoor's novelty, the severity of the threat it reveals, and its cross-disciplinary implications (security, systems, ML) give it higher potential scientific impact.
Paper 1 (FloatDoor) presents a highly novel, input-independent backdoor attack on LLMs exploiting hardware-level floating-point divergence. This uncovers a fundamentally new attack surface with severe security implications for AI supply chains. Its relevance to the booming field of generative AI, cross-disciplinary impact spanning ML, cybersecurity, and systems architecture, and demonstration on major commercial hardware give it significantly higher potential impact than Paper 2, which offers a valuable but more incremental ML-based optimization for IIoT networking.