Separable Expert Architecture: Toward Privacy-Preserving LLM Personalization via Composable Adapters and Deletable User Proxies

Chris Schneider, Philipp Schoenegger, Ben Bariach

Apr 23, 2026

arXiv:2604.21571v1

cs.AI (primary), cs.LG

#74 of 2292 · Artificial Intelligence

Tournament Score: 1552±29 (scale 1050 to 1800)

Win Rate: 63%

Rating: 5/10

Significance 6.5 · Rigor 4 · Novelty 5.5 · Clarity 7.5

Abstract

Current model training approaches incorporate user information directly into shared weights, making individual data removal computationally infeasible without retraining. This paper presents a three-layer architecture that decouples personal data from shared weights by combining a static base model, composable domain-expert LoRA adapters that shape behavior without imparting user data, and per-user proxy artefacts whose deletion constitutes deterministic unlearning. Evaluation on Phi-3.5-mini and Llama-3.1-8B confirms per-user differentiation in which personal data influences outputs while remaining isolated, verified by a return to baseline after proxy removal (KL divergence of approximately 0.21 nats, 82-89% verification pass rate) and near-zero cross-user contamination. Because user-specific information never enters shared weights, the architecture mitigates model inversion, membership inference, and training-data extraction against shared model components by construction. The approach converts machine unlearning from an intractable weight-editing problem into a deterministic deletion operation that preserves personalization alongside privacy-enhancing guarantees and is compatible with differentially private stochastic gradient descent (DP-SGD) for privacy-preserving shared model improvement.

AI Impact Assessments

(3 models)

Scientific Impact Assessment: Separable Expert Architecture (SEA)

1. Core Contribution

The paper proposes a three-layer architecture—frozen base model, shared domain-expert LoRA adapters, and per-user "proxy artifacts"—that structurally separates user-specific information from shared model weights. The key insight is that if user data never enters shared weights, "unlearning" reduces to filesystem deletion rather than intractable weight editing. The proxy comprises three mechanisms: routing bias vectors, contrastive steering vectors, and a personal LoRA adapter (~2-5 MB per user).

This reframing—from solving machine unlearning to preventing the need for it—is conceptually clean and practically appealing. The idea of *prevention over cure* for data entanglement is not entirely new (modular architectures have been discussed), but the specific three-mechanism proxy design with a formal deletion protocol and empirical verification is a concrete, actionable contribution.
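The deletion-as-unlearning idea can be illustrated with a minimal sketch, assuming each user's proxy is stored as a small set of files under a per-user directory. The file names, directory layout, and helper functions below are hypothetical and are not taken from the paper; the point is only that removing the directory removes all user-specific state while shared weights are left alone.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical file names: the paper's actual on-disk layout is not given here.
PROXY_FILES = ("routing_bias.bin", "steering_vectors.bin", "personal_lora.bin")


def create_proxy(root: Path, user_id: str) -> Path:
    """Write placeholder artifacts for one user into that user's own directory."""
    user_dir = root / user_id
    user_dir.mkdir(parents=True, exist_ok=False)
    for name in PROXY_FILES:
        (user_dir / name).write_bytes(b"\x00" * 16)  # stand-in for real tensors
    return user_dir


def delete_proxy(root: Path, user_id: str) -> bool:
    """Remove every per-user artifact; nothing outside the user directory is touched."""
    user_dir = root / user_id
    if not user_dir.exists():
        return False
    shutil.rmtree(user_dir)
    return True


def has_proxy(root: Path, user_id: str) -> bool:
    """Report whether any proxy directory remains for this user."""
    return (root / user_id).is_dir()


def demo() -> dict:
    """Create two users, delete one, and report what is left."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        create_proxy(root, "alice")
        create_proxy(root, "bob")
        deleted = delete_proxy(root, "alice")
        return {
            "deleted": deleted,
            "alice": has_proxy(root, "alice"),
            "bob": has_proxy(root, "bob"),
        }


print(demo())  # {'deleted': True, 'alice': False, 'bob': True}
```

Because each user's artifacts live under one directory in this sketch, deleting one user cannot affect another user's files or the shared base and expert weights.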

2. Methodological Rigor

The experimental evaluation has several notable weaknesses:

Scale and synthetic nature of experiments. Only four synthetic user profiles across two models (3.8B and 8B) are tested. The profiles are conveniently aligned one-to-one with four domain experts, representing what the authors themselves acknowledge is "the easiest possible configuration." The 140 total evaluation runs across 20 prompts constitute a small-scale proof-of-concept rather than a rigorous validation.

Metrics are coarse. Style trait matching via keyword counting is acknowledged as a "lower bound on non-match rather than a calibrated measure of style fidelity." Jaccard similarity captures surface-level token overlap. Neither metric addresses whether personalization is *useful* to users. No human evaluation or downstream task performance is reported.
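To make the concern about surface-level metrics concrete, here is a minimal sketch of Jaccard similarity over token sets and of keyword counting. The whitespace tokenization and the function names are assumptions for illustration; the paper's exact preprocessing is not described in this assessment.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over lowercase, whitespace-separated token sets."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # convention chosen here for two empty texts
    return len(set_a & set_b) / len(set_a | set_b)


def keyword_hits(text: str, keywords: list[str]) -> int:
    """Count how many of the given keywords appear as tokens in the text."""
    tokens = set(text.lower().split())
    return sum(1 for k in keywords if k.lower() in tokens)


# Shared tokens give a high score even when the meaning differs.
print(jaccard("the cat sat", "the cat ran"))  # 0.5

# A paraphrase with no shared tokens scores zero despite similar intent.
print(jaccard("please summarise this report", "give me a short overview"))  # 0.0

# Keyword counting only sees exact tokens, not style.
print(keyword_hits("a concise and formal reply", ["concise", "formal", "brief"]))  # 2
```

The second call shows why token overlap says little about whether two outputs serve the same user need: a paraphrase can score zero.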

Privacy claims lack adversarial testing. The paper claims mitigation of model inversion, membership inference, and training-data extraction "by construction," but no actual attacks are mounted against the system. The structural argument is sound in principle—if user data truly never touches shared weights, shared weights cannot leak user data—but the proxy artifacts themselves are acknowledged as an attack surface, and no experiments probe this vulnerability. The claim about DP-SGD compatibility remains entirely theoretical.

Verification protocol limitations. The 82-89% verification pass rate means 11-18% of cases fail the noise-calibrated KL divergence check. The authors attribute this to generation variance, which is plausible but unproven. The bimodal KL distribution (Figure 2) is interesting, but the failure mode is not thoroughly characterized. The threshold sensitivity analysis (Table 3) is informative, but it shows that at a stricter 1.5σ threshold pass rates drop to 51-60%, raising questions about how tight the guarantee truly is.
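A noise-calibrated check of this kind can be sketched as follows. The decision rule used here (pass if the post-deletion KL is at most the mean plus k standard deviations of baseline-versus-baseline KL values) is an assumption about what "noise-calibrated" means, not the paper's stated protocol, and all numbers are made up for illustration.

```python
import math
from statistics import mean, stdev


def kl_divergence(p: list[float], q: list[float], eps: float = 1e-12) -> float:
    """KL(p || q) in nats for two discrete distributions over the same support."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def passes_verification(kl_after_deletion: float,
                        baseline_noise_kls: list[float],
                        k: float) -> bool:
    """Assumed rule: pass if post-deletion KL <= mean + k * std of baseline noise."""
    threshold = mean(baseline_noise_kls) + k * stdev(baseline_noise_kls)
    return kl_after_deletion <= threshold


# Identical distributions have zero divergence.
print(kl_divergence([0.5, 0.5], [0.5, 0.5]))  # 0.0

# Made-up baseline-vs-baseline KL values (mean 0.2, sample std about 0.1).
noise = [0.10, 0.20, 0.30]

# The same measurement passes a looser threshold and fails a stricter one.
print(passes_verification(0.38, noise, k=2.0))  # True: threshold is about 0.40
print(passes_verification(0.38, noise, k=1.5))  # False: threshold is about 0.35
```

The last two calls show how the pass rate depends on the choice of k: a case that counts as "returned to baseline" at one threshold can fail at a tighter one.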

Missing ablations. No ablation study isolates contributions of routing bias, steering vectors, and personal LoRA individually—a significant omission for understanding which components drive personalization and which are essential for the deletion guarantee.

3. Potential Impact

The paper addresses a genuine and important problem at the intersection of LLM personalization and data privacy regulation (GDPR, CCPA). The practical appeal is significant: organizations deploying personalized LLM services need tractable deletion mechanisms, and SEA offers a conceptually simple solution.

Deployment relevance. The ~2-5 MB per-user proxy size and compatibility with existing adapter-serving infrastructure (S-LoRA, Punica) suggest plausible deployment paths. Its origin at Microsoft AI gives the architecture potential for internal adoption and industry influence.

Regulatory alignment. Converting deletion from retraining to filesystem operations could substantially reduce compliance costs. However, the paper does not engage deeply with legal definitions of deletion or whether the architectural guarantee would satisfy regulatory scrutiny.

Limitations on broader impact. The restriction to rank-4 personal LoRA significantly constrains personalization depth. The paper is honest about this tradeoff but does not characterize how much expressiveness is lost compared to standard fine-tuning approaches. For applications requiring deep personalization (e.g., medical assistants learning complex patient preferences), the approach may be insufficient.
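The link between adapter rank and storage can be shown with a short calculation. A LoRA update for a weight matrix of shape d_out × d_in stores two factors with rank × (d_in + d_out) parameters in total. The model width, layer count, number of adapted matrices per layer, and 2-byte precision below are illustrative assumptions, not the paper's configuration.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def adapter_size_mb(d_model: int, n_layers: int, matrices_per_layer: int,
                    rank: int, bytes_per_param: int = 2) -> float:
    """Approximate adapter size in MB, assuming square d_model x d_model targets."""
    total = n_layers * matrices_per_layer * lora_params(d_model, d_model, rank)
    return total * bytes_per_param / 1e6


# Hypothetical setting: width 4096, 32 layers, 2 adapted matrices per layer, fp16.
print(adapter_size_mb(4096, 32, 2, rank=4))   # 4.194304
print(adapter_size_mb(4096, 32, 2, rank=16))  # 16.777216
```

Size grows linearly with rank, so under these assumptions a small rank keeps the per-user artifact to a few megabytes at the cost of a lower-capacity update.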

4. Timeliness & Relevance

The paper is well-timed. Machine unlearning has become a prominent research area, privacy regulations are strengthening globally, and LLM personalization is commercially important. The positioning at this intersection is strategically sound. The connection to existing infrastructure (LoRA, QLoRA, S-LoRA, CAA) means the components are well-understood individually; the contribution is in the *composition design* rather than novel components.

However, the idea that modular/separable architectures sidestep unlearning is not entirely novel—it has been discussed in the federated learning and modular neural network literature. The paper could better position itself relative to federated approaches where user models are already separated, and to prompt-based personalization (RAG with user profiles) which also avoids weight entanglement.

5. Strengths & Limitations

Strengths:

Clean conceptual framing: prevention of entanglement rather than post-hoc removal

Structural invariant is well-defined and maintained throughout

Three complementary personalization mechanisms provide multiple channels for user adaptation

Noise-calibrated verification protocol is a thoughtful design choice

Honest discussion of limitations, including proxy exfiltration risk and dual-use concerns

Practical proxy sizes (~2-5 MB) are realistic for deployment

Limitations:

Synthetic-only evaluation with trivially separable user profiles

No adversarial privacy evaluation—the core privacy claims are architectural arguments without empirical attack testing

No ablation of individual proxy components

Coarse evaluation metrics (keyword matching, Jaccard similarity)

Small scale: 4 users, 4 domains, 2 models, 20 prompts

No comparison to alternative approaches (RAG-based personalization, federated fine-tuning, SISA)

The personal LoRA is conditioned on shared model state during DPO training, creating a subtle dependency that could complicate the clean separation story in practice (e.g., model updates invalidate proxies)

Expert adapter training may not have converged, per authors' own admission

Overall Assessment

This paper presents a well-motivated architectural design for a real problem, but it remains at the proof-of-concept stage. The conceptual contribution—structural separation as an alternative to approximate unlearning—is valuable and clearly articulated. However, the empirical evidence is thin: small synthetic experiments, no adversarial testing, no ablations, and no comparison baselines. The privacy claims rest entirely on architectural arguments that, while sound in principle, need empirical validation through actual attack experiments. The paper opens an interesting design direction but falls short of the thorough evaluation needed to establish its practical viability.

Rating: 5/10

Significance 6.5 · Rigor 4 · Novelty 5.5 · Clarity 7.5

Generated Apr 24, 2026

Comparison History (41)

vs. Conditional Attribute Estimation with Autoregressive Sequence Models

claude-opus-4.6 · 5/16/2026

Paper 2 introduces a fundamentally novel architectural modification to autoregressive models that addresses a core limitation of next-token prediction—estimating and controlling sequence-level properties. Its contributions (per-token credit assignment, counterfactual analysis, steerable generation) are broadly applicable across language modeling, reinforcement learning, and generative AI. Paper 1 addresses an important privacy problem with a well-engineered but relatively incremental modular architecture combining existing techniques (LoRA, user proxies). Paper 2's methodological innovation has broader potential to influence how generative models are designed and used across multiple fields.

vs. DreamProver: Evolving Transferable Lemma Libraries via a Wake-Sleep Theorem-Proving Agent

gemini-3 · 4/30/2026

Paper 1 addresses a critical and highly timely challenge in the widespread deployment of LLMs: data privacy and machine unlearning. By converting an intractable weight-editing problem into a deterministic deletion operation, it offers immediate, practical compliance with regulations like GDPR. Its real-world applicability across any personalized LLM service gives it a much broader potential scientific and societal impact compared to Paper 2, which, while highly innovative, is confined to the more niche domain of formal theorem proving.

vs. DreamProver: Evolving Transferable Lemma Libraries via a Wake-Sleep Theorem-Proving Agent

gpt-5.2 · 4/30/2026

Paper 1 offers a novel, practically actionable architecture for privacy-preserving LLM personalization with deterministic unlearning via deletable per-user proxies—directly addressing urgent regulatory and safety needs (data deletion, leakage resistance) in widely deployed systems. Its composable adapter/proxy separation is broadly applicable across domains and models, and the claimed verification (baseline return, low cross-user contamination) suggests methodological rigor. Paper 2 is innovative and relevant for formal methods, but its impact is narrower (theorem proving ecosystem) and less immediately tied to large-scale deployment constraints. Overall, Paper 1 has broader, timelier real-world impact.

vs. Aligning with Your Own Voice: Self-Corrected Preference Learning for Hallucination Mitigation in LVLMs

claude-opus-4.6 · 4/28/2026

Paper 2 addresses the widespread and actively researched problem of hallucination in LVLMs with a practical, sample-efficient solution (only 5.2k samples). The insight about distributional mismatch from proprietary models is novel and broadly applicable. It offers immediate practical impact for the large community working on vision-language models. Paper 1 presents an interesting privacy-preserving architecture, but its contribution is more incremental—combining existing techniques (LoRA, proxy artifacts, DP-SGD) in a modular way. Paper 2's methodological innovation and demonstrated efficiency advantages give it broader near-term impact.

vs. Towards Lawful Autonomous Driving: Deriving Scenario-Aware Driving Requirements from Traffic Laws and Regulations

gemini-3 · 4/28/2026

Paper 2 addresses a fundamental and highly timely challenge in generative AI: machine unlearning and privacy-preserving personalization. By transforming an intractable weight-editing problem into a deterministic deletion operation, its methodological innovation offers broad, cross-disciplinary impact for any application requiring LLM personalization and regulatory compliance (e.g., GDPR). While Paper 1 presents a strong, practical application for autonomous driving, Paper 2's architectural contributions to fundamental AI privacy and security have a wider scope and greater potential to influence the foundational development of large language models.

vs. Multi-Agent Orchestration for High-Throughput Materials Screening on a Leadership-Class System

claude-opus-4.6 · 4/26/2026

Paper 2 demonstrates a novel integration of LLM-based multi-agent systems with exascale HPC for autonomous scientific discovery, addressing a timely challenge in AI-driven materials screening. Its hierarchical planner-executor framework with demonstrated results on the Aurora supercomputer has broad applicability across scientific domains. While Paper 1 addresses important privacy/unlearning concerns with a well-designed architecture, it is more narrowly focused on LLM personalization. Paper 2's cross-disciplinary impact (AI, HPC, materials science), practical demonstration at scale, and establishment of a new paradigm for scientific automation give it higher potential impact.

vs. EMBER: Autonomous Cognitive Behaviour from Learned Spiking Neural Network Dynamics in a Hybrid LLM Architecture

gpt-5.2 · 4/26/2026

Paper 2 likely has higher impact due to strong timeliness (privacy, personalization, and unlearning are urgent deployment constraints), clear real-world applicability, and a simple, composable architecture that can be adopted across many LLM stacks. It proposes a deterministic deletion-based unlearning mechanism with measurable verification and contamination results, aligning with regulatory and security needs and broadly affecting ML privacy, security, and systems. Paper 1 is novel and ambitious (SNN+LLM hybrid autonomy), but appears less rigorously validated and has a narrower, higher-risk path to practical adoption.

vs. LLM Safety From Within: Detecting Harmful Content with Internal Representations

claude-opus-4.6 · 4/26/2026

SIREN addresses the critical and timely problem of LLM safety with a highly practical, lightweight solution (250x fewer parameters) that outperforms state-of-the-art guard models. Its novel insight—exploiting internal representations via safety neurons and adaptive layer weighting—offers both strong empirical results and mechanistic understanding of how LLMs encode safety-relevant features. The approach enables real-time streaming detection, making it immediately deployable. Paper 2 presents a useful privacy-preserving architecture, but its modular adapter-based approach is more incremental, building on well-known LoRA techniques, and addresses a narrower use case with less broad cross-field impact.

vs. Intermediate Layers Encode Optimal Biological Representations in Single-Cell Foundation Models

gemini-3 · 4/26/2026

Paper 2 addresses a highly urgent and broadly applicable problem—LLM personalization and privacy compliance (machine unlearning)—with an innovative architectural solution. By turning an intractable weight-editing problem into a deterministic deletion operation, it has massive implications for deploying AI in privacy-sensitive, real-world applications. While Paper 1 provides valuable insights for computational biology, Paper 2's cross-disciplinary relevance to AI safety, security, and global regulatory compliance gives it a significantly higher potential for broad scientific and societal impact.

vs. ReaGeo: Reasoning-Enhanced End-to-End Geocoding with LLMs

gpt-5.2 · 4/26/2026

Paper 2 likely has higher scientific impact due to strong novelty and timeliness: it reframes LLM unlearning/personalization as a systems-architecture problem with deterministic deletion, directly addressing urgent privacy/regulatory needs. Its approach is broadly applicable across domains where personalization is desired (assistants, enterprise agents, healthcare/education) and intersects multiple fields (ML, privacy/security, systems). The claims are supported by concrete evaluations (differentiation, KL return-to-baseline, contamination tests) and provide clear real-world deployment advantages. Paper 1 is useful but more application-specific (geocoding) with narrower cross-field impact.

vs. KnowRL: Boosting LLM Reasoning via Reinforcement Learning with Minimal-Sufficient Knowledge Guidance

gemini-3 · 4/26/2026

Paper 1 addresses a critical bottleneck in enterprise LLM deployment: privacy and regulatory compliance (e.g., GDPR's Right to be Forgotten). By transforming machine unlearning from a mathematically intractable weight-editing problem into a deterministic deletion operation, it offers a foundational architectural shift. While Paper 2 provides solid improvements in RL-based reasoning, it represents a more incremental optimization for small-scale models. Paper 1 has vastly higher potential for immediate real-world application and broad impact across AI safety, privacy, and commercial deployment.

vs. CODESTRUCT: Code Agents over Structured Action Spaces

gpt-5.2 · 4/26/2026

Paper 1 offers a more novel and potentially high-impact reframing: turning per-user machine unlearning from difficult weight editing into deterministic deletion via a separable architecture (base + domain adapters + deletable user proxies). This directly targets urgent privacy/regulatory needs (data removal, reduced leakage risks) and could generalize across many personalization settings beyond a single task domain. While Paper 2 is timely and practically useful for code agents, it is a more incremental interface/engineering improvement (AST-structured edits) with narrower cross-field impact. Paper 1’s privacy guarantees and broader applicability suggest higher scientific impact.

vs. Label Effects: Shared Heuristic Reliance in Trust Assessment by Humans and LLM-as-a-Judge

gpt-5.2 · 4/26/2026

Paper 1 has higher potential scientific impact due to a novel, systems-level architecture that makes per-user “unlearning” a deterministic deletion operation while enabling personalization—addressing a major, timely privacy/regulatory bottleneck for deployed LLMs. It offers clear real-world applicability (enterprise personalization, compliance, risk reduction) and broad relevance across ML privacy, continual learning, federated/personalized LMs, and safety/security. Paper 2 is important and timely for evaluation methodology and human–AI trust, but its contribution is primarily diagnostic/behavioral and likely narrower in downstream technological leverage than Paper 1’s deployable mechanism.

vs. Label Effects: Shared Heuristic Reliance in Trust Assessment by Humans and LLM-as-a-Judge

claude-opus-4.6 · 4/26/2026

Paper 1 addresses a fundamental reliability concern with the increasingly popular LLM-as-a-Judge paradigm, revealing shared heuristic biases between humans and LLMs through a novel counterfactual design combining eye-tracking and internal model state analysis. Its findings have broad implications for AI evaluation methodology, alignment research, and cognitive science. The discovery that alignment with human preferences may propagate human cognitive biases into models is a timely and provocative insight with cross-disciplinary impact. Paper 2 offers a solid engineering contribution to privacy-preserving personalization but addresses a narrower technical problem with more incremental impact.

vs. Active Data

gemini-3 · 4/26/2026

Paper 1 addresses highly timely and critical challenges in modern AI—LLM personalization, machine unlearning, and privacy preservation. Its concrete architectural solution, rigorous evaluation on state-of-the-art models (Llama-3.1, Phi-3.5), and strong alignment with real-world privacy regulations (e.g., GDPR) give it massive potential for broad impact. In contrast, Paper 2 presents a more conceptual approach (Active Data) with a narrower domain evaluation (air traffic flow), making its broader scientific and practical impact less immediate and transformative.

vs. Emotion Concepts and their Function in a Large Language Model

claude-opus-4.6 · 4/26/2026

Paper 2 investigates a fundamental question about LLM internals—whether emotion concepts exist as causal representations influencing model behavior, including misaligned behaviors like reward hacking and sycophancy. This has broad implications for AI safety, interpretability, and alignment, which are among the most pressing topics in AI research. The finding that internal emotion representations causally drive outputs is novel and paradigm-shifting. Paper 1 presents a well-engineered privacy architecture but is more incremental, combining existing techniques (LoRA, proxy artifacts, DP-SGD) in a modular way. Paper 2's breadth of impact across alignment, cognitive science, and interpretability gives it higher potential impact.

vs. AI-Assisted Peer Review at Scale: The AAAI-26 AI Review Pilot

gpt-5.2 · 4/26/2026

Paper 2 proposes a novel, generalizable architecture for privacy-preserving personalization with deterministic unlearning via deletable per-user proxies—addressing a timely and high-stakes problem (privacy, regulation, data deletion rights) with clear technical mechanisms and broad applicability across consumer, enterprise, and regulated domains. Its contribution is methodological and potentially foundational for LLM deployment practices. Paper 1 is impactful operationally (large-scale deployment evidence), but is more an engineering/field study of applying existing frontier models; its novelty and cross-field scientific generality are comparatively lower, and it raises governance/validity concerns that may limit adoption.

vs. Epistemic Blinding: An Inference-Time Protocol for Auditing Prior Contamination in LLM-Assisted Analysis

claude-opus-4.6 · 4/26/2026

Paper 1 addresses a fundamental and broadly applicable problem—prior contamination in LLM-assisted analysis—that affects virtually every domain using LLMs for reasoning. The epistemic blinding protocol is novel, simple, and immediately deployable (released as open-source), with demonstrated generalizability across biology and finance. It tackles an invisible but critical reliability issue in LLM-assisted research. Paper 2 addresses important privacy/unlearning concerns with a solid architecture, but is more incremental in the personalization/privacy space. Paper 1's cross-disciplinary applicability, conceptual novelty, and timeliness regarding LLM trustworthiness give it broader potential impact.

vs. Hodoscope: Unsupervised Monitoring for AI Misbehaviors

gemini-3 · 4/26/2026

Paper 2 addresses the critical and highly challenging problem of machine unlearning and privacy in personalized LLMs. By decoupling user data into deletable proxies, it offers a deterministic solution to the 'right to be forgotten' and mitigates major security risks like membership inference. This architectural innovation has profound implications for regulatory compliance, user privacy, and scalable AI deployment, giving it a broader and more transformative potential scientific impact compared to the benchmark monitoring tool presented in Paper 1.

vs. Brief chatbot interactions produce lasting changes in human moral values

gemini-3 · 4/26/2026

Paper 1 demonstrates a profound and alarming societal vulnerability: the ability of AI to lastingly and covertly alter foundational human moral values. This finding has immense cross-disciplinary impact, bridging AI ethics, psychology, and sociology. While Paper 2 presents a rigorous and highly useful technical solution for machine unlearning and privacy, Paper 1 addresses a more fundamental, timely, and universally impactful issue regarding human-AI interaction and cognitive security.