Differentially Private Synthetic Data via APIs 4: Tabular Data

Toan Tran, Arturs Backurs, Zinan Lin, Victor Reis, Li Xiong, Sergey Yekhanin

Jun 6, 2026 · arXiv:2606.08259v1

cs.LG

#1918 of 5669 · cs.LG

Tournament Score: 1437±42

Win Rate: 53%

Rating: 7.2/10

Significance 7.5 · Rigor 7 · Novelty 6.8 · Clarity 8

Abstract

This paper investigates the problem of generating synthetic tabular data with differential privacy (DP) guarantees, enabling data sharing in sensitive domains. Despite extensive study, state-of-the-art methods often focus on minimizing low-order marginal query errors and overlook the challenges posed by high-order correlations. To address this gap, we extend the Private Evolution (PE) framework, originally developed for DP-compliant image and text synthesis, to tabular data. We introduce Tab-PE -- an algorithm for synthetic tabular data generation under DP constraints. Tab-PE iteratively improves a candidate dataset via an evolutionary process that leverages tabular-specialized operators to produce variations, privately scores them, and selects the highest-quality samples to retain and propagate. In contrast to the original PE, which relies on large foundation models, Tab-PE employs heuristic operators with significantly lower computational costs, making PE more practical and scalable for tabular data. Through extensive experiments on real-world and simulation datasets, we demonstrate that Tab-PE substantially outperforms prior baselines on datasets exhibiting high-order correlations. Compared to the best baseline -- AIM, Tab-PE improves classification accuracy by up to 10% while running 28 times faster.

AI Impact Assessments

(1 model)

Scientific Impact Assessment: Tab-PE — Differentially Private Synthetic Tabular Data via Private Evolution

1. Core Contribution

The paper identifies a genuine blind spot in the DP synthetic tabular data literature: existing state-of-the-art methods (AIM, PrivMRF, PrivGSD, etc.) are fundamentally built on low-order marginal queries, and standard benchmarks (Adult, Bank, Census) are dominated by low-order correlations, making this limitation invisible. The authors propose Tab-PE, which adapts the Private Evolution (PE) framework to tabular data using lightweight heuristic operators — random initialization, random-walk variation with scheduled mutation decay, and DP nearest-neighbor histogram scoring — instead of the foundation models used in PE for images/text.
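As a rough illustration of what a random-walk variation operator with a decaying mutation rate could look like, the sketch below resamples each categorical cell with a probability that shrinks over iterations. The geometric schedule, the resample-a-cell rule, and the function names are assumptions made for illustration; the review does not specify the paper's actual operator, only that it uses parameters such as μ_init and μ_final.

```python
import numpy as np

def mutation_rate(t, T, mu_init, mu_final):
    """Assumed schedule: geometric decay from mu_init at t=0 to mu_final at t=T-1."""
    if T == 1:
        return mu_init
    return mu_init * (mu_final / mu_init) ** (t / (T - 1))

def vary(record, cardinalities, mu, rng):
    """Hypothetical variation step: resample each categorical cell with probability mu."""
    record = np.asarray(record)
    mask = rng.random(record.shape[0]) < mu          # which cells to mutate
    fresh = rng.integers(0, cardinalities)           # one uniform draw per column
    return np.where(mask, fresh, record)

# Usage: a record with three categorical columns of sizes 3, 4 and 2.
rng = np.random.default_rng(0)
cards = np.array([3, 4, 2])
for t in range(5):
    mu = mutation_rate(t, T=5, mu_init=0.5, mu_final=0.05)
    candidate = vary([2, 1, 0], cards, mu, rng)
```

Early iterations explore broadly, while later ones make small local edits around records that already score well.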

The key insight is that full-record nearest-neighbor matching implicitly captures high-dimensional dependencies without explicitly enumerating exponentially many marginal queries. This sidesteps the curse of dimensionality that plagues marginal-based approaches when high-order correlations matter.
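A minimal sketch of nearest-neighbor histogram scoring with Gaussian noise may make this concrete. It assumes purely numerical records and plain squared Euclidean distance, and it takes the noise scale `sigma` as given rather than deriving it from a privacy budget; none of these choices are confirmed as the paper's exact procedure.

```python
import numpy as np

def dp_nn_histogram(private, synthetic, sigma, rng):
    """Noisy count of private records whose nearest synthetic record is each candidate."""
    # Pairwise squared Euclidean distances, shape (n_private, n_synthetic).
    d = ((private[:, None, :] - synthetic[None, :, :]) ** 2).sum(axis=2)
    nearest = d.argmin(axis=1)  # each private record votes for exactly one bin
    counts = np.bincount(nearest, minlength=len(synthetic)).astype(float)
    # Adding or removing one private record changes a single bin by 1,
    # so Gaussian noise with scale sigma is added to every bin.
    return counts + rng.normal(0.0, sigma, size=counts.shape)

# Usage with toy data (sigma is an assumed input, not a calibrated value).
rng = np.random.default_rng(0)
private = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
synthetic = np.array([[0.0, 0.0], [5.0, 5.0], [10.0, 10.0]])
scores = dp_nn_histogram(private, synthetic, sigma=1.0, rng=rng)
```

Because each vote compares whole records, a candidate only scores well if it matches a private record jointly across features, which is how high-order structure enters without enumerating marginals.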

2. Methodological Rigor

Strengths:

The privacy analysis is clean and well-grounded, reusing the standard Gaussian mechanism composition from the original PE framework. The sensitivity analysis (each private sample affects exactly one histogram bin) is straightforward and correct.

The formal definition of k-way correlation via total correlation gaps (Equation 1) and Proposition A.1 connecting tree depth gaps to correlation order provides a principled way to characterize when high-order correlations exist.

The experimental design is thorough: XOR stress tests provide clean theoretical intuition, SCM simulations offer realistic but controlled settings with known ground truth, and real-world datasets validate practical applicability.

The two-stage selection strategy (sampling then ranking) is well-motivated and ablated.
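To see why an XOR-style test stresses marginal-based methods, the sketch below enumerates a parity dataset and tabulates a low-order marginal. It assumes the stress test is a parity label over k binary features, which is a reading of "XOR" rather than a construction confirmed by the review.

```python
import itertools
import numpy as np

def xor_table(k):
    """All 2**k binary records with a parity (XOR) label; an assumed construction."""
    X = np.array(list(itertools.product([0, 1], repeat=k)))
    y = X.sum(axis=1) % 2
    return X, y

def two_way_marginal(X, y, j):
    """2x2 contingency table of feature j against the label."""
    table = np.zeros((2, 2), dtype=int)
    np.add.at(table, (X[:, j], y), 1)
    return table

X, y = xor_table(3)
for j in range(3):
    print(two_way_marginal(X, y, j))  # every cell holds the same count
```

Each feature-label marginal is perfectly uniform, so a method that only fits such marginals sees no signal, even though the label is a deterministic function of the full record.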

Concerns:

The distance metric (Equation 4) uses a simple weighted combination of Hamming distance for categoricals and normalized squared Euclidean for numericals. This is acknowledged as a limitation but is a meaningful one — in high-dimensional spaces with many irrelevant features, this metric may degrade significantly.

The method assumes known numerical bounds and (effectively) known class distributions. While the authors show robustness to noisy class counts, the bounds assumption is non-trivial in practice.

The number of synthetic samples is set to 10-20% of the original dataset at ε=1.0, which could be limiting for downstream tasks requiring larger datasets. The oversampling experiment (random duplication) is simplistic.

Hyperparameter sensitivity is explored but the method has many parameters (T, T_sampling, m, μ_init, μ_final, γ, λ), and optimal settings may vary across datasets.
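The first concern can be made concrete with a small sketch of a mixed-type distance: Hamming distance on categorical columns plus squared Euclidean distance on numerical columns scaled by their known bounds. The `weight` parameter and the exact normalization are illustrative assumptions; the review does not reproduce Equation 4.

```python
import numpy as np

def mixed_distance(a_cat, b_cat, a_num, b_num, lo, hi, weight=1.0):
    """Hamming on categoricals + weight * bound-normalized squared Euclidean on numericals."""
    hamming = np.sum(np.asarray(a_cat) != np.asarray(b_cat))
    span = np.asarray(hi, dtype=float) - np.asarray(lo, dtype=float)
    scaled = (np.asarray(a_num, dtype=float) - np.asarray(b_num, dtype=float)) / span
    return float(hamming + weight * np.sum(scaled ** 2))

# Usage: one categorical mismatch, one numerical column bounded in [0, 10].
d = mixed_distance(["a", "x"], ["a", "y"], [0.0], [5.0], lo=[0.0], hi=[10.0], weight=2.0)
```

Every column contributes equally within its type, so irrelevant features add noise to the distance at the same rate as informative ones, which is the degradation the concern describes.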

3. Potential Impact

Practical significance:

The 28× speedup over AIM while achieving better accuracy on high-order datasets is compelling. Running entirely on CPUs without GPUs significantly lowers the barrier to adoption.

The identification that standard benchmarks mask a fundamental limitation of existing methods is an important methodological contribution that could redirect evaluation practices in the field.

The new benchmark suite (XOR, SCM simulations, curated high-order real-world datasets) fills a genuine evaluation gap.

Broader applicability:

Healthcare, finance, and other sensitive domains often have complex multi-feature interactions (e.g., drug interactions, financial fraud patterns) where high-order correlations are critical. Tab-PE directly addresses this need.

The demonstration that PE can work with trivially simple operators (no foundation models, no simulators) expands the conceptual scope of the PE framework.

Limitations in impact:

On standard low-order benchmarks, Tab-PE is ~1% behind AIM, which may limit adoption in settings where users don't know a priori whether their data has high-order correlations.

The gap to non-private upper bounds remains large (e.g., 30% accuracy gap on Artificial Characters), suggesting the method, while better than alternatives, is still far from solving the problem.

4. Timeliness & Relevance

The paper addresses a timely need. As DP synthetic data moves toward real-world deployment, the limitations of marginal-based methods become increasingly important. The PE framework has gained significant traction for images and text, and extending it to tabular data — the most common data modality in practice — is a natural and important step. The concurrent finding by Swanberg et al. (2025) that LLM-based PE for tabular data underperforms makes this contribution more valuable, as it shows that the right API design matters more than model sophistication.

5. Strengths & Limitations

Key strengths:

Clear identification of a systematic evaluation gap in the field

Elegant simplicity of the method — no foundation models, no training, CPU-only

Comprehensive experimental coverage across simulation and real-world settings

Strong computational efficiency advantages

Well-structured two-stage refinement with principled ablations

Open-source code

Notable weaknesses:

The method's advantage is largely confined to datasets with demonstrable high-order correlations; for the more common low-order case, it offers no improvement

The naive distance metric is a significant limitation for very high-dimensional or sparse data

The flattened MNIST experiment, while impressive, conflates "tabular" with "flattened image" — the practical relevance is questionable

Limited theoretical analysis of convergence or approximation quality beyond the privacy guarantee

The datasets with "high-order correlations" are somewhat cherry-picked; the paper would benefit from a more systematic survey of how common such correlations are in practice

Overall Assessment:

This is a solid contribution that identifies a real problem, proposes a clean and practical solution, and demonstrates its effectiveness convincingly. The impact is somewhat bounded by the specificity of the setting (high-order correlations) and the remaining accuracy gap to non-private baselines, but the efficiency gains and the reframing of evaluation practices could have lasting influence on the field.

Rating: 7.2/10

Significance 7.5 · Rigor 7 · Novelty 6.8 · Clarity 8

Generated Jun 9, 2026

Comparison History (19)

Won vs. Latent World Recovery for Multimodal Learning with Missing Modalities

Paper 2 addresses a critical bottleneck in sensitive data sharing across numerous fields (healthcare, finance, etc.) by advancing differentially private tabular data synthesis. Its approach successfully captures complex high-order correlations while delivering highly quantifiable and impressive improvements (up to 10% better accuracy and 28x faster than the state-of-the-art baseline). While Paper 1 offers a valuable methodology for missing modalities in multi-omics, Paper 2's broader applicability to virtually any domain utilizing sensitive tabular data, combined with its substantial scalability and efficiency gains, suggests a higher potential for widespread cross-disciplinary impact.

gemini-3.1-pro-preview·Jun 11, 2026

Lost vs. Bootstrapped Monitoring: Leveraging Transparent Reasoning to Oversee Stronger AI Agents

Paper 2 likely has higher impact due to its novelty and timeliness in AI safety/control, proposing a general oversight protocol (bootstrapped monitoring) relevant to rapidly advancing frontier agents. It targets a widely recognized real-world risk (monitoring capability gaps and collusion) and could influence both alignment research and deployment practices across domains where agents act. While Paper 1 is methodologically solid and practically useful for DP tabular synthesis, it is a more incremental advance within an established subfield with narrower cross-field reach than AI control paradigms.

gpt-5.2·Jun 11, 2026

Lost vs. Breaking the Tokenizer Barrier: On-Policy Distillation across Model Families

Paper 2 addresses a fundamental limitation in LLM knowledge distillation—the tokenizer barrier between model families—which has broad implications across the entire LLM ecosystem. Enabling cross-tokenizer on-policy distillation unlocks numerous teacher-student combinations previously impossible, with wide applicability in post-training pipelines. Paper 1, while solid and practical for DP synthetic tabular data, represents a more incremental extension of the Private Evolution framework to a specific data modality. Paper 2's impact spans more broadly across the rapidly growing LLM field and enables a paradigm shift in how distillation is conducted.

claude-opus-4-6·Jun 9, 2026

Lost vs. Assessing Sample Quality in Conditional Generation under Compositional Shift

Paper 1 addresses a fundamental bottleneck in applying generative AI to scientific discovery: evaluating generated samples in extrapolative regimes where no ground truth exists. By providing a novel, reference-free trust score, it unlocks broader applications in fields like biological imaging and materials science. While Paper 2 offers significant improvements in differentially private tabular data generation, Paper 1's conceptual innovation and direct relevance to accelerating empirical scientific research give it a higher potential for broad scientific impact.

gemini-3.1-pro-preview·Jun 9, 2026

Lost vs. When Do Local Score Models Extrapolate Across Size? A Diagnostic Theory and Benchmark

Paper 2 has higher potential impact due to its broadly applicable theoretical framing of size extrapolation in local score-based generative models, a key pain point in scientific ML (physics, chemistry, materials). It contributes new theory (quasi-locality via Gaussian-smoothed scores, size-uniform comparison theorem) plus a diagnostic benchmark (FDLF) with exact, controllable ground truth, enabling rigorous evaluation across methods. The insights can influence model design and evaluation standards across diffusion/score modeling. Paper 1 is valuable and practical for DP tabular synthesis, but is more application-narrow and more incremental relative to an active DP synthesis landscape.

gpt-5.2·Jun 9, 2026

Won vs. Lost in the Non-convex Loss Landscape: How to Fine-tune the Large Time Series Model?

Paper 2 likely has higher impact: it advances a timely, high-stakes area (differentially private synthetic tabular data) with broad applicability in healthcare, finance, and public-sector data sharing. Extending Private Evolution to tabular data while removing reliance on large foundation models is a notable innovation with clear practicality (much faster) and addresses an important unmet need (high-order correlations). The methodological framing (DP guarantees + extensive evaluation) and cross-field relevance of DP data release give it wider potential reach than Paper 1’s more domain-specific fine-tuning technique for large time-series models.

gpt-5.2·Jun 9, 2026

Won vs. The Confidence Trap: Calibration Attacks for Graph Neural Networks

Paper 2 addresses the broadly important problem of differentially private synthetic data generation for tabular data, which has wide real-world applications across healthcare, finance, and government. It extends the Private Evolution framework to a new domain with practical improvements (28x faster, 10% accuracy gain), addressing a gap in handling high-order correlations. Paper 1, while technically sound, targets a niche problem (calibration attacks on GNNs) with a narrower audience. Paper 2's combination of privacy guarantees, practical scalability, and broad applicability across sensitive data domains gives it higher potential impact.

claude-opus-4-6·Jun 9, 2026

Won vs. Causal Longitudinal Prior-Fitted Networks for Counterfactual Outcome Prediction

Paper 2 likely has higher impact: it advances a broadly applicable, timely problem (differentially private synthetic tabular data) with direct real-world utility across healthcare, finance, and public-sector data sharing. Methodologically, it extends an existing DP synthesis framework to a new data modality with a practical, compute-efficient design and reports substantial empirical gains (utility and speed) on real datasets, addressing high-order correlation shortcomings of prior work. Paper 1 is innovative but more specialized to longitudinal causal forecasting and depends on synthetic pretraining assumptions, likely narrowing immediate adoption.

gpt-5.2·Jun 9, 2026

Lost vs. Consistency Training Along the Transformer Stack

Paper 1 introduces a novel, unified framework for AI alignment through consistency training across transformer internals, addressing multiple safety threats with cross-threat generalization. This has broad implications for AI safety—a critically timely field—and provides both practical techniques and mechanistic insights. Paper 2 makes a solid contribution to DP synthetic tabular data but is more incremental, adapting an existing framework (Private Evolution) to a new modality. Paper 1's breadth of impact across alignment, interpretability, and safety, combined with its methodological novelty, gives it higher potential impact.

claude-opus-4-6·Jun 9, 2026

Lost vs. Tangram: Unlocking Non-Uniform KV Cache for Efficient Multi-turn LLM Serving

Paper 1 addresses a critical bottleneck in the highly active field of LLM deployment: KV cache memory and bandwidth pressure. By proposing a system that improves throughput by up to 2.6x while maintaining accuracy, it offers massive immediate economic and practical benefits for real-world AI applications. While Paper 2 presents a valuable contribution to privacy-preserving synthetic data, the scale, timeliness, and broader industry reliance on efficient LLM serving give Paper 1 a significantly higher potential for widespread scientific and technological impact.

gemini-3.1-pro-preview·Jun 9, 2026
