Minh-Khoi Pham, Luca Cotugno, Alina Sirbu, Tai Tan Mai, Martin Crane, Marija Bezbradica
Predicting time-to-event outcomes such as mortality is a fundamental task in clinical decision-making, commonly addressed through survival analysis. While classical statistical and deep learning approaches have been widely studied, they typically require task-specific training and sufficient labeled data. Recent advances in tabular foundation models offer a new paradigm by learning general-purpose representations for structured data. However, their applicability to censored time-to-event prediction in clinical settings remains underexplored, as typical applications are restricted to discrete classification rather than survival analysis tasks. In this work, we propose a lightweight adaptation approach for applying tabular foundation models to clinical survival analysis by directly training a survival-aware head on top of the pretrained representations. We study representative architectures, including TabPFN, TabDPT, and TabICL, and adapt them using a multi-task logistic regression (MTLR) head to model right-censored time-to-event outcomes. We evaluate this approach on a diverse set of public survival benchmarks and two large-scale ICU cohorts, MIMIC-IV and eICU. Our results show that this transfer learning approach achieves competitive or superior performance compared to strong baselines. On MIMIC-IV, TabDPT-FT-MTLR reaches a C-index of 0.856, corresponding to a relative improvement of +1.4% over the best non-FM baseline (DeepSurv, 0.844) and +6.7% over the best zero-shot model (0.802). On eICU, TabICL-FT-MTLR achieves 0.797, yielding gains of +1.7% (DeepSurv, 0.784) and +6.4% (0.749), respectively. These findings highlight the importance of combining pretrained tabular representations with survival-aware objectives and suggest that tabular foundation models provide a practical and effective alternative for clinical survival prediction.
This paper proposes a lightweight adaptation strategy for applying tabular foundation models (TabPFN, TabDPT, TabICL) to clinical survival analysis by attaching a multi-task logistic regression (MTLR) head on top of frozen pretrained representations. The key insight is that pretrained tabular representations, originally designed for classification/regression, can be effectively repurposed for censored time-to-event prediction without modifying backbone weights. The paper contrasts this survival-aware adaptation approach against (a) zero-shot reformulation where survival is treated as a sequence of binary classification tasks, and (b) traditional survival baselines trained from scratch.
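To make the "survival-aware head" concrete, the following is a minimal NumPy sketch of an MTLR-style negative log-likelihood with right-censoring, computed on top of fixed embeddings. It is an illustration under stated assumptions, not the authors' implementation: the embedding matrix `Z` is a random stand-in for backbone outputs, the time axis is assumed to be pre-discretized into bins, and the names `mtlr_bin_log_probs` and `mtlr_nll` are hypothetical.

```python
import numpy as np

def mtlr_bin_log_probs(logits):
    """Map per-cutpoint scores (n, m) to log-probabilities over m + 1 time bins."""
    # Score of "event in bin k" is the sum of logits k..m-1; the final bin
    # (survival past every cutpoint) gets a score of 0.
    rev_cumsum = np.cumsum(logits[:, ::-1], axis=1)[:, ::-1]
    scores = np.concatenate([rev_cumsum, np.zeros((logits.shape[0], 1))], axis=1)
    # Numerically stable log-softmax over the bins.
    scores = scores - scores.max(axis=1, keepdims=True)
    return scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))

def mtlr_nll(logits, bin_idx, event):
    """Mean negative log-likelihood; event=1 is observed, event=0 is right-censored."""
    log_p = mtlr_bin_log_probs(logits)
    bins = np.arange(log_p.shape[1])[None, :]
    exact = bins == bin_idx[:, None]          # observed: event falls in this bin
    at_or_after = bins >= bin_idx[:, None]    # censored: event in this bin or later
    mask = np.where(event[:, None].astype(bool), exact, at_or_after)
    # Log-sum-exp over the bins consistent with each observation.
    masked = np.where(mask, log_p, -np.inf)
    mx = masked.max(axis=1, keepdims=True)
    log_lik = mx[:, 0] + np.log(np.exp(masked - mx).sum(axis=1))
    return -log_lik.mean()

# Hypothetical stand-in for embeddings from a frozen tabular backbone.
rng = np.random.default_rng(0)
Z = rng.normal(size=(5, 8))                   # 5 patients, 8-dim embeddings
W, b = np.zeros((8, 3)), np.zeros(3)          # linear head, 3 cutpoints -> 4 bins
bin_idx = np.array([0, 1, 2, 3, 1])           # discretized event/censoring bin
event = np.array([1, 1, 0, 1, 0])             # 0 marks right-censored patients
loss = mtlr_nll(Z @ W + b, bin_idx, event)
print(f"MTLR NLL with an untrained head: {loss:.4f}")
```

In this setup only `W` and `b` would be optimized, which is what keeps the adaptation lightweight; a censored patient contributes the total probability mass of all bins at or after the censoring bin rather than being discarded.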
The contribution is primarily one of integration rather than fundamental novelty: it combines two existing components (tabular FMs and MTLR heads) in a sensible way. However, the systematic evaluation and the demonstration that this simple combination works well across diverse clinical settings provide practical value.
The paper addresses a genuine practical need: simplifying the deployment of survival models in clinical settings where labeled data may be limited and modeling expertise scarce. The "freeze backbone, train lightweight head" paradigm is appealing for clinical deployment due to:
1. Reduced computational overhead compared to end-to-end deep survival models
2. Simplified hyperparameter tuning since only the head requires optimization
3. Potential for rapid adaptation to new clinical cohorts
However, the impact is somewhat bounded by several factors. The improvements over well-tuned DeepSurv are modest (+1.4% relative C-index on MIMIC-IV and +1.7% on eICU), and the approach still requires some labeled survival data for head training, which limits its advantage over standard transfer learning approaches. The zero-shot setting, which would be most impactful for truly data-scarce scenarios, performs noticeably worse than the adapted version (0.802 vs. 0.856 on MIMIC-IV and 0.749 vs. 0.797 on eICU).
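The relative gains discussed above can be reproduced directly from the C-index values reported in the abstract. The short check below is only arithmetic on those published numbers; the helper name `relative_gain_pct` is hypothetical.

```python
def relative_gain_pct(new, ref):
    """Relative improvement of `new` over `ref`, in percent."""
    return 100.0 * (new - ref) / ref

# C-index values as reported in the abstract: (adapted FM, DeepSurv, best zero-shot).
reported = {
    "MIMIC-IV": (0.856, 0.844, 0.802),
    "eICU": (0.797, 0.784, 0.749),
}

for cohort, (adapted, deepsurv, zero_shot) in reported.items():
    print(
        f"{cohort}: {relative_gain_pct(adapted, deepsurv):+.1f}% vs. DeepSurv, "
        f"{relative_gain_pct(adapted, zero_shot):+.1f}% vs. zero-shot"
    )
```

In absolute terms the gap to DeepSurv is about 0.012 to 0.013 C-index points, while the gap to the zero-shot models is roughly four times larger.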
The clinical risk stratification analysis (Figure 1) is a strength that demonstrates practical utility beyond aggregate metrics, showing clearer separation of risk groups with survival-aware adaptation.
The paper is timely in two respects: (1) tabular foundation models are rapidly gaining traction (TabPFN, TabDPT, and TabICL are all recent), and (2) there is growing interest in applying foundation model paradigms to clinical prediction tasks. The intersection of these trends, adapting tabular FMs specifically for survival analysis, is underexplored, making this a relevant contribution.
The concurrent work by Kim et al. (2026) on reformulation-based approaches and Seletkov et al. (2026) on Survival In-Context suggests this is an active research front. This paper's positioning as a simpler alternative to specialized pretraining (SIC) or temporal expansion (Kim et al.) is reasonable, though the inability to compare against SIC due to lack of public implementation is a limitation.
The paper's framing around "foundation models" should be interpreted carefully. The tabular FMs used here (especially TabPFN) are pretrained on synthetic data, not on clinical data. That representations learned from synthetic data transfer to real clinical tasks is interesting, but the mechanism remains unexplained. The observation that "much of the difficulty in clinical survival analysis lies in representation learning rather than survival-specific loss design" is intriguing but not rigorously substantiated.
AIiH 2026, a workshop/conference venue, is appropriate for the contribution level. This work serves as a useful empirical study establishing that tabular FMs can work for survival analysis, laying groundwork for more sophisticated approaches.
Generated Jun 11, 2026
Paper 2 addresses a critical methodological gap in generative models for 3D molecular generation by introducing a principled uncertainty estimation method. This has profound implications for AI-driven drug discovery, allowing for better quality control and test-time scaling. While Paper 1 presents a valuable clinical application, Paper 2 offers higher methodological innovation and broader potential impact across the rapidly growing intersection of generative AI and computational chemistry.
Paper 2 is likely higher-impact: it tackles a timely, broadly relevant bottleneck (scalable LLM evaluation) with a principled uncertainty-aware ranking method combining probabilistic Bradley–Terry/Elo and conformal prediction with distribution-free coverage guarantees. Its applicability extends across model benchmarking, alignment, and product evaluation, and it directly addresses known judge biases/miscalibration. Paper 1 is solid and practical for clinical survival prediction, but is a narrower domain adaptation of existing tabular foundation models with incremental performance gains, thus likely more limited in cross-field impact.
Paper 2 has higher likely scientific impact due to broader relevance and conceptual novelty: it provides mechanistic insights into how on-policy distillation changes parameters (sparsity patterns, optimizer interactions, and geometric structure) across multiple LM/VLM settings, with actionable implications (subnetwork training, optimizer choice) for widely used post-training pipelines. This general analysis can influence practice across many domains using foundation models. Paper 1 is timely and useful for clinical survival prediction, but its contribution is a comparatively straightforward head adaptation with incremental performance gains and narrower domain scope.
Paper 1 offers a fundamentally novel geometric framework for understanding phase transitions in diffusion/flow-matching models, connecting caustic theory to generative AI dynamics. This theoretical contribution has broad implications across generative modeling, providing both conceptual understanding and practical tools (CBD). Paper 2, while methodologically sound and clinically relevant, represents an incremental adaptation: it applies existing tabular foundation models to survival analysis with a known MTLR head. The novelty is limited to the combination rather than new theory. Paper 1's theoretical depth and breadth of impact across the rapidly growing generative AI field give it higher potential impact.
Paper 1 provides a highly concrete, timely adaptation of tabular foundation models for clinical survival analysis, a critical healthcare domain. It demonstrates strong methodological rigor with specific, quantitative improvements on major datasets (MIMIC-IV, eICU). Paper 2 proposes a general loss function with broad potential but lacks quantitative evidence in the abstract, making its actual impact more speculative.
Paper 2 likely has higher scientific impact due to stronger real-world applicability (clinical survival prediction), clear methodological contribution (survival-aware adaptation of tabular foundation models with MTLR for censoring), and broader immediate utility across healthcare and tabular ML. It is timely given growing interest in foundation models beyond text, and it validates on large, widely used ICU cohorts (MIMIC-IV, eICU) plus public benchmarks, suggesting robustness. Paper 1 is novel and relevant for AI alignment interpretability, but its impact may be narrower and more exploratory, with less direct deployment pathway.
Paper 2 introduces a more novel conceptual framework (Chain of Operators) that draws an innovative analogy between prompt engineering in LLMs and operator learning, enabling OOD generalization without retraining. This cross-pollination of ideas between foundation model prompting strategies and scientific computing/PDEs is highly innovative and has broader potential impact across computational science. Paper 1, while rigorous and practically useful, represents a relatively incremental adaptation (adding a survival head to existing tabular foundation models), combining known components rather than introducing a fundamentally new paradigm.
Paper 2 identifies and characterizes a general failure mode of in-context learning for structured data (“categorical prior lock-in”), with implications for any LLM-based conditional generation under distribution shift. This is novel, timely, and broadly impactful across ML, data synthesis, evaluation, privacy, and deployment, and it frames an important trade-off between adaptability and memorization risk. Paper 1 is practically useful for clinical survival prediction but is more incremental (adapting existing tabular foundation models with a known survival head) and its impact is narrower to survival/tabular transfer learning.
Paper 1 introduces a highly novel, interdisciplinary paradigm connecting physical oscillator dynamics to transformer attention, enabling low-power neuromorphic hardware implementations. Its theoretical depth and potential to shift paradigms in AI hardware give it a broader, more profound scientific impact compared to Paper 2's incremental, domain-specific application of tabular models to clinical survival analysis.
Paper 2 has higher potential impact due to a more novel, general framework for mechanistic/explicit behavioral modeling that integrates adaptive questioning and world-model probes directly into training. If validated, this could influence multiple areas (RL, interpretability, agent evaluation, world models, debugging and adaptation) with broad applicability beyond a single domain. Paper 1 is timely and practically useful for clinical survival analysis, but it is a comparatively incremental adaptation (pretrained tabular encoders + survival head) with narrower cross-field impact and limited methodological novelty relative to existing transfer-learning paradigms.