Kexuan Zhang, Xiaobei Zou, Cesare Alippi, Gary G. Yen, Yang Tang
Recent advances in Large Language Models (LLMs) have opened new possibilities for time series forecasting by enabling alignment between temporal patterns and pretrained word embeddings. However, most LLM-based methods overlook the heterogeneous nature of time series, where dynamic fluctuations and invariant semantics are entangled. This entanglement introduces spurious correlations during the alignment, as dynamic components act as confounders by simultaneously influencing invariant components and the resulting aligned embeddings. To address this issue, a variable-level alignment framework CVAformer is proposed. CVAformer explicitly disentangles each variable into invariant and dynamic components just before alignment, and applies causal intervention to mitigate the confounding effect of the dynamics. To better support variable-level alignment, CVAformer replaces the standard causal attention in LLMs with a non-causal attention mechanism that captures interactions among variables at each time step. Extensive experiments across long-term, short-term, few-shot, and zero-shot forecasting settings indicate that CVAformer matches or exceeds state-of-the-art performance on most datasets, and in some cases achieves notably better accuracy. Experimental results validate the effectiveness of variable-level alignment and dynamic disentanglement in CVAformer, offering a new perspective for LLM-based time series tasks.
CVAformer introduces a variable-level alignment framework for LLM-based time series forecasting that addresses the entanglement between invariant semantics and dynamic fluctuations during cross-modal alignment. The paper makes three interrelated contributions: (1) a causal formulation treating dynamic components as confounders in the alignment process, with backdoor adjustment to debias the alignment; (2) a decomposition mechanism separating invariant and dynamic components of each variable; and (3) replacement of causal (autoregressive) attention with non-causal attention to properly handle unordered inter-variable dependencies.
The problem formulation is conceptually appealing. The observation that dynamic fluctuations act as confounders causing "semantic anchor drift" when aligning time embeddings with word embeddings is a novel framing. The structural causal model (SCM) in Figure 1 provides an intuitive explanation for why entangled embeddings produce unstable alignments.
Causal Framework: The backdoor adjustment derivation (Equations 4-7) is mathematically sound in principle. However, the actual implementation departs considerably from the formal causal framework. The "causal intervention" is approximated via a CausalEncoder that computes a summary statistic from concatenated mean embeddings and covariance matrices, followed by a soft gating mechanism. This approximation leaves a significant gap between the theoretical motivation and the practical implementation. The paper neither provides formal guarantees that the approximation satisfies the conditions required for valid backdoor adjustment nor empirically verifies that the confounding effect is eliminated rather than merely attenuated.
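For reference, the textbook backdoor adjustment that a derivation of this kind builds on can be written as below. The symbols are assumed for illustration and may not match the paper's notation: $X$ is the invariant component, $D$ the dynamic component acting as confounder, and $Y$ the aligned embedding.

```latex
P\bigl(Y \mid \mathrm{do}(X)\bigr) \;=\; \sum_{d} P\bigl(Y \mid X,\, D = d\bigr)\, P\bigl(D = d\bigr)
```

The formula is valid only if $D$ blocks every backdoor path from $X$ to $Y$ and contains no descendant of $X$; these are the conditions that a learned gating approximation would need to be checked against.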
Decomposition: The separation into invariant and dynamic components relies on CausalCov blocks and an MLP, regularized by a MoCo-style contrastive loss. While reasonable, the paper does not provide evidence that the decomposition actually succeeds in separating these semantically distinct components (e.g., through visualization of decomposed components or quantitative metrics of disentanglement quality).
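A MoCo-style contrastive regularizer is typically an InfoNCE loss over one positive key and a queue of negative keys. The following is a minimal sketch of that generic loss, not the paper's implementation; the function name `info_nce`, the temperature of 0.07, and the choice of which embeddings act as positives and negatives are all assumptions.

```python
import numpy as np

def info_nce(query, pos_key, neg_keys, temperature=0.07):
    """Generic InfoNCE loss for one query (hypothetical helper, not from the paper).

    query:    (d,)   embedding of the anchor
    pos_key:  (d,)   embedding that should match the anchor
    neg_keys: (k, d) queue of embeddings that should not match
    """
    # L2-normalise so that dot products are cosine similarities.
    q = query / np.linalg.norm(query)
    k_pos = pos_key / np.linalg.norm(pos_key)
    k_neg = neg_keys / np.linalg.norm(neg_keys, axis=1, keepdims=True)

    # The positive logit goes first, followed by the negative logits.
    logits = np.concatenate(([q @ k_pos], k_neg @ q)) / temperature
    logits = logits - logits.max()  # numerical stability

    # Cross-entropy with the positive at index 0.
    log_prob = logits - np.log(np.exp(logits).sum())
    return -log_prob[0]
```

A low loss only shows that the anchor is closer to its positive than to the queued negatives. It does not by itself demonstrate that invariant and dynamic components have been separated, which is why the review asks for direct evidence of disentanglement quality.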
Experimental Design: The experiments are comprehensive, covering long-term, short-term, few-shot, and zero-shot settings across standard benchmarks. Standard deviation is reported for CVAformer but not for baselines, making statistical significance difficult to assess. The comparison includes both LLM-based and traditional deep learning baselines, which is appropriate.
The paper addresses a genuine limitation in LLM-based time series forecasting: the naive alignment of temporal embeddings with linguistic embeddings without accounting for the heterogeneous nature of time series signals. This problem will grow in importance as more researchers attempt to leverage LLMs for time series tasks.
The variable-level alignment paradigm offers a conceptually cleaner alternative to patch-level or time-step-level tokenization. The non-causal attention modification is a simple but principled design choice that could be broadly adopted. However, the causal disentanglement component, while theoretically motivated, adds considerable complexity and its practical benefits appear modest in several benchmarks.
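The difference between causal and non-causal attention over variable tokens can be illustrated with a small sketch. This is a simplified single-head self-attention with no learned projections (queries, keys, and values are all the raw tokens), which is an assumption made for brevity; the function name `attention` is hypothetical and not taken from the paper.

```python
import numpy as np

def attention(x, causal):
    """Single-head self-attention over tokens x of shape (n_tokens, d).

    Each row of x stands for one variable's embedding. Returns the attended
    outputs and the attention weight matrix.
    """
    n, d = x.shape
    scores = x @ x.T / np.sqrt(d)
    if causal:
        # Autoregressive mask: token i may only attend to tokens j <= i.
        mask = np.triu(np.ones((n, n), dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    # Row-wise softmax; the diagonal is never masked, so each row max is finite.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x, weights
```

Without the mask, permuting the variable tokens simply permutes the outputs, which suits an unordered set of variables. With the mask, each variable can attend only to those placed before it in what is an arbitrary ordering.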
The framework's compatibility with different LLM backbones (GPT-2, BERT, DeepSeek) is a strength that enhances its potential for adoption, though the experiments with alternative backbones are limited to three datasets.
The paper is well-timed, sitting at the intersection of two active research areas: LLM adaptation for non-NLP domains and time series forecasting. The growing interest in foundation models for time series makes this contribution relevant. The causal perspective on alignment is timely given increasing attention to causal reasoning in machine learning.
The computational efficiency analysis (Figure 8) is informative but limited to one dataset/setting. The claim of "moderate training time" holds only in relative terms: CVAformer is significantly slower than DLinear but faster than TimeLLM. The dual-branch architecture (temporal + textual) with three loss terms adds complexity that may limit adoption compared to simpler approaches.
The paper's contribution is incremental but meaningful within the LLM-for-time-series niche. The causal framing is the main novelty, but the gap between theory and implementation somewhat diminishes its impact. The non-causal attention modification, while less novel, may prove to be the more practically influential contribution.
Generated Jun 9, 2026
Paper 2 offers fundamental theoretical insights into neural network training dynamics, linking physics concepts (conservation laws) with deep learning. While Paper 1 provides a useful methodological improvement for time series forecasting, Paper 2's rigorous mathematical proofs and introduction of 'tensorizable networks' have broader foundational impact, offering a deeper understanding that applies across various deep learning architectures and tasks.
Paper 2 is likely to have higher impact due to a clearer methodological contribution: the first dynamic assortment model learning unknown choice parameters on both platform sides, with polylogarithmic regret and a matching lower bound (rate-optimality), indicating strong rigor and theoretical significance. Its applications to two-sided marketplaces (gig platforms, retail marketplaces, ad exchanges) are direct and broad within operations research, economics, and online learning. Paper 1 is timely and applied, but impact may be narrower and more incremental in a crowded LLM-time-series space, with weaker guarantees and potentially harder-to-validate causal claims.
Paper 1 addresses a highly active and widely applicable area by leveraging LLMs for time series forecasting. Its novel approach to disentangling dynamic and invariant semantics, combined with strong empirical results across various forecasting settings, promises immediate practical utility and broad impact across multiple domains. While Paper 2 provides rigorous theoretical contributions to learning theory, its impact is likely confined to a narrower academic community compared to the ubiquitous real-world applications of time series forecasting.
Paper 2 addresses a more broadly impactful problem at the intersection of LLMs and time series forecasting—two highly active research areas. Its causal disentanglement framework (CVAformer) introduces novel theoretical contributions (causal intervention for confounding in alignment) applicable across multiple forecasting settings (long-term, short-term, few-shot, zero-shot). The methodology is more generalizable and timely given the surge in LLM adaptation research. Paper 1, while practically useful for e-commerce, addresses a narrower application domain with more incremental algorithmic contributions (biclustering, greedy search, bandits).
Paper 1 addresses a fundamental bottleneck in generative AI for scientific discovery: evaluating novel generated samples without a ground truth reference. Its framework enables reliable exploration of unobserved conditions, with demonstrated impact in biological imaging. This solves a broader and more foundational problem in AI-driven science compared to Paper 2, which focuses on improving LLM-based time series forecasting.
Paper 2 likely has higher impact due to its broader applicability and timeliness: improving LLM-based time series forecasting can affect many domains (finance, energy, healthcare, operations) and intersects NLP, causal inference, and forecasting. Its causal semantic alignment and variable-level disentanglement address a widely recognized issue (spurious correlations/confounding) and could generalize to other multimodal/sequence alignment problems. Paper 1 is novel and important for safety-critical RL, but targets a narrower niche (offline safe RL under poisoning) and may see slower real-world uptake due to deployment barriers.
Paper 2 addresses a more impactful problem at the intersection of LLMs and time series forecasting, which is a highly active research area. Its novel causal intervention framework for disentangling invariant and dynamic components offers broader methodological contributions applicable across multiple forecasting settings (long-term, short-term, few-shot, zero-shot). Paper 1, while solid, applies an existing MGN framework to structural mechanics with relatively limited training data and represents more of an incremental extension. Paper 2's broader applicability across domains and its contribution to the rapidly growing LLM adaptation literature gives it higher impact potential.
Paper 2 offers a foundational, modality-agnostic framework that solves a major memory bottleneck in neural fields. Its massive efficiency gains (42x less memory) and demonstrated applicability across diverse domains (images, 3D shapes, climate fields) give it broader scientific impact and generalization potential compared to Paper 1's more narrowly focused domain of time-series forecasting.
Paper 1 addresses the broadly impactful problem of LLM-based time series forecasting with a novel causal framework (CVAformer) that disentangles invariant and dynamic components, applying causal intervention to mitigate confounding. It demonstrates strong results across multiple forecasting settings (long-term, short-term, few-shot, zero-shot), suggesting wide applicability. Paper 2, while practically useful, addresses the narrower niche of Text-to-Cypher benchmark generation for enterprise property graphs. Paper 1's methodological contributions (causal disentanglement, variable-level alignment, non-causal attention) have broader theoretical and cross-domain implications.
Paper 2 is more novel and timely, proposing a causal-intervention-based disentanglement framework for LLM-based time-series forecasting—a rapidly growing area with broad cross-domain relevance. Its variable-level alignment and architectural change (non-causal attention for inter-variable interactions) offer a general method applicable to many forecasting settings (few/zero-shot, long/short horizon), increasing real-world impact. Paper 1 is a careful, rigorous replication/diagnostic study in a narrow airline-profit context; it strengthens methodological understanding (PCA/kPCA, clustering validity) but is less broadly innovative and has more limited applicability beyond similar clustering analyses.