Uncertainty Aware Functional Behavior Prediction and Material Fatigue Assessment for Circular Factory

Nehal Afifi, Mehdi Khabou, Victor Mas, Jonas Hemmerich, Patric Grauberger, Stefan Dietrich, Volker Schulze, Sven Matthiesen

Jun 3, 2026

arXiv:2606.05334v1

cs.AI (primary)

#2603 of 3355 · Artificial Intelligence

Tournament Score: 1330 ± 46 (range shown: 1050 to 1800)

Win Rate: 44%

Rating: 4.8 / 10

Significance 5.5 · Rigor 4.5 · Novelty 5.5 · Clarity 6.5

Abstract

Returned products in circular factories re-enter production with heterogeneous degradation states, usage histories, and remaining capability. Reuse cannot be decided from the current inspection alone, because future function fulfillment and component integrity may evolve differently under the next service scenario. Existing PHM approaches support degradation prediction, but often target fixed operating conditions or isolated component benchmarks, while material-fatigue assessment is rarely linked to system-level functional prognosis. This paper addresses this gap for an angle grinder by combining uncertainty-aware functional prediction with component-level fatigue assessment in an instance-specific reliability workflow. The proposed framework combines the current tool state with recent force–torque usage windows. A convolutional encoder extracts loading patterns from spindle forces and shaft torque, and an LSTM backbone predicts nine functional variables as Gaussian mean and variance estimates. In parallel, the same loading history is translated into output-shaft fatigue information through finite-element-supported stress reconstruction, S–N/Miner damage evaluation with Haibach extension, and Paris-law crack-growth analysis. A streaming replay algorithm consolidates both branches into functional, material, and system reliability trajectories. Held-out tests show mean 2%-tolerance accuracy of 0.9652 across nine outputs. Thermal variables are predicted near-perfectly, while drive motor current and load speed remain the most demanding dynamic outputs, with R² values of 0.9750 and 0.9924. Torque history is especially important for these variables, and the conventional LSTM outperforms GRU and xLSTM in the short-history setting. Reliability calibration is most informative for drive motor current, where predicted and observed exceedance probabilities ...

AI Impact Assessments

(1 model)

Scientific Impact Assessment

1. Core Contribution

This paper presents a dual-branch framework for assessing returned angle grinders in a circular factory context. The first branch uses a CNN-LSTM architecture with Gaussian negative log-likelihood training to predict nine functional variables (thermal, electrical, rotational, geometric) from the current tool state and recent force–torque usage windows. The second branch translates the same loading history into output-shaft fatigue information via FE-supported stress reconstruction, S–N/Miner damage accumulation with Haibach extension, and Paris-law crack growth analysis. A streaming replay algorithm (Algorithm 3) consolidates both branches into functional, material, and system reliability trajectories for instance-specific redeployment decisions.
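To make the Gaussian negative log-likelihood objective concrete, the following is a minimal sketch in plain Python. It is not the authors' training code: the function name `gaussian_nll` and the variance floor `eps` are assumptions introduced here for illustration, and the paper's network would supply the mean and variance per output variable.

```python
import math


def gaussian_nll(y_true, mean, var, eps=1e-6):
    """Average Gaussian negative log-likelihood over paired samples.

    y_true, mean and var are equal-length sequences of floats; var is the
    predicted variance (not the standard deviation). eps is a hypothetical
    floor that keeps the logarithm and the division finite.
    """
    total = 0.0
    for y, mu, v in zip(y_true, mean, var):
        v = max(v, eps)
        total += 0.5 * (math.log(2.0 * math.pi * v) + (y - mu) ** 2 / v)
    return total / len(y_true)


# Same error of 1.0, once with a small and once with a large predicted variance.
confident = gaussian_nll([1.0], [0.0], [0.01])
hedged = gaussian_nll([1.0], [0.0], [1.0])
```

The loss rewards a small variance only when the mean is accurate, so an overconfident wrong prediction (`confident`) costs more than one that admits uncertainty (`hedged`). This is what allows the predicted variance to be read as an uncertainty estimate downstream.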

The main novelty lies in the integration of data-driven functional prognosis with physics-based material fatigue assessment using shared operational loading data, framed specifically for circular manufacturing reuse decisions. This is a genuinely underexplored intersection—most PHM work targets isolated component degradation or fixed operating conditions, while material fatigue models rarely connect to system-level functional prognosis.

2. Methodological Rigor

Functional Branch: The CNN-LSTM architecture is straightforward but appropriate. The convolutional encoder for local temporal patterns in force–torque windows feeding into an LSTM for sequential context is a reasonable design choice. The increment-based prediction formulation (predicting changes rather than absolute values) is a sensible engineering decision that anchors forecasts to known initial states.

The ablation studies are well-structured:

Input ablation (Table 3) clearly demonstrates that torque history drives the largest accuracy gains for dynamic outputs (+0.2438 for drive motor current, +0.3397 for load speed)

Backbone comparison (LSTM vs. GRU vs. xLSTM) provides useful practical guidance, though the finding that conventional LSTM outperforms xLSTM in short-history settings is context-dependent and should not be over-generalized

Material Branch: The fatigue assessment follows established engineering practice (Basquin S–N, Palmgren–Miner, Haibach, Paris law). The FE-based stress reconstruction using Latin hypercube sampling of load components is methodologically sound. However, the material parameters are taken from literature (Li et al., 2023) rather than experimentally calibrated for the specific shaft specimens, introducing uncertainty that is acknowledged but not quantified.
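The S–N/Miner step with Haibach extension can be sketched as follows. This is a generic textbook formulation, not the paper's implementation: the knee-point cycle count `N_D`, the slope exponent `K`, and the use of the 468 MPa endurance limit as the knee-point stress are assumptions made here. Only the 468 MPa and 2.88 MPa values come from the assessment text.

```python
def cycles_to_failure(stress_amp, s_d, n_d, k):
    """Allowable cycles at a stress amplitude (same unit as s_d).

    At or above the knee point (s_d, n_d) the Basquin line with slope
    exponent k applies. Below it, the Haibach extension continues with the
    shallower exponent 2k - 1 instead of assuming infinite life.
    """
    if stress_amp <= 0.0:
        return float("inf")
    exponent = k if stress_amp >= s_d else 2.0 * k - 1.0
    return n_d * (stress_amp / s_d) ** (-exponent)


def miner_damage(spectrum, s_d, n_d, k):
    """Palmgren-Miner sum over (stress_amplitude, cycle_count) pairs."""
    return sum(n / cycles_to_failure(s, s_d, n_d, k) for s, n in spectrum)


# N_D and K are hypothetical; S_D reuses the endurance limit quoted in the text.
S_D, N_D, K = 468.0, 1.0e6, 5.0
low_load = miner_damage([(2.88, 1.0e4)], S_D, N_D, K)    # far below the knee
high_load = miner_damage([(500.0, 1.0e4)], S_D, N_D, K)  # just above the knee
```

With amplitudes two orders of magnitude below the knee, the Haibach branch yields a vanishing damage sum, whereas the same cycle count slightly above the knee consumes a visible fraction of life. The exact magnitude depends entirely on the assumed `N_D` and `K`.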

Significant weaknesses in validation:

The functional model is validated on controlled test-bench data with a single repeating 100 s load cycle, which is not representative of heterogeneous real-world usage

Only a single angle grinder's degradation trajectory is used (400 hours, 5 inspection points)

The 80/10/10 file-level split from this limited dataset raises concerns about generalization

The material branch produces negligible Miner damage (~10⁻²⁵) because service stresses (~2.88 MPa) are far below the endurance limit (468 MPa), meaning the integration is demonstrated algorithmically but not validated under meaningful material degradation conditions

3. Potential Impact

The framework addresses a real industrial need: circular manufacturing requires instance-specific reuse decisions that combine functional and structural perspectives. The concept of a unified functional–material reliability space is valuable for the emerging circular factory paradigm.

Practical applicability is currently limited by:

Dependence on controlled test-bench data rather than field data

The need for FE models and material characterization for each component type

The requirement for force–torque measurement infrastructure during operation

Validation on only one product type (angle grinder)

The Paris-law sensitivity analysis showing that amplifying the upper 10% of stress amplitudes by 1.6× reduces reuse from 31 to 3 cycles is a practically important finding—it demonstrates that rare high-load events dominate reusability, which has direct implications for how usage histories should be recorded and assessed.
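The mechanism behind this sensitivity can be illustrated with a small cycle-by-cycle Paris-law integration. The sketch below is not the paper's analysis and does not attempt to reproduce the 31-to-3 figure: the load spectrum, crack sizes, Paris constants `c` and `m`, and the geometry factor `y` are all hypothetical values chosen so that the example runs quickly.

```python
import math


def amplify_upper_tail(stress_ranges, fraction=0.10, factor=1.6):
    """Scale the largest `fraction` of stress ranges by `factor`.

    Ties at the threshold are all amplified, so the example spectrum below
    uses distinct values.
    """
    n_top = max(1, int(round(fraction * len(stress_ranges))))
    threshold = sorted(stress_ranges)[-n_top]
    return [s * factor if s >= threshold else s for s in stress_ranges]


def blocks_to_critical(stress_ranges, a0, a_crit, c, m, y=1.0, max_blocks=100_000):
    """Count complete load blocks survived before the crack reaches a_crit.

    Integrates da/dN = c * (dK)^m one cycle at a time, with
    dK = y * d_sigma * sqrt(pi * a). Units: MPa, metres, MPa*sqrt(m).
    """
    a = a0
    for block in range(max_blocks):
        for d_sigma in stress_ranges:
            d_k = y * d_sigma * math.sqrt(math.pi * a)
            a += c * d_k ** m
            if a >= a_crit:
                return block
    return max_blocks


# Hypothetical spectrum of 100 stress ranges from 50 to 149 MPa per block.
spectrum = [50.0 + i for i in range(100)]
params = dict(a0=1.0e-3, a_crit=1.0e-2, c=1.0e-10, m=3.0)

baseline = blocks_to_critical(spectrum, **params)
amplified = blocks_to_critical(amplify_upper_tail(spectrum), **params)
```

Because growth per cycle scales with the stress range raised to the power `m`, scaling only the top tenth of the spectrum shortens the number of survivable blocks disproportionately. That is consistent with the observation that rare high-load events dominate reusability.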

4. Timeliness & Relevance

The paper is timely given growing interest in circular economy, remanufacturing, and sustainable manufacturing. The CRC 1574 "Circular Factory for the Perpetual Product" context is well-motivated. However, the gap between the vision (heterogeneous returned products, real redeployment decisions) and the current validation (single tool, controlled conditions, no actual degradation-critical material data) is substantial.

The linking of PHM with material science for reuse decisions is a genuine research gap that few papers have addressed. The formalization of redeployment as an admissibility problem in a functional–material reliability space is a useful conceptual contribution.

5. Strengths & Limitations

Key Strengths:

Well-formulated problem at the intersection of PHM, reliability engineering, and circular manufacturing

Clean algorithmic presentation (Algorithms 1–3) with clear separation of concerns

Uncertainty-aware prediction with demonstrated calibration quality (ECE < 0.01 for drive motor current)

Comprehensive ablation studies that yield actionable design insights

The streaming replay algorithm is a practical contribution for deployment

Honest discussion of limitations, particularly the lack of fatigue-critical validation data
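For readers unfamiliar with the calibration metric cited above, a binned expected calibration error (ECE) can be computed as in the sketch below. The equal-width binning with ten bins is an assumption; the assessment does not state which binning scheme the paper uses, so values from this sketch are not directly comparable to the reported ECE.

```python
def expected_calibration_error(pred_probs, observed, n_bins=10):
    """Binned ECE between predicted probabilities and 0/1 outcomes.

    Each non-empty bin contributes |mean predicted - observed frequency|,
    weighted by the share of samples that fall into it.
    """
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(pred_probs, observed):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, o))
    total = len(pred_probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        freq = sum(o for _, o in members) / len(members)
        ece += len(members) / total * abs(mean_p - freq)
    return ece


# Predicted exceedance probability 0.25, and one of four events exceeds.
well_calibrated = expected_calibration_error([0.25] * 4, [1, 0, 0, 0])
# Predicted certainty of exceedance, but no event exceeds.
miscalibrated = expected_calibration_error([1.0, 1.0], [0, 0])
```

A low ECE only says that predicted and observed exceedance frequencies agree on average within bins; it is informative only for outputs that actually produce exceedance events.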

Notable Limitations:

Single product instance with controlled, repetitive loading—no heterogeneity validation

Material branch produces no meaningful degradation signal under tested conditions

The "integration" is primarily algorithmic (shared input data, min-based reliability consolidation) rather than demonstrating physical coupling between functional and material degradation

Nine output variables are predicted, but only one (drive motor current) has sufficient exceedance events for reliability calibration

The claim of "first integrated assessment framework" may be overstated given the limited validation scope

Right pinion clearance has R² = 0.3381, indicating the model explains very little variance for some outputs

No comparison with existing PHM frameworks or alternative integration approaches

Reproducibility concerns: data appears proprietary, no code availability mentioned

Overall Assessment:

This paper makes a conceptually valuable contribution by framing the circular factory reuse problem as joint functional–material reliability assessment and providing an end-to-end computational framework. The functional prediction results are solid within their limited scope, and the material sensitivity analysis yields practical insights. However, the validation falls short of demonstrating the framework's value proposition—the material branch never encounters meaningful fatigue conditions, only one product instance is tested under controlled conditions, and the "integration" amounts to taking the minimum of two reliability values that never deviate from 1.0. The paper represents a promising proof-of-concept rather than a validated methodology, and its impact will depend heavily on follow-up work with more challenging and heterogeneous datasets.
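The min-based consolidation criticized above can be sketched in a few lines. Everything except the final `min` is an assumption made for illustration: the assessment does not give the paper's formulas, so the interval-based functional reliability and the linear damage margin below are hypothetical stand-ins.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def functional_reliability(mean, var, lower, upper):
    """Probability that a Gaussian prediction lies inside [lower, upper]."""
    std = math.sqrt(var)
    return normal_cdf(upper, mean, std) - normal_cdf(lower, mean, std)


def material_reliability(damage, critical_damage=1.0):
    """Hypothetical linear margin: 1 at zero damage, 0 at the critical sum."""
    return max(0.0, 1.0 - damage / critical_damage)


def system_reliability(r_functional, r_material):
    """Weakest-link consolidation: the lower branch value governs."""
    return min(r_functional, r_material)


# Hypothetical numbers: a predicted variable near its upper limit, and a
# damage sum of the negligible order reported for the tested conditions.
r_f = functional_reliability(mean=5.0, var=0.25, lower=0.0, upper=6.0)
r_m = material_reliability(damage=1.0e-25)
r_sys = system_reliability(r_f, r_m)
```

With negligible damage, `r_m` stays at 1.0 and the system value simply equals the functional one. The material branch only influences the decision once its reliability drops below the functional branch, which is the situation the tested conditions never reach.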

Rating: 4.8 / 10

Significance 5.5 · Rigor 4.5 · Novelty 5.5 · Clarity 6.5

Generated Jun 5, 2026

Comparison History (18)

vs. Seeing Time: Benchmarking Chronological Reasoning and Shortcut Biases in Vision-Language Models

gemini-3.1 · 6/6/2026

Paper 1 addresses a highly relevant and fast-moving field (Vision-Language Models), introducing a novel benchmark to evaluate chronological reasoning and expose critical shortcut biases. Benchmarks in AI tend to have broad scientific impact and high citation rates as they drive future model development. Paper 2 offers a rigorous and valuable engineering framework for circular manufacturing, but its impact is more niche and domain-specific compared to the foundational AI evaluation presented in Paper 1.

vs. Toward Pre-Deployment Assurance for Enterprise AI Agents: Ontology-Grounded Simulation and Trust Certification

claude-opus-4.6 · 6/6/2026

Paper 2 addresses a broader and more timely challenge—pre-deployment assurance for enterprise AI agents—which is relevant across multiple industries and intersects AI safety, governance, and regulation. Its ontology-grounded framework for trust certification is novel and has wide applicability as LLM-based agents proliferate in regulated sectors. Paper 1, while technically rigorous, addresses a narrower domain (circular manufacturing reliability for angle grinders) with more incremental contributions combining existing PHM techniques. Paper 2's cross-industry, cross-LLM validation and the growing urgency of AI agent governance give it higher potential impact.

vs. Critic-Guided Heterogeneous Multi-Agent Reasoning for Reliable Mathematical Problem Solving

claude-opus-4.6 · 6/6/2026

Paper 2 addresses a more novel and interdisciplinary problem—combining uncertainty-aware functional prediction with material fatigue assessment for circular manufacturing/remanufacturing. It integrates multiple engineering domains (PHM, materials science, reliability engineering, sustainability) with deep learning, targeting a timely real-world application in circular economy. Paper 1, while solid, applies a relatively incremental multi-agent LLM framework to a well-studied benchmark (GSM8K) with established techniques (generator-validator, critic feedback). Paper 2's methodological integration and practical relevance to sustainable manufacturing give it broader and deeper potential impact.

vs. What Should Agents Say? Action-state Communication for Efficient Multi-Agent Systems

claude-opus-4.6 · 6/6/2026

Paper 2 addresses a broadly relevant problem in multi-agent LLM systems—efficient inter-agent communication—which is timely given the rapid growth of LLM-based multi-agent frameworks. It proposes a general, reusable protocol (PACT) with clear practical benefits (reduced token costs, maintained performance) demonstrated across multiple topologies and production systems (OpenHands, SWE-agent). Paper 1, while technically rigorous, targets a narrow industrial application (circular factory angle grinder reuse) combining known techniques (CNN-LSTM, S-N curves, Paris law) with limited generalizability beyond its specific domain.

vs. Integrating Mechanistic and Data-Driven Models for Neurological Disorders through Differentiable Programming

gpt-5.2 · 6/6/2026

Paper 2 has higher likely impact: it presents a concrete, end-to-end, uncertainty-aware PHM workflow that couples data-driven functional prognosis with physics/FEA-based fatigue and crack-growth modeling, validated on held-out tests with detailed metrics. This methodological rigor and direct applicability to circular manufacturing/reliability decisions make near-term adoption plausible across industrial prognostics, maintenance, and sustainability. Paper 1 is a perspective/overview on hybrid differentiable modeling in neurology—timely and broad, but less novel empirically and lower immediate measurable impact than a validated, deployable framework.

vs. Boosting Brain-to-Image Decoding with TRIBE v2 Data Augmentation

gpt-5.2 · 6/6/2026

Paper 2 likely has higher scientific impact due to stronger novelty and broader cross-field relevance: leveraging a large pretrained fMRI encoding model (TRIBE v2) to generate synthetic neural data for data-efficient/zero-shot brain-to-image decoding. This directly addresses a central bottleneck in neuroscience and neuroAI (scarce labeled fMRI), is timely given foundation-model trends, and could generalize across tasks, scanners, and modalities. Paper 1 is methodologically rich and valuable for circular manufacturing/PHM, but its impact is more domain-specific and less likely to propagate widely beyond industrial prognostics.

vs. Safety Paradox: How Enhanced Safety Awareness Leaves LLMs Vulnerable to Posterior Attack

gemini-3.1 · 6/6/2026

Paper 1 identifies a fundamental paradox in current LLM alignment paradigms, demonstrating that enhanced safety awareness increases vulnerability to specific attacks. Given the widespread deployment and critical nature of LLM safety, this foundational discovery has broad, immediate implications across AI and cybersecurity. Paper 2, while methodologically sound and relevant to sustainable manufacturing, focuses on a much more niche application (angle grinder fatigue prediction), limiting its broader scientific impact compared to the findings in frontier AI models.

vs. Statistical Priors for Implicit Preferences: Decoupling Skill Selection as a Local Harness in Personal Agents

gpt-5.2 · 6/5/2026

Paper 2 has higher likely scientific impact due to stronger methodological rigor and clearer real-world applicability. It integrates uncertainty-aware sequence prediction with physics-based fatigue/crack-growth assessment into a unified, instance-specific reliability workflow—bridging ML and mechanistic modeling in PHM for circular manufacturing. This cross-disciplinary coupling (deep learning + finite-element-supported stress reconstruction + damage mechanics) is novel and broadly relevant to sustainable manufacturing, reliability engineering, and industrial AI. Paper 1 is timely for agent systems, but the contribution (local statistical preference harness decoupled from intent parsing) appears narrower and more incremental, with less demonstrated generality beyond personal-agent skill selection.

vs. RedditPersona: A Modular Framework for Community-Conditioned LLM Adaptation from Reddit

claude-opus-4.6 · 6/5/2026

Paper 2 addresses a more impactful real-world problem—circular manufacturing and sustainable reuse of products—by combining uncertainty-aware functional prediction with material fatigue assessment in a novel integrated framework. It bridges PHM, structural fatigue analysis, and system-level reliability in a way that is methodologically rigorous and practically relevant to circular economy goals. Paper 1, while solid, primarily offers a benchmarking/standardization framework for community-conditioned LLM adaptation on Reddit, which is more incremental and narrower in its cross-disciplinary impact and real-world applicability.

vs. Bidirectional Search for Longest Paths: Case for Front-to-Front Heuristics

claude-opus-4.6 · 6/5/2026

Paper 1 addresses a timely, industrially relevant problem at the intersection of circular economy, prognostics/health management, and reliability engineering. It integrates uncertainty-aware deep learning with physics-based fatigue assessment in a novel framework for reuse decisions—bridging multiple disciplines with practical manufacturing applications. Paper 2 makes a solid algorithmic contribution to bidirectional search for longest-path problems, but targets a narrower combinatorial optimization niche with more limited real-world applicability and cross-disciplinary impact.

vs. SCI-PRM: A Tool Aware Process Reward Model for Scientific Reasoning Verification

gpt-5.2 · 6/5/2026

Paper 1 likely has higher impact due to novelty and broad, timely relevance: it introduces a tool-aware process reward model and a large dataset (SCIPRM70K) to verify scientific reasoning with explicit tool use, addressing LLM hallucination/verification—an urgent cross-domain problem spanning AI, scientific computing, and tool-augmented agents. The claimed benefits (test-time scaling and improved RL via dense rewards) could generalize across many scientific tasks and models. Paper 2 is rigorous and application-relevant for circular manufacturing/PHM, but is narrower in scope (single tool/system focus) with more incremental methodological innovation.

vs. Fog of Love: Engineering Virtuous Agent Behavior with Affinity-based Reinforcement Learning in a Game Environment

gpt-5.2 · 6/5/2026

Paper 1 has higher likely scientific impact due to stronger methodological rigor and clearer real-world applicability: it integrates uncertainty-aware functional prognosis with physics-based fatigue/crack-growth assessment into a reliability workflow for circular manufacturing, a timely industrial need. The hybrid ML+mechanistic approach and reported quantitative performance suggest robustness and translational potential across PHM, manufacturing, and reliability engineering. Paper 2 is novel in applying affinity-based RL to a richer game setting, but remains largely a simulation/game-domain contribution with less demonstrated generalizability and weaker evidence of downstream impact beyond AI-safety interpretability research.

vs. BADGER: Bridging Agentic and Deterministic Evaluation for Generative Enterprise Reasoning

gemini-3.1 · 6/5/2026

Paper 2 addresses a critical bottleneck in the rapidly expanding field of Generative AI and LLM agents: evaluation in enterprise settings. Evaluation frameworks in AI typically garner high citations and broad adoption across academia and industry. While Paper 1 offers a rigorous, specialized approach for circular manufacturing, Paper 2's focus on text-to-SQL and agentic reasoning guarantees wider applicability, greater timeliness, and higher overall scientific impact.

vs. GTBench: A Curriculum-Grounded Benchmark for Evaluating LLMs as Mathematical Research Assistants in Graph Theory

gemini-3.1 · 6/5/2026

Paper 2 introduces a novel benchmark for evaluating LLMs in mathematical reasoning, a highly active and rapidly evolving field. Benchmarks typically drive significant subsequent research and gather high citations, impacting both AI and mathematics. In contrast, Paper 1 presents a valuable but narrower application of predictive maintenance and fatigue assessment for circular factories, which likely has a more limited audience and targeted real-world application.

vs. Topological texture analysis of microscopy images of dynamic casein gelation and its relation to rheological properties

gpt-5.2 · 6/5/2026

Paper 2 has higher impact potential due to its broader real-world applicability (circular manufacturing, predictive maintenance, reliability for returned products), timely relevance to sustainability/Industry 4.0, and cross-field reach (ML uncertainty quantification + fatigue/fracture mechanics + system reliability). The integrated, instance-specific workflow linking functional prognosis to physics-based fatigue assessment is methodologically substantial and more likely to generalize across industrial assets. Paper 1 is novel and rigorous within food/material microscopy analytics, but its impact is more niche and primarily methodological within a narrower application domain.

vs. The DeepSpeak-Agentic Dataset

gemini-3.1 · 6/5/2026

Paper 1 addresses the highly timely and critical challenge of AI forensics and human-agent interactions. Datasets and benchmarks in generative AI security typically achieve widespread adoption and high citation rates across multiple disciplines like computer vision, NLP, and cybersecurity. While Paper 2 demonstrates strong methodological rigor, its focus on predictive maintenance for circular factories is a niche industrial application, resulting in a narrower overall scientific impact.

vs. DMF: A Deterministic Memory Framework for Conversational AI Agents

claude-opus-4.6 · 6/5/2026

Paper 2 (DMF) addresses a broadly relevant problem in conversational AI—memory management for LLM-based agents—with a novel deterministic approach that eliminates LLM calls from the memory loop, achieving dramatic token cost reductions (5x-242x). This has immediate, wide-reaching practical impact given the explosive growth of LLM-based agents. Paper 1 tackles a narrower domain (circular factory reliability for angle grinders) with a competent but incremental combination of existing techniques (CNN-LSTM, FEA, S-N curves). While rigorous, its applicability is limited to specific manufacturing contexts, whereas Paper 2's framework generalizes across all conversational AI applications.

vs. RelGT-AC: A Relational Graph Transformer for Autocomplete Tasks in Relational Databases

claude-opus-4.6 · 6/5/2026

Paper 2 addresses a broader, more impactful problem combining uncertainty-aware functional prediction with material fatigue assessment for circular manufacturing—a topic with significant environmental and industrial relevance. It integrates multiple disciplines (PHM, fatigue mechanics, reliability engineering) into a novel unified framework with real-world applicability to sustainable manufacturing. Paper 1 offers incremental improvements to an existing architecture (RelGT) on a specific benchmark (RelBench v2), with narrower scope and more limited novelty (TF-IDF encoding, column masking). Paper 2's interdisciplinary breadth and timeliness regarding circular economy give it higher potential impact.