Teaching Values to Machines: Simulating Human-Like Behavior in LLMs
Asaf Yehudai, Naama Rozen, Ariel Gera
Abstract
Large Language Models (LLMs) demonstrate a remarkable capacity to adopt different personas and roles; however, it remains unclear whether they can manifest behavior that adheres to a coherent, human-like value structure. In this work, we draw on established psychological value theory to induce human-like values in LLMs and assess their alignment with patterns observed in human studies. Using validated psychological questionnaires, we conduct large-scale experiments -- over 5 million questions -- to evaluate value structures and value-behavior relationships in leading LLMs and compare them to humans. Our findings reveal strong agreement between value-prompted LLMs and humans across both dimensions. Moreover, incorporating human value distributions enhances population-level simulations with value-induced LLMs. These findings highlight the potential of value-induced LLMs as effective, psychologically grounded tools for simulating human behavior.
AI Impact Assessments
Scientific Impact Assessment (1 model)
1. Core Contribution
This paper investigates whether LLMs can be systematically steered to exhibit coherent, human-like value structures using Schwartz's theory of basic human values, and whether these induced values translate into human-aligned behavioral patterns. The core novelty lies in the comprehensive, three-level analysis: (1) demonstrating that value-prompting induces internally coherent value structures in LLMs, (2) showing that value-behavior relationships in value-prompted LLMs correlate significantly with those observed in human psychological studies, and (3) exploring population-level simulation strategies that incorporate human value distributions. The paper claims to be the first comprehensive study of value–behavior relationships in LLMs, which is a meaningful distinction from prior work that examined either LLM values in isolation or behavioral steering without grounding in formal psychological theory.
2. Methodological Rigor
The experimental design is ambitious in scale (5M+ questions, 7 LLMs, 7 psychological instruments) and draws on well-validated psychological tools (PVQ, BFI-2, Prosocialness Scale, etc.). The use of established correlation-based comparison methods (MDS, Procrustes analysis, Pearson correlations of vectorized correlation matrices) is appropriate and well-grounded in the psychometric literature.
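One of the comparison methods named above, the Pearson correlation of vectorized correlation matrices, can be illustrated with a minimal sketch. It is not the authors' code: the function names (`upper_triangle`, `pearson`, `matrix_similarity`) are hypothetical, and the two 4x4 matrices are invented toy numbers, not results from the paper. The idea is to flatten the off-diagonal entries of a human value-correlation matrix and an LLM value-correlation matrix, then correlate the two resulting vectors.

```python
from math import sqrt


def upper_triangle(matrix):
    """Flatten the strictly upper triangle of a square matrix, row by row."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / sqrt(var_x * var_y)


def matrix_similarity(corr_a, corr_b):
    """Correlate the off-diagonal entries of two correlation matrices."""
    return pearson(upper_triangle(corr_a), upper_triangle(corr_b))


# Toy, made-up correlation matrices over four hypothetical values.
human = [
    [1.0, 0.6, -0.2, -0.5],
    [0.6, 1.0, 0.1, -0.3],
    [-0.2, 0.1, 1.0, 0.4],
    [-0.5, -0.3, 0.4, 1.0],
]
llm = [
    [1.0, 0.5, -0.1, -0.6],
    [0.5, 1.0, 0.2, -0.2],
    [-0.1, 0.2, 1.0, 0.3],
    [-0.6, -0.2, 0.3, 1.0],
]

similarity = matrix_similarity(human, llm)
print(f"structure similarity: {similarity:.3f}")
```

Only the strictly upper triangle is used because a correlation matrix is symmetric with a unit diagonal, so including the remaining entries would duplicate information and inflate the agreement score.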
However, several methodological concerns deserve attention:
3. Potential Impact
The paper has potential impact across multiple domains:
The dual-use concern raised in the ethics section—that the same techniques could be used to create convincing anti-social personas—is valid and important.
4. Timeliness & Relevance
This work is highly timely. The use of LLMs as proxies for human subjects is an active and growing research area, and the question of whether LLMs can faithfully reproduce known psychological structures is directly relevant. The paper addresses a genuine gap: prior work has shown LLMs can adopt personas, but the systematic evaluation of whether these personas exhibit coherent internal value structures with corresponding behavioral implications has been lacking.
The inclusion of recent models (Qwen3, GPT-OSS series) makes the results current, though the rapid pace of model development means findings may not generalize to future architectures.
5. Strengths & Limitations
Key Strengths:
Notable Limitations:
Additional Observations:
Summary
This is a solid, well-executed study that makes a meaningful contribution to the growing literature on LLMs as behavioral simulators. Its primary value lies in the comprehensive, psychologically grounded evaluation framework rather than in any single surprising finding. The results are encouraging but not conclusive: the moderate correlation strengths for some behavioral tests suggest that value-prompted LLMs are useful approximations rather than faithful replicas of human value-behavior dynamics.
Generated May 29, 2026
Comparison History (13)
Paper 2 has higher estimated scientific impact due to stronger novelty in bridging validated psychological value theory with LLM behavior, exceptional methodological scale (5M questionnaire items) and comparability to human studies, and broad cross-field relevance (AI alignment, computational social science, psychology, HCI, policy). Its applications to population-level simulation and behavior modeling are widely useful and timely. Paper 1 is highly relevant and practical for agent safety, but key claims (e.g., parity with “GPT-5.4”) are hard to assess from the abstract and impact may be narrower to agent security/guardrails despite open release.
Paper 1 addresses a critical, timely bottleneck in autonomous AI: the 'Outcome-Process Gap' where task success masks dangerous or erroneous agent behaviors. By providing a large-scale dataset (OpenClawBench) and demonstrating that nearly 10% of 'successful' executions contain anomalies, it directly challenges current evaluation paradigms. While Paper 2 is an impressive interdisciplinary study on LLM personas, Paper 1 provides foundational infrastructure essential for the safe, reliable real-world deployment of agentic systems, giving it broader immediate utility and impact in AI safety and engineering.
Paper 2 bridges AI and psychology by systematically studying and inducing human values in LLMs. Its massive experimental scale and interdisciplinary approach offer broad applications in AI safety, alignment, and simulating human populations for social sciences. Paper 1 offers a strong technical contribution to agentic skill evolution, but Paper 2's focus on value alignment addresses a more critical, universally relevant challenge with broader societal and cross-disciplinary impact.
Paper 2 has higher potential impact due to its broader cross-disciplinary relevance (AI alignment, computational social science, psychology), strong timeliness around value alignment and human behavior simulation, and large-scale empirical methodology (5M+ questionnaire items) grounded in validated psychological instruments. Its results could influence evaluation standards, agent design, and policy-facing simulations. Paper 1 is novel and practically useful for low-data professional domains via web-interaction memory, but its impact is likely narrower (agentic retrieval/automation) and may be more incremental relative to rapidly evolving web-agent frameworks.
While Paper 1 offers a valuable practical benchmark for optimizing RAG systems, Paper 2 demonstrates higher potential scientific impact due to its profound interdisciplinary reach. By successfully bridging established psychological value theory with large-scale LLM behavior, it advances AI alignment, cognitive modeling, and computational social science. The ability to simulate psychologically grounded human populations opens up transformative applications across sociology, economics, and human-computer interaction, offering broader foundational scientific implications than an architectural search framework.
Paper 2 has broader scientific impact due to its interdisciplinary relevance spanning AI, psychology, and social simulation. It addresses the fundamental question of whether LLMs can embody coherent human-like value structures, with implications for AI alignment, computational social science, and agent-based modeling. The massive experimental scale (5M+ questions) demonstrates methodological rigor. Paper 1, while valuable for PHM reproducibility, addresses a more niche engineering problem with narrower audience. Paper 2's findings about value-behavior relationships in LLMs are timely and relevant to the rapidly growing field of LLM alignment and human simulation.
Paper 1 presents a rigorous, large-scale empirical study (5M+ questions) grounded in established psychological theory, demonstrating novel findings about inducing human-like value structures in LLMs. It has broad interdisciplinary impact spanning AI, psychology, and computational social science, with clear applications in human behavior simulation. Paper 2 introduces useful conceptual frameworks (Agentic Technical Debt, Stochastic Tax) but is more of a position/governance paper without empirical validation, targeting a narrower audience of AI system managers rather than advancing fundamental scientific understanding.
Paper 1 addresses a practical and novel problem in AI agent systems—detecting infeasible tasks to reduce computational waste. It introduces a concrete pipeline (FeasiGen), validated benchmarks with 94% accuracy, new evaluation metrics, and reveals significant findings (73.9% false continue rate). This has direct implications for efficient deployment of tool-using agents, a rapidly growing field. Paper 2 contributes interesting findings on value alignment in LLMs but builds more incrementally on existing persona/role-playing research with less immediate practical impact and narrower methodological contribution.
Paper 1 bridges AI and psychology by systematically inducing and evaluating human-like values in LLMs at a massive scale. This interdisciplinary approach offers deep insights into AI alignment and human behavioral simulation, promising broader theoretical and practical impacts across multiple fields. In contrast, Paper 2 is primarily an engineering-focused technical report for new coding models; while practically useful, it lacks the profound scientific novelty and methodological innovation of Paper 1.
Paper 2 likely has higher scientific impact: it proposes broadly applicable, low-overhead guidelines and concrete methods for data ordering that can improve stability/efficiency across pre-training and SFT, affecting many LLM pipelines and reducing compute/data costs. This is timely and widely relevant to both academia and industry, with clear real-world adoption potential and reproducibility signals (code link, multi-scale experiments). Paper 1 is novel and large-scale, but its impact is narrower (value simulation/alignment) and more sensitive to prompt-based methodology and construct validity when mapping human value theory to LLM behavior.
Paper 2 offers broader and more timely scientific impact by addressing the critical challenge of AI alignment and human behavior simulation in LLMs. By integrating psychological value theories with large-scale LLM experiments, it bridges AI and cognitive sciences, opening avenues for sociological simulations and safer AI agents. While Paper 1 presents a rigorous and practical application of graph neural networks to clinical disease prediction, Paper 2's foundational contribution to the rapidly expanding field of LLM behavior provides wider multi-disciplinary applicability and addresses an immediate, global AI research priority.
Paper 2 likely has higher scientific impact due to a clearer enabling contribution: a scalable, validated pipeline and dynamic benchmark with executable trajectories for long-horizon web agents—directly addressing a major bottleneck (process-level supervision). It offers broad real-world applicability (web assistants, automation), strong methodological emphasis (deterministic replays, systematic validation), and high timeliness as agentic LLMs are a fast-moving area where benchmarks drive progress. Paper 1 is novel and large-scale, but its impact may be narrower (value simulation/alignment) and more dependent on prompt-based induction rather than a reusable infrastructure artifact.
ProjectionBench introduces a novel benchmark framework for evaluating LLMs' scientific hypothesis generation and reasoning capabilities under progressive information disclosure — a unique and timely contribution as AI-for-science accelerates. It addresses a critical gap (evaluating innovative reasoning vs. mere retrieval), tests cutting-edge models (GPT-5, Gemini 3.1), and has broad applicability across scientific domains. Paper 1, while methodologically rigorous, builds incrementally on existing work in value alignment and LLM persona simulation, with narrower applicability primarily in social science simulation. Paper 2's framework has greater potential to shape AI-driven scientific discovery.