Event-Centric Human Value Understanding in News-Domain Texts: An Actor-Conditioned, Multi-Granularity Benchmark

Existing human value datasets do not directly support value understanding in factual news: many are actor-agnostic, rely on isolated utterances or synthetic scenarios, and lack explicit event structure or value direction. We present \textbf{NEVU} (\textbf{N}ews \textbf{E}vent-centric \textbf{V}alue \textbf{U}nderstanding), a benchmark for \emph{actor-conditioned}, \emph{event-centric}, and \emph{direction-aware} human value recognition in factual news. NEVU evaluates whether models can identify value cues, attribute them to the correct actor, and determine value direction from grounded evidence. Built from 2{,}865 English news articles, NEVU organizes annotations at four semantic unit levels (\textbf{Subevent}, \textbf{behavior-based composite event}, \textbf{story-based composite event}, and \textbf{Article}) and labels \mbox{(unit, actor)} pairs for fine-grained evaluation across local and composite contexts. The annotations are produced through an LLM-assisted pipeline with staged verification and targeted human auditing. Using a hierarchical value space with \textbf{54} fine-grained values and \textbf{20} coarse-grained categories, NEVU covers 45{,}793 unit--actor pairs and 168{,}061 directed value instances. We provide unified baselines for proprietary and open-source LLMs and find that lightweight adaptation (LoRA) consistently improves open-source models. Thus, although NEVU is designed primarily as a benchmark, it also supports supervised adaptation beyond prompting-only evaluation. Data availability is described in Appendix~\ref{app:data_code_availability}.
