Can large language models interpret unstructured chat data on dynamic group decision-making processes? Evidence on joint destination choice

Social activities result from complex joint activity-travel decisions among group members. Although the decision-making process behind these activities is difficult to observe through traditional travel surveys, new types of data, such as unstructured chat data, can shed light on it. Interpreting these processes, however, requires inferring both explicit and implicit factors, which typically involves the labor-intensive task of manually annotating dialogues to capture context-dependent meanings shaped by social and cultural norms. This study evaluates the potential of Large Language Models (LLMs) to automate and complement human annotation in interpreting decision-making processes from group chats, using data on joint eating-out activities in Japan as a case study. We design a prompting framework inspired by the knowledge acquisition process, which sequentially extracts key decision-making factors: the group-level restaurant choice set and outcome, each individual's preferences for each alternative, and the specific attributes driving those preferences. This structured process guides the LLM in interpreting group chat data, converting unstructured dialogues into structured tabular data describing decision-making factors. To evaluate the LLM-driven outputs, we conduct a quantitative analysis against a human-annotated ground truth dataset and a qualitative error analysis to examine model limitations. Results show that while the LLM reliably captures explicit decision-making factors, it struggles to identify nuanced implicit factors that human annotators readily identify. We pinpoint the specific contexts in which LLM-based extraction can be trusted and those in which human oversight remains essential. These findings highlight both the potential and the limitations of LLM-based analysis for incorporating non-traditional data sources on social activities.
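To make the sequential structure of the prompting framework concrete, the following is a minimal sketch and not the study's actual implementation. It assumes three dependent stages (choice set and outcome, individual preferences, driving attributes), each returning JSON, and flattens the results into tabular rows. The function names, prompt wording, JSON fields, and the `fake_llm` stand-in are all hypothetical; a real model call would replace `fake_llm`.

```python
import json
from typing import Callable


def extract_decision_factors(chat: str, ask: Callable[[str], str]) -> list[dict]:
    """Run three dependent extraction stages and flatten the result into rows.

    `ask` is any function that takes a prompt and returns JSON text
    (hypothetical interface; prompts and field names are illustrative only).
    """
    # Stage 1: group-level choice set and the final outcome.
    s1 = json.loads(ask(f"[stage1] List the restaurants considered and the one chosen.\n{chat}"))
    alternatives, outcome = s1["choice_set"], s1["outcome"]

    # Stage 2: each member's preference for each alternative found in stage 1.
    s2 = json.loads(ask(
        f"[stage2] For alternatives {json.dumps(alternatives)}, "
        f"give each member's preference.\n{chat}"
    ))

    # Stage 3: attributes behind each preference found in stage 2.
    s3 = json.loads(ask(
        f"[stage3] For preferences {json.dumps(s2)}, "
        f"give the attributes driving each.\n{chat}"
    ))
    attrs = {(r["member"], r["alternative"]): r["attributes"] for r in s3}

    # One row per (member, alternative) pair: the structured tabular output.
    return [
        {
            "member": p["member"],
            "alternative": p["alternative"],
            "preference": p["preference"],
            "attributes": attrs.get((p["member"], p["alternative"]), []),
            "chosen": p["alternative"] == outcome,
        }
        for p in s2
        if p["alternative"] in alternatives  # drop anything outside the stage-1 choice set
    ]


def fake_llm(prompt: str) -> str:
    """Canned stand-in for a real model call, so the sketch runs offline."""
    if prompt.startswith("[stage1]"):
        return json.dumps({"choice_set": ["ramen shop", "izakaya"], "outcome": "izakaya"})
    if prompt.startswith("[stage2]"):
        return json.dumps([
            {"member": "A", "alternative": "ramen shop", "preference": "negative"},
            {"member": "A", "alternative": "izakaya", "preference": "positive"},
            {"member": "B", "alternative": "izakaya", "preference": "positive"},
        ])
    return json.dumps([
        {"member": "A", "alternative": "ramen shop", "attributes": ["distance"]},
        {"member": "B", "alternative": "izakaya", "attributes": ["price"]},
    ])


chat = "A: The ramen shop is too far. Izakaya?\nB: Sure, it's cheap too."
rows = extract_decision_factors(chat, fake_llm)
for row in rows:
    print(row)
```

Passing each stage's output into the next prompt is what makes the extraction sequential: preferences are only elicited for alternatives already identified, and attributes only for preferences already identified. Rows with an empty `attributes` list mark preferences for which no driving attribute was extracted, which is one place where human review could be targeted.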
