Automated coaching messages for weight control can save time and costs, but their repetitive, generic nature may limit their effectiveness compared to human coaching. Large language model (LLM) based artificial intelligence (AI) chatbots, like ChatGPT, could use their data-processing abilities to generate more personalized and novel messages, reducing repetition. While LLM AI shows promise for encouraging healthier lifestyles, studies have yet to examine the feasibility and acceptability of LLM-based behavioral weight loss (BWL) coaching. Eighty-seven adults in a weight-loss trial rated the helpfulness of ten coaching messages (five human-written, five ChatGPT-generated) on a 5-point Likert scale and provided open-ended feedback to justify their ratings. Participants also identified which messages they believed were AI-generated. The evaluation occurred in two phases: messages in Phase 1 were perceived as impersonal and negative, which prompted revisions to the messages used in Phase 2. In Phase 1, AI-generated messages were rated less helpful than human-written ones, with 66% receiving a helpfulness rating of 3 or higher. In Phase 2, however, the AI-generated messages matched the human-written ones in helpfulness, with 82% receiving a rating of 3 or higher. Additionally, 50% of the AI-generated messages were misidentified as human-written, suggesting AI's sophistication in mimicking human-generated content. A thematic analysis of the open-ended feedback revealed that participants appreciated the empathy and personalized suggestions in the AI-generated messages but found those messages more formulaic, less authentic, and too data-focused. This study demonstrates the preliminary feasibility and acceptability of using LLM AIs, like ChatGPT, to craft potentially effective weight control coaching messages. Our findings also highlight areas for future improvement.