Generative AI systems are increasingly adopted by patients seeking everyday health guidance, yet their reliability and clinical appropriateness remain uncertain. Taking Type 2 Diabetes Mellitus (T2DM) as a representative chronic condition, this paper presents a two-part mixed-methods study that examines how patients and physicians in China evaluate the quality and usability of AI-generated health information. Study~1 analyzes 784 authentic patient questions to identify seven core categories of informational needs and five evaluation dimensions -- \textit{Accuracy, Safety, Clarity, Integrity}, and \textit{Action Orientation}. Study~2 involves seven endocrinologists who assess responses from four mainstream AI models across these dimensions. Quantitative and qualitative findings reveal consistent strengths in factual knowledge and lifestyle guidance but significant weaknesses in medication interpretation, contextual reasoning, and empathy. Patients view AI as an accessible ``pre-visit educator,'' whereas clinicians highlight its shortcomings in clinical safety and personalization. Together, the findings inform design implications for interactive health systems, advocating for multi-model orchestration, risk-aware fallback mechanisms, and emotionally attuned communication to ensure trustworthy AI assistance in chronic disease care.