Videos can be an effective way to deliver contextualized, just-in-time medical information for patient education. However, video analysis is extremely challenging, from topic identification and retrieval to the extraction and analysis of medical information and the assessment of understandability from a patient perspective. This study demonstrates a data analysis pipeline for retrieving medical information from YouTube videos on preparing for a colonoscopy exam, a much-maligned and disliked procedure for which patients find it challenging to prepare adequately. We first use the YouTube Data API to collect metadata for videos matching selected search keywords, and use the Google Video Intelligence API to analyze text, frame, and object data. We then annotate the videos for medical information, video understandability, and overall recommendation. We develop a bidirectional long short-term memory (BiLSTM) model to identify medical terms in videos and build three classifiers that group videos by the level of encoded medical information, by the level of video understandability, and by whether the video is recommended. Our study provides healthcare stakeholders with guidelines and a scalable approach for generating new educational video content to enhance the management of a vast number of health conditions.