Electroencephalogram (EEG) monitoring devices and online data repositories hold large amounts of data from participants in research and medical studies, stored without direct reference to personal identifiers. This paper explores what types of personal and health information have been detected and classified within task-free EEG data. Additionally, we investigate key characteristics of the collected resting-state and sleep data in order to determine the privacy risks involved with openly available EEG data. We searched Google Scholar, Web of Science, and relevant journals for studies that classified or detected the presence of various disorders and personal information in resting-state and sleep EEG. Only English full-text, peer-reviewed journal articles or conference papers about classifying the presence of medical disorders between individuals were included. A quality analysis carried out by three reviewers determined general paper quality based on specified evaluation criteria. In resting-state EEG, various disorders, including autism spectrum disorder, Parkinson's disease, and alcohol use disorder, have been classified with high accuracy, often from 5 minutes of data or less. Sleep EEG tends to hold classifiable information about sleep disorders such as sleep apnea, insomnia, and REM sleep behavior disorder, but classification usually requires longer recordings or data from multiple sleep stages. Many classification methods are still developing, but even today, access to a person's EEG can reveal sensitive personal health information. Given the increasing ability of machine learning methods to re-identify individuals from their EEG data, this review demonstrates the importance of anonymization and of developing improved tools to protect the privacy of study participants and medical EEG users.