Rapid discovery of new diseases, such as COVID-19, can enable a timely epidemic response, preventing large-scale spread and protecting public health. However, little research has addressed this problem. In this paper, we propose a contrastive learning-based modeling approach for discovering COVID-19 coughing and breathing patterns from non-COVID coughs. To validate our models, we conduct extensive experiments on four large audio datasets and one image dataset. We further explore how different factors, such as domain relevance and augmentation order, affect the pre-trained models. Our results show that the proposed model can effectively distinguish COVID-19 coughing and breathing from unlabeled data and from labeled non-COVID coughs, with accuracies of up to 0.81 and 0.86, respectively. Findings from this work will guide future research on detecting outbreaks of new diseases early.