Using unsupervised classification techniques from statistics and machine learning, we characterise symptom phenotypes among symptomatic SARS-CoV-2 PCR-positive community cases. We first analyse each dataset in isolation and across age bands, and then apply methods that allow multiple datasets to be compared. While cases separate by the total number of symptoms experienced, we also see symptoms separate into gastrointestinal, respiratory and other types, and different symptom co-occurrence patterns at the extremes of age. In this way, we demonstrate the deep structure of COVID-19 symptoms without the usual biases due to study design. This is expected to have implications for the identification and management of community SARS-CoV-2 cases and could be further applied to symptom-based management of other diseases and syndromes.