Zero-shot classification enables text to be classified into classes not seen during training. In this study, we examine the efficacy of zero-shot learning models in classifying healthcare consultation responses written by doctors and by AI systems. The models evaluated are BART, BERT, XLM, XLM-R and DistilBERT. The models were tested on three different datasets, in both a binary and a multi-label setting, to identify the origin of text in health consultations without any prior training on the corpus. According to our findings, the zero-shot language models show a good general understanding of language but have limitations when classifying doctor and AI responses to healthcare consultations. This research provides a foundation for future work in medical text classification by informing the development of more accurate methods for classifying text written by doctors and AI systems in health consultations.
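The abstract does not state how the models turn their outputs into labels. As a minimal sketch, assuming the common NLI-based formulation of zero-shot classification (each candidate label is posed as a hypothesis and scored by its entailment logit), the difference between the binary and multi-label settings could look like the following. The function name `score_labels`, the label strings, and the logit values are hypothetical and are not taken from the study.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def score_labels(entailment, contradiction, multi_label=False):
    """Convert per-label NLI logits into label scores.

    entailment / contradiction: dicts mapping label -> logit
    (hypothetical inputs; a real model would produce them).
    """
    labels = list(entailment)
    if multi_label:
        # Each label is scored independently: softmax over
        # (contradiction, entailment) for that label alone.
        return {
            label: softmax([contradiction[label], entailment[label]])[1]
            for label in labels
        }
    # Single-label (here binary): entailment logits compete across labels,
    # so the scores sum to 1.
    probs = softmax([entailment[label] for label in labels])
    return dict(zip(labels, probs))


# Made-up logits for one consultation response, for illustration only.
entailment = {"doctor": 2.0, "AI system": 1.0}
contradiction = {"doctor": -1.0, "AI system": 0.5}

single = score_labels(entailment, contradiction)
multi = score_labels(entailment, contradiction, multi_label=True)
```

In the single-label case the scores are mutually exclusive, whereas in the multi-label case each score lies between 0 and 1 on its own, so a response may score high, or low, for both labels at once.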