This paper presents a novel approach to classifying the hallmarks of cancer, a crucial task in cancer research. The proposed method uses the Bidirectional Encoder Representations from Transformers (BERT) architecture, which has shown strong performance in a variety of downstream applications. Applying transfer learning, we fine-tuned the pre-trained BERT model on a small corpus of cancer-related biomedical text documents. Our experiments show that the approach attains an accuracy of 94.45%, surpassing almost all previously reported results by at least 8.04%. These findings highlight the effectiveness of the proposed model in classifying and comprehending text documents for cancer research. As cancer remains one of the top ten leading causes of death globally, our approach holds promise for advancing cancer research and improving patient outcomes.