This paper introduces the RAG-RLRC-LaySum framework, which uses Natural Language Processing (NLP) techniques to make complex biomedical research understandable to lay readers. Our Retrieval-Augmented Generation (RAG) component, enhanced by a reranking step, draws on multiple knowledge sources to keep lay summaries accurate and relevant. In addition, our Reinforcement Learning for Readability Control (RLRC) strategy improves readability, making scientific content comprehensible to non-specialists. Evaluations on the publicly available PLOS and eLife datasets show that our methods outperform the plain Gemini model, with a 20% increase in readability scores, a 15% improvement in ROUGE-2 relevance scores, and a 10% improvement in factual accuracy. The RAG-RLRC-LaySum framework thus helps democratize scientific knowledge and broadens public engagement with biomedical discoveries.
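For readers unfamiliar with the ROUGE-2 relevance metric cited in the abstract, the following is a minimal sketch of how a bigram-overlap score can be computed between a generated lay summary and a reference summary. It is not the evaluation code used in this work: the function names `bigram_counts` and `rouge_2` are hypothetical, tokenization is assumed to be simple lowercasing and whitespace splitting, and standard ROUGE toolkits may additionally apply stemming or other normalization.

```python
from collections import Counter


def bigram_counts(text):
    """Count adjacent token pairs (assumption: lowercase + whitespace tokenization)."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge_2(candidate, reference):
    """Return ROUGE-2 precision, recall, and F1 for a candidate against one reference."""
    cand = bigram_counts(candidate)
    ref = bigram_counts(reference)

    # Clipped overlap: each bigram counts at most as often as it occurs in both texts.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy sentences for illustration only; not drawn from PLOS or eLife.
    scores = rouge_2("the cat sat on the mat", "the cat lay on the mat")
    print(scores)
```

In this toy example three of the five bigrams in each sentence match, so precision and recall are both 3/5. A relative improvement such as the 15% reported above would be computed by comparing such scores between the proposed system and the baseline over a full test set.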