A clinical trial is a study that evaluates new biomedical interventions. To design new trials, researchers draw inspiration from ongoing and completed ones. In 2022, more than 100 clinical trials were submitted to ClinicalTrials.gov every day on average, each with a mean length of approximately 1500 words [1]. At this volume, keeping up to date is nearly impossible. To mitigate this issue, we have created a batch clinical trial summarizer called CliniDigest using GPT-3.5. CliniDigest is, to our knowledge, the first tool able to provide real-time, truthful, and comprehensive summaries of clinical trials. CliniDigest can reduce up to 85 clinical trial descriptions (approximately 10,500 words) into a concise 200-word summary with references and limited hallucinations. We have tested CliniDigest on its ability to summarize 457 trials divided across 27 medical subdomains. For each field, CliniDigest generates summaries of $\mu=153,\ \sigma=69$ words, each of which uses $\mu=54\%,\ \sigma=30\%$ of the sources. A more comprehensive evaluation is planned and outlined in this paper.