Challenging the Machinery of Generative AI with Fact-Checking: Ontology-Driven Biological Graphs for Verifying Human Disease-Gene Links

Background: Since the launch of various generative AI tools, scientists have been striving to evaluate their capabilities and content, in the hope of establishing trust in their generative abilities. Regulations and guidelines are emerging to verify generated content and identify novel uses. Objective: We aim to demonstrate how ChatGPT claims can be checked computationally using the rigor of network models. Specifically, we fact-check, at the aggregate level, the knowledge embedded in biological graphs constructed from ChatGPT content. Methods: We adopted a biological networks approach that enables the systematic interrogation of ChatGPT's linked entities. We designed an ontology-driven fact-checking algorithm that compares biological graphs constructed from approximately 200,000 PubMed abstracts with counterparts constructed from a dataset generated using the ChatGPT-3.5 Turbo model. Results: In 10 samples of 250 randomly selected records drawn from a ChatGPT dataset of 1,000 "simulated" articles, the fact-checking link accuracy ranged from 70% to 86%. The computational process was followed by a manual process using the IntAct interaction database and the Gene Regulatory Network database (GRNdb) to confirm the validity of the links identified computationally. We also found that edge proximity distances in the ChatGPT graphs were significantly shorter (90 to 153) than the corresponding literature distances (236 to 765). This pattern held true in all 10 samples. Conclusion: This study demonstrated high accuracy of the aggregate disease-gene links found in ChatGPT-generated texts. The strikingly consistent pattern may illuminate new biological pathways and open the door to new research opportunities.
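The comparison described in the Methods can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes that entities have already been mapped to ontology terms, that each generated record has been reduced to a list of (disease, gene) pairs, and that a link counts as verified when the same undirected pair appears in the literature graph. The names `normalize_edge`, `link_accuracy`, and `sampled_accuracies` are hypothetical and introduced only for illustration.

```python
import random


def normalize_edge(a, b):
    # Assumption: an edge is an undirected pair of entity labels,
    # compared case-insensitively after trimming whitespace.
    return frozenset((a.strip().upper(), b.strip().upper()))


def link_accuracy(generated_edges, literature_edges):
    # Fraction of distinct generated links that also occur in the literature graph.
    gen = {normalize_edge(a, b) for a, b in generated_edges}
    lit = {normalize_edge(a, b) for a, b in literature_edges}
    if not gen:
        return 0.0
    return len(gen & lit) / len(gen)


def sampled_accuracies(records, literature_edges,
                       n_samples=10, sample_size=250, seed=0):
    # Each record is a list of (disease, gene) pairs extracted from one
    # generated article. Draw n_samples random subsets of records and
    # score the pooled links of each subset against the literature graph.
    rng = random.Random(seed)
    results = []
    for _ in range(n_samples):
        sample = rng.sample(records, min(sample_size, len(records)))
        edges = [edge for record in sample for edge in record]
        results.append(link_accuracy(edges, literature_edges))
    return results
```

Under these assumptions, calling `sampled_accuracies(records, literature_edges)` with the default arguments mirrors the sampling design reported in the Results (10 samples of 250 records each) and returns one accuracy value per sample.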
