Pharmacokinetics (PK) plays a critical role in drug development and regulatory decision-making for human and veterinary medicine, directly affecting public health through drug safety and efficacy assessments. However, PK data are often embedded in complex, heterogeneous tables with variable structures and inconsistent terminology, which makes automated PK data retrieval and standardization difficult. We present AutoPK, a novel two-stage framework for accurate and scalable extraction of PK data from complex scientific tables. In the first stage, AutoPK identifies and extracts PK parameter variants using large language models (LLMs), a hybrid similarity metric, and LLM-based validation. In the second stage, it filters relevant rows, converts the table into a key-value text format, and uses an LLM to reconstruct a standardized table. Evaluated on a real-world dataset of 605 PK tables, including captions and footnotes, AutoPK shows significant improvements in precision and recall over direct LLM baselines. For instance, AutoPK with LLaMA 3.1-70B achieved an F1-score of 0.92 on half-life and 0.91 on clearance parameters, outperforming direct use of LLaMA 3.1-70B by margins of 0.10 and 0.21, respectively. Smaller models such as Gemma 3-27B and Phi 3-12B achieved 2- to 7-fold F1 gains with AutoPK over their direct use, and Gemma's hallucination rates fell from 60-95% to 8-14%. Notably, AutoPK enabled open-source models like Gemma 3-27B to outperform commercial systems such as GPT-4o Mini on several PK parameters. By handling heterogeneous table structures and terminology and generalizing across key PK parameters, AutoPK enables scalable, high-confidence PK data extraction, making it well suited for critical applications in veterinary pharmacology, drug safety monitoring, and public health decision-making. Code and data: https://github.com/hosseinsholehrasa/AutoPK
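The abstract does not specify how the first-stage hybrid similarity metric is defined. As a rough illustration only, the sketch below assumes one plausible form: a weighted blend of character-level similarity and token-level Jaccard overlap, used to shortlist table headers that may be variants of a known PK parameter name. The function names, the 0.5 weight, and the 0.6 threshold are all hypothetical and are not taken from AutoPK.

```python
from difflib import SequenceMatcher


def char_ratio(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of whitespace-separated tokens, case-insensitive."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def hybrid_similarity(a: str, b: str, weight: float = 0.5) -> float:
    """Assumed hybrid metric: weighted mix of character and token similarity."""
    return weight * char_ratio(a, b) + (1.0 - weight) * token_jaccard(a, b)


def candidate_variants(seed_terms, headers, threshold=0.6):
    """Keep headers whose best score against any seed term meets the threshold."""
    return [
        header
        for header in headers
        if max(hybrid_similarity(header, seed) for seed in seed_terms) >= threshold
    ]


if __name__ == "__main__":
    seeds = ["elimination half-life"]
    headers = ["Elimination half-life (h)", "Dose (mg/kg)"]
    print(candidate_variants(seeds, headers))
```

In a pipeline like the one described, candidates shortlisted this way would still go through the LLM-based validation step, so the threshold can lean toward recall rather than precision.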
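The second stage described in the abstract (filter relevant rows, then convert the table to a key-value text format before handing it to an LLM) can be sketched as follows. This is a minimal illustration under stated assumptions: the table is represented as a list of dictionaries, the exact serialization format is invented here, and the helper names `filter_rows` and `table_to_key_value` are hypothetical rather than AutoPK's own. The LLM call that reconstructs the standardized table is omitted.

```python
def filter_rows(rows, param_variants, param_column):
    """Keep only rows whose parameter cell matches a known variant name."""
    wanted = {variant.strip().lower() for variant in param_variants}
    return [
        row
        for row in rows
        if str(row.get(param_column, "")).strip().lower() in wanted
    ]


def table_to_key_value(rows, caption="", footnotes=""):
    """Serialize rows as 'header: value' pairs, one line per row.

    The layout (caption first, footnotes last) is an assumption made for
    this sketch, not a format documented by the source.
    """
    lines = []
    if caption:
        lines.append(f"caption: {caption}")
    for index, row in enumerate(rows, start=1):
        cells = "; ".join(f"{key}: {value}" for key, value in row.items())
        lines.append(f"row {index} | {cells}")
    if footnotes:
        lines.append(f"footnotes: {footnotes}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only; not taken from the AutoPK dataset.
    table = [
        {"Parameter": "t1/2 (h)", "Value": "4.2"},
        {"Parameter": "Dose", "Value": "10"},
    ]
    relevant = filter_rows(table, ["t1/2 (h)"], "Parameter")
    print(table_to_key_value(relevant, caption="PK of drug X"))
```

Flattening each row into explicit header-value pairs keeps every value next to its column name, which is one reason a text format like this may be easier for an LLM to read than a raw grid with merged or nested headers.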