AutoPK: Leveraging LLMs and a Hybrid Similarity Metric for Advanced Retrieval of Pharmacokinetic Data from Complex Tables and Documents

Pharmacokinetics (PK) plays a critical role in drug development and regulatory decision-making for human and veterinary medicine, directly affecting public health through drug safety and efficacy assessments. However, PK data are often embedded in complex, heterogeneous tables with variable structures and inconsistent terminology, which poses significant challenges for automated PK data retrieval and standardization. We present AutoPK, a novel two-stage framework for accurate and scalable extraction of PK data from complex scientific tables. In the first stage, AutoPK identifies and extracts PK parameter variants using large language models (LLMs), a hybrid similarity metric, and LLM-based validation. In the second stage, it filters the relevant rows, converts the table into a key-value text format, and uses an LLM to reconstruct a standardized table. Evaluated on a real-world dataset of 605 PK tables, including captions and footnotes, AutoPK shows significant improvements in precision and recall over direct LLM baselines. For instance, AutoPK with LLaMA 3.1-70B achieved an F1-score of 0.92 on half-life and 0.91 on clearance parameters, outperforming direct use of LLaMA 3.1-70B by margins of 0.10 and 0.21, respectively. Smaller models such as Gemma 3-27B and Phi 3-12B achieved 2- to 7-fold F1 gains with AutoPK over their direct use, and Gemma's hallucination rates fell from 60-95% to 8-14%. Notably, AutoPK enabled open-source models like Gemma 3-27B to outperform commercial systems such as GPT-4o Mini on several PK parameters. AutoPK thus enables scalable, high-confidence PK data extraction that handles heterogeneous table structures and terminology and generalizes across key PK parameters, making it well suited for critical applications in veterinary pharmacology, drug safety monitoring, and public health decision-making. Code and data: https://github.com/hosseinsholehrasa/AutoPK
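The row-filtering and key-value conversion steps of the second stage can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `filter_rows` and `to_key_value_text`, the substring-matching rule, the `header: value` serialization, and the sample table values are all assumptions made for illustration. The parameter variants are assumed to have already been produced by the first stage.

```python
from typing import Dict, List, Set

Row = Dict[str, str]  # one table row, mapping column header -> cell text


def filter_rows(rows: List[Row], variants: Set[str]) -> List[Row]:
    """Keep rows in which any cell contains one of the parameter variants.

    Assumption: a case-insensitive substring match stands in for whatever
    matching rule the actual pipeline applies.
    """
    lowered = {v.lower() for v in variants}
    return [
        row for row in rows
        if any(v in cell.lower() for cell in row.values() for v in lowered)
    ]


def to_key_value_text(rows: List[Row]) -> str:
    """Serialize each row as 'header: value' pairs, one row per line."""
    return "\n".join(
        "; ".join(f"{header}: {value}" for header, value in row.items())
        for row in rows
    )


# Hypothetical table; the values are invented for this example only.
table = [
    {"Parameter": "t1/2 (h)", "Dose 5 mg/kg": "3.2", "Dose 10 mg/kg": "3.5"},
    {"Parameter": "Cmax (ug/mL)", "Dose 5 mg/kg": "1.1", "Dose 10 mg/kg": "2.4"},
    {"Parameter": "Elimination half-life (h)", "Dose 5 mg/kg": "3.1", "Dose 10 mg/kg": "3.4"},
]
# Hypothetical stage-one output: surface forms of the half-life parameter.
half_life_variants = {"t1/2", "half-life"}

relevant = filter_rows(table, half_life_variants)
prompt_text = to_key_value_text(relevant)
print(prompt_text)
```

In a pipeline of this shape, the flattened text would then be passed to an LLM together with the caption and footnotes, so that the model reconstructs a standardized table from explicit header-value pairs rather than from the raw table layout.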
