Employing General-Purpose and Biomedical Large Language Models with Advanced Prompt Engineering for Pharmacoepidemiologic Study Design - 专知论文

会员服务 ·

0

Prompt · 设计 · 语言模型化 · MoDELS · Engineering ·

Employing General-Purpose and Biomedical Large Language Models with Advanced Prompt Engineering for Pharmacoepidemiologic Study Design

翻译：暂无翻译

Xinyao Zhang,Nicole Sonne Heckmann,Manuela Del Castillo Suero,Francesco Paolo Speca,Maurizio Sessa

Background: The potential of large language models (LLMs) to automate and support pharmacoepidemiologic study design is an emerging area of interest, yet their reliability remains insufficiently characterized. General-purpose LLMs often display inaccuracies, while the comparative performance of specialized biomedical LLMs in this domain remains unknown. Methods: This study evaluated general-purpose LLMs (GPT-4o and DeepSeek-R1) versus biomedically fine-tuned LLMs (QuantFactory/Bio-Medical-Llama-3-8B-GGUF and Irathernotsay/qwen2-1.5B-medical_qa-Finetune) using 46 protocols (2018-2024) from the HMA-EMA Catalogue and Sentinel System. Performance was assessed across relevance, logic of justification, and ontology-code agreement across multiple coding systems using Least-to-Most (LTM) and Active Prompting strategies. Results: GPT-4o and DeepSeek-R1 paired with LTM prompting achieved the highest relevance and logic of justification scores, with GPT-4o-LTM reaching a median relevance score of 4 in 8 of 9 questions for HMA-EMA protocols. Biomedical LLMs showed lower relevance overall and frequently generated insufficient justification. All LLMs demonstrated limited proficiency in ontology-code mapping, although LTM provided the most consistent improvements in reasoning stability. Conclusion: Off-the-shelf general-purpose LLMs currently offer superior support for pharmacoepidemiologic design compared to biomedical LLMs. Prompt strategy strongly influenced LLM performance.

翻译：暂无翻译

0

相关内容

Prompt

基于大语言模型的医疗推理研究：综述与 MR-Bench 基准测试

基于大语言模型的医疗推理研究：综述与 MR-Bench 基准测试

专知会员服务

15+阅读 · 4月13日

科学大语言模型综述：从数据基础到智能体前沿

科学大语言模型综述：从数据基础到智能体前沿

专知会员服务

51+阅读 · 2025年9月1日

【ETZH博士论文】语言模型编程

【ETZH博士论文】语言模型编程

专知会员服务

25+阅读 · 2025年6月14日

从自动化到自主性：大型语言模型在科学发现中的应用综述

从自动化到自主性：大型语言模型在科学发现中的应用综述

专知会员服务

26+阅读 · 2025年5月20日

【MIT博士论文】医学人工智能中的自然语言基础模型

【MIT博士论文】医学人工智能中的自然语言基础模型

专知会员服务

15+阅读 · 2025年4月2日

【MIT博士论文】迈向人工神经科学：语言模型可解释性分析方法

【MIT博士论文】迈向人工神经科学：语言模型可解释性分析方法

专知会员服务

28+阅读 · 2025年4月1日

大规模语言模型在生物信息学中的应用

大规模语言模型在生物信息学中的应用

专知会员服务

18+阅读 · 2025年1月16日

LLM4SR：关于大规模语言模型在科学研究中的应用综述

LLM4SR：关于大规模语言模型在科学研究中的应用综述

专知会员服务

42+阅读 · 2025年1月9日

Nat. Med. | 医学中的大型语言模型

Nat. Med. | 医学中的大型语言模型

专知会员服务

58+阅读 · 2023年9月19日

【AAAI2020接受论文】隐式关系语言模型，CMU&微软，Latent Relation Language Models

【AAAI2020接受论文】隐式关系语言模型，CMU&微软，Latent Relation Language Models

专知会员服务

54+阅读 · 2019年11月12日

NLP领域最近比较火的Prompt，能否借鉴到多模态领域？一文跟进最新进展

NLP领域最近比较火的Prompt，能否借鉴到多模态领域？一文跟进最新进展

PaperWeekly

17+阅读 · 2022年3月8日

NLP大牛Thomas Wolf等新书《Transformer自然语言处理》，466页pdf及代码

NLP大牛Thomas Wolf等新书《Transformer自然语言处理》，466页pdf及代码

专知

36+阅读 · 2022年2月7日

【复旦大学】最新《预训练语言模型》2020综述论文大全，50+PTMs分类体系，25页pdf205篇参考文献

【复旦大学】最新《预训练语言模型》2020综述论文大全，50+PTMs分类体系，25页pdf205篇参考文献

专知

22+阅读 · 2020年3月19日

最新必读【预训练语言模型(BERT/XLNet等)】论文，Google/微软/华为ICLR2020提交论文

最新必读【预训练语言模型(BERT/XLNet等)】论文，Google/微软/华为ICLR2020提交论文

专知

36+阅读 · 2019年9月29日

【清华大学NLP】预训练语言模型（PLM）必读论文清单，附论文PDF、源码和模型链接

【清华大学NLP】预训练语言模型（PLM）必读论文清单，附论文PDF、源码和模型链接

专知

40+阅读 · 2019年9月27日

最新推荐 | 清华NLP图神经网络GNN论文分门别类，16大应用200+篇论文

最新推荐 | 清华NLP图神经网络GNN论文分门别类，16大应用200+篇论文

THU数据派

26+阅读 · 2019年8月11日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

近期语音类前沿论文

近期语音类前沿论文

深度学习每日摘要

14+阅读 · 2019年3月17日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

自然语言处理（二）机器翻译篇 (NLP: machine translation)

自然语言处理（二）机器翻译篇 (NLP: machine translation)

DeepLearning中文论坛

12+阅读 · 2015年7月1日

面向聋儿言语康复的多模态人机交互模型及技术研究

国家自然科学基金

3+阅读 · 2015年12月31日

基于格值逻辑的语言真值α-群锁语义归结自动推理研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于犹豫模糊语言信息的定性决策理论与方法

国家自然科学基金

2+阅读 · 2015年12月31日

系统科学与复杂性学报（英文版）

国家自然科学基金

12+阅读 · 2015年12月31日

学习与记忆的神经动力学研究

国家自然科学基金

1+阅读 · 2014年12月31日

中文句子语义概念图自动构建方法及应用研究

国家自然科学基金

3+阅读 · 2014年12月31日

面向词汇功能的学术文本语义识别与知识图谱构建

国家自然科学基金

5+阅读 · 2014年12月31日

基于格子Boltzmann方法的声致悬浮液滴动力学研究

国家自然科学基金

0+阅读 · 2014年12月31日

新型统计模型在精神疾病的基因、脑影像和行为数据整合中的应用

国家自然科学基金

0+阅读 · 2014年12月31日

维吾尔语单元集优化关键技术研究及其在语音识别中的应用

国家自然科学基金

0+阅读 · 2014年12月31日

MeTHanol: Modularized Thinking Language Models with Intermediate Layer Thinking, Decoding and Bootstrapping Reasoning

Arxiv

0+阅读 · 4月29日

Large language models eroding science understanding: an experimental study

Arxiv

0+阅读 · 4月28日

Large Language Model-Enhanced Relational Operators: Taxonomy, Benchmark, and Analysis

Arxiv

0+阅读 · 4月28日

Regression with Large Language Models for Materials and Molecular Property Prediction

Arxiv

0+阅读 · 4月21日

Emergent Structured Representations Support Flexible In-Context Inference in Large Language Models

Arxiv

0+阅读 · 4月20日

SimBench: Benchmarking the Ability of Large Language Models to Simulate Human Behaviors

Arxiv

0+阅读 · 4月13日

Multilingual Medical Reasoning for Question Answering with Large Language Models

Arxiv

0+阅读 · 3月30日

Evaluating the Reliability and Fidelity of Automated Judgment Systems of Large Language Models

Arxiv

0+阅读 · 3月23日

Deliberative multi-agent large language models improve clinical reasoning in ophthalmology

Arxiv

0+阅读 · 3月22日

Comparative Analysis of Large Language Models in Generating Telugu Responses for Maternal Health Queries

Arxiv

0+阅读 · 3月19日

VIP会员

文章信息

相关主题

语言模型化

最新内容

DeepSeek 版Claude Code，免费小白安装教程来了！

DeepSeek 版Claude Code，免费小白安装教程来了！

专知会员服务

7+阅读 · 5月5日

【ICML Spotlight 2026】 T²PO: 不确定性引导的探索控制框架，实现稳定多轮Agentic强化学习

【ICML Spotlight 2026】 T²PO: 不确定性引导的探索控制框架，实现稳定多轮Agentic强化学习

专知会员服务

3+阅读 · 5月5日

基础模型驱动的工业智能体：技术成熟度、能力变迁与未竟之挑战

基础模型驱动的工业智能体：技术成熟度、能力变迁与未竟之挑战

专知会员服务

3+阅读 · 5月5日

《机动炮兵的演进与未来：技术进步、历史沿革与炮兵作战前瞻》

《机动炮兵的演进与未来：技术进步、历史沿革与炮兵作战前瞻》

专知会员服务

4+阅读 · 5月5日

《火炮弹药快速效能建模：提升互操作性与技术优势》（报告）

《火炮弹药快速效能建模：提升互操作性与技术优势》（报告）

专知会员服务

5+阅读 · 5月5日

《美空军条令出版物 2-0：情报（2026版）》

《美空军条令出版物 2-0：情报（2026版）》

专知会员服务

12+阅读 · 5月5日

美陆军“飞蝇陷阱5.0”项目将新兴技术交到作战人员手中

美陆军“飞蝇陷阱5.0”项目将新兴技术交到作战人员手中

专知会员服务

4+阅读 · 5月5日

帕兰提尔 Gotham：一个游戏规则改变器

帕兰提尔 Gotham：一个游戏规则改变器

专知会员服务

6+阅读 · 5月5日

【ICML 2026】用测试时训练线性化视觉Transformer：T⁵ 实现 Softmax 注意力到线性复杂度的快速转换

【ICML 2026】用测试时训练线性化视觉Transformer：T⁵ 实现 Softmax 注意力到线性复杂度的快速转换

专知会员服务

2+阅读 · 5月5日

【AAAI 2026】大模型做知识蒸馏：CMM将LLM特征拆解给小模型协同学习

【AAAI 2026】大模型做知识蒸馏：CMM将LLM特征拆解给小模型协同学习

专知会员服务

2+阅读 · 5月5日

【ICML Spotlight 2026 】NonZero：交互引导探索的多智能体蒙特卡洛树搜索

【ICML Spotlight 2026 】NonZero：交互引导探索的多智能体蒙特卡洛树搜索

专知会员服务

8+阅读 · 5月4日

【综述】机器人学习中的世界模型：全面综述

【综述】机器人学习中的世界模型：全面综述

专知会员服务

11+阅读 · 5月4日

伊朗的导弹-无人机行动及其对美国威慑的影响

伊朗的导弹-无人机行动及其对美国威慑的影响

专知会员服务

9+阅读 · 5月4日

《未来战术无人机系统案例研究：量身定制采办策略方法》100页报告

《未来战术无人机系统案例研究：量身定制采办策略方法》100页报告

专知会员服务

9+阅读 · 5月4日

战争贩子：2026年第一季度美国对中东潜在军售激增

战争贩子：2026年第一季度美国对中东潜在军售激增

专知会员服务

7+阅读 · 5月4日

相关VIP内容

基于大语言模型的医疗推理研究：综述与 MR-Bench 基准测试

基于大语言模型的医疗推理研究：综述与 MR-Bench 基准测试

专知会员服务

15+阅读 · 4月13日

科学大语言模型综述：从数据基础到智能体前沿

科学大语言模型综述：从数据基础到智能体前沿

专知会员服务

51+阅读 · 2025年9月1日

【ETZH博士论文】语言模型编程

【ETZH博士论文】语言模型编程

专知会员服务

25+阅读 · 2025年6月14日

从自动化到自主性：大型语言模型在科学发现中的应用综述

从自动化到自主性：大型语言模型在科学发现中的应用综述

专知会员服务

26+阅读 · 2025年5月20日

【MIT博士论文】医学人工智能中的自然语言基础模型

【MIT博士论文】医学人工智能中的自然语言基础模型

专知会员服务

15+阅读 · 2025年4月2日

【MIT博士论文】迈向人工神经科学：语言模型可解释性分析方法

【MIT博士论文】迈向人工神经科学：语言模型可解释性分析方法

专知会员服务

28+阅读 · 2025年4月1日

大规模语言模型在生物信息学中的应用

大规模语言模型在生物信息学中的应用

专知会员服务

18+阅读 · 2025年1月16日

LLM4SR：关于大规模语言模型在科学研究中的应用综述

LLM4SR：关于大规模语言模型在科学研究中的应用综述

专知会员服务

42+阅读 · 2025年1月9日

Nat. Med. | 医学中的大型语言模型

Nat. Med. | 医学中的大型语言模型

专知会员服务

58+阅读 · 2023年9月19日

【AAAI2020接受论文】隐式关系语言模型，CMU&微软，Latent Relation Language Models

【AAAI2020接受论文】隐式关系语言模型，CMU&微软，Latent Relation Language Models

专知会员服务

54+阅读 · 2019年11月12日

热门VIP内容

开通专知VIP会员享更多权益服务

【ICML Spotlight 2026】 T²PO: 不确定性引导的探索控制框架，实现稳定多轮Agentic强化学习

《机动炮兵的演进与未来：技术进步、历史沿革与炮兵作战前瞻》

DeepSeek 版Claude Code，免费小白安装教程来了！

基础模型驱动的工业智能体：技术成熟度、能力变迁与未竟之挑战

相关资讯

NLP领域最近比较火的Prompt，能否借鉴到多模态领域？一文跟进最新进展

NLP领域最近比较火的Prompt，能否借鉴到多模态领域？一文跟进最新进展

PaperWeekly

17+阅读 · 2022年3月8日

NLP大牛Thomas Wolf等新书《Transformer自然语言处理》，466页pdf及代码

NLP大牛Thomas Wolf等新书《Transformer自然语言处理》，466页pdf及代码

专知

36+阅读 · 2022年2月7日

【复旦大学】最新《预训练语言模型》2020综述论文大全，50+PTMs分类体系，25页pdf205篇参考文献

【复旦大学】最新《预训练语言模型》2020综述论文大全，50+PTMs分类体系，25页pdf205篇参考文献

专知

22+阅读 · 2020年3月19日

最新必读【预训练语言模型(BERT/XLNet等)】论文，Google/微软/华为ICLR2020提交论文

最新必读【预训练语言模型(BERT/XLNet等)】论文，Google/微软/华为ICLR2020提交论文

专知

36+阅读 · 2019年9月29日

【清华大学NLP】预训练语言模型（PLM）必读论文清单，附论文PDF、源码和模型链接

【清华大学NLP】预训练语言模型（PLM）必读论文清单，附论文PDF、源码和模型链接

专知

40+阅读 · 2019年9月27日

最新推荐 | 清华NLP图神经网络GNN论文分门别类，16大应用200+篇论文

最新推荐 | 清华NLP图神经网络GNN论文分门别类，16大应用200+篇论文

THU数据派

26+阅读 · 2019年8月11日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

近期语音类前沿论文

近期语音类前沿论文

深度学习每日摘要

14+阅读 · 2019年3月17日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

自然语言处理（二）机器翻译篇 (NLP: machine translation)

自然语言处理（二）机器翻译篇 (NLP: machine translation)

DeepLearning中文论坛

12+阅读 · 2015年7月1日

相关论文

MeTHanol: Modularized Thinking Language Models with Intermediate Layer Thinking, Decoding and Bootstrapping Reasoning

Arxiv

0+阅读 · 4月29日

Large language models eroding science understanding: an experimental study

Arxiv

0+阅读 · 4月28日

Large Language Model-Enhanced Relational Operators: Taxonomy, Benchmark, and Analysis

Arxiv

0+阅读 · 4月28日

Regression with Large Language Models for Materials and Molecular Property Prediction

Arxiv

0+阅读 · 4月21日

Emergent Structured Representations Support Flexible In-Context Inference in Large Language Models

Arxiv

0+阅读 · 4月20日

SimBench: Benchmarking the Ability of Large Language Models to Simulate Human Behaviors

Arxiv

0+阅读 · 4月13日

Multilingual Medical Reasoning for Question Answering with Large Language Models

Arxiv

0+阅读 · 3月30日

Evaluating the Reliability and Fidelity of Automated Judgment Systems of Large Language Models

Arxiv

0+阅读 · 3月23日

Deliberative multi-agent large language models improve clinical reasoning in ophthalmology

Arxiv

0+阅读 · 3月22日

Comparative Analysis of Large Language Models in Generating Telugu Responses for Maternal Health Queries

Arxiv

0+阅读 · 3月19日

相关基金

面向聋儿言语康复的多模态人机交互模型及技术研究

国家自然科学基金

3+阅读 · 2015年12月31日

基于格值逻辑的语言真值α-群锁语义归结自动推理研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于犹豫模糊语言信息的定性决策理论与方法

国家自然科学基金

2+阅读 · 2015年12月31日

系统科学与复杂性学报（英文版）

国家自然科学基金

12+阅读 · 2015年12月31日

学习与记忆的神经动力学研究

国家自然科学基金

1+阅读 · 2014年12月31日

中文句子语义概念图自动构建方法及应用研究

国家自然科学基金

3+阅读 · 2014年12月31日

面向词汇功能的学术文本语义识别与知识图谱构建

国家自然科学基金

5+阅读 · 2014年12月31日

基于格子Boltzmann方法的声致悬浮液滴动力学研究

国家自然科学基金

0+阅读 · 2014年12月31日

新型统计模型在精神疾病的基因、脑影像和行为数据整合中的应用

国家自然科学基金

0+阅读 · 2014年12月31日

维吾尔语单元集优化关键技术研究及其在语音识别中的应用

国家自然科学基金

0+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员