Clinical Term Extraction using Open-Source Small Language Models - 专知论文

会员服务 ·

0

MoDELS · 语言模型化 · 基准 · Prompt · 微F1 ·

Clinical Term Extraction using Open-Source Small Language Models

翻译：暂无翻译

Noah Marchal,William E. Janes,Mihail Popescu,Xing Song

Clinical information for amyotrophic lateral sclerosis (ALS) care documented in unstructured clinical notes limits downstream analysis without extraction into structured formats. Open-source small language models with few-shot prompting for detecting the presence of ALS-relevant clinical terms in patient documentation were evaluated without task-specific training data. The detection task targeted 17 categories spanning functional scores, respiratory measures, medications, and related clinical and non-clinical attributes. Clinical note content was normalized from JSON-encoded discharge summaries and processed with a prompt template having structured JSON outputs. We compared 26 open-source models using aggregate, label-level, and manual-validation multilabel classification metrics. Manual validation showed that a regex rule baseline had higher overall micro-F1 and lower Hamming loss than any single SLM or TF-IDF baseline, while Qwen3-4B-Instruct-2507 was the highest-performing SLM by micro-F1. Model rankings varied by metric and label category, with the TF-IDF baseline showing high recall but low precision, some SLMs showing higher precision but lower recall, and Hammer2.1-7b showing strong performance for ALSFRS-R subscore detection. These findings support targeted hybrid extraction workflows rather than replacement of existing rule-based methods.

翻译：暂无翻译

0

相关内容

MoDELS

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

Cancer Cell综述｜AI用于肿瘤学中的多模态数据集成

Cancer Cell综述｜AI用于肿瘤学中的多模态数据集成

专知会员服务

35+阅读 · 2022年10月13日

Nature Medicine | AI与临床相结合，最新DECIDE-AI指南助力临床人工智能从开发到实施

Nature Medicine | AI与临床相结合，最新DECIDE-AI指南助力临床人工智能从开发到实施

专知会员服务

30+阅读 · 2022年5月22日

【CVPR 2022】跨模态检索的协同双流视觉-语言前训练模型，COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval

【CVPR 2022】跨模态检索的协同双流视觉-语言前训练模型，COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval

专知会员服务

13+阅读 · 2022年3月12日

【CVPR 2022】基于元内存传输的跨域少镜头语义分割，Remember the Difference: Cross-Domain Few-Shot Semantic Segmentation via Meta-Memory Transfer

【CVPR 2022】基于元内存传输的跨域少镜头语义分割，Remember the Difference: Cross-Domain Few-Shot Semantic Segmentation via Meta-Memory Transfer

专知会员服务

13+阅读 · 2022年3月12日

【斯坦福】从电子病历EHR构建知识图谱，Robustly Extracting Medical Knowledge from EHRs:A Case Study of Learning a Health Knowledge Graph

【斯坦福】从电子病历EHR构建知识图谱，Robustly Extracting Medical Knowledge from EHRs:A Case Study of Learning a Health Knowledge Graph

专知会员服务

56+阅读 · 2020年6月2日

【2020关键词提取】医学报告的关键词提取和结构化，Keyword extraction and structuralization of medical reports

【2020关键词提取】医学报告的关键词提取和结构化，Keyword extraction and structuralization of medical reports

专知会员服务

33+阅读 · 2020年5月2日

【论文推荐】用于低资源药物发现的元学习初始化，Meta-Learning Initializations for Low-Resource Drug Discovery

【论文推荐】用于低资源药物发现的元学习初始化，Meta-Learning Initializations for Low-Resource Drug Discovery

专知会员服务

27+阅读 · 2020年3月26日

【AAAI2020论文】概念结构化嵌入医疗文本表示（Learning Conceptual-Contextual Embeddings for Medical Text）

【AAAI2020论文】概念结构化嵌入医疗文本表示（Learning Conceptual-Contextual Embeddings for Medical Text）

专知会员服务

50+阅读 · 2019年11月15日

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

专知会员服务

24+阅读 · 2019年11月11日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

ICCV 2019 最佳论文《SinGAN：从单张自然图像学习生成式模型》中文全译

ICCV 2019 最佳论文《SinGAN：从单张自然图像学习生成式模型》中文全译

AI科技评论

11+阅读 · 2019年10月30日

【论文】Awesome Relation Extraction Paper（关系抽取）（PART IV）

【论文】Awesome Relation Extraction Paper（关系抽取）（PART IV）

AINLP

15+阅读 · 2019年8月26日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

文本分类又来了，用 Scikit-Learn 解决多类文本分类问题

文本分类又来了，用 Scikit-Learn 解决多类文本分类问题

AI研习社

14+阅读 · 2018年7月22日

Word2Vec —— 深度学习的一小步，自然语言处理的一大步

Word2Vec —— 深度学习的一小步，自然语言处理的一大步

AI研习社

21+阅读 · 2018年6月14日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

论文浅尝 | Open world Knowledge Graph Completion

论文浅尝 | Open world Knowledge Graph Completion

开放知识图谱

19+阅读 · 2018年1月30日

论文浅尝 | Improved Neural Relation Detection for KBQA

论文浅尝 | Improved Neural Relation Detection for KBQA

开放知识图谱

13+阅读 · 2018年1月21日

面向跨领域异构数据的患者相似性学习方法及应用

国家自然科学基金

23+阅读 · 2016年12月31日

基于微透析在线分析技术和代谢组学方法的人参远志配伍治疗阿尔茨海默病的药效物质基础和作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

人口老龄化背景下阿尔茨海默病患者卫生服务利用与经济保护模型研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于自噬系统mTOR信号通路探讨扶正祛邪中药小复方干预阿尔茨海默病模型的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于mTOR和Mnk1双靶点策略的苦参等5种中药抑制翻译起始活性成分研究

国家自然科学基金

0+阅读 · 2015年12月31日

林麝化脓性疾病与细菌识别相关TLR基因表达及多态性的关系研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于纳米胶束的肿瘤治疗和成像一体化的多功能药物传递系统的构建与评价

国家自然科学基金

0+阅读 · 2015年12月31日

放射诱导内皮细胞源Microvesicles通过TGF-β/Smad信号通路介导口腔癌放疗抗性的研究

国家自然科学基金

0+阅读 · 2015年12月31日

外源性CO通过GC/cGMP/PKG信号通路调控脓毒症时EC/SMC互调作用的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

骨髓基质神经干细胞联合法舒地尔多视角治疗AD的探索

国家自然科学基金

0+阅读 · 2014年12月31日

LLM-MINE: Large Language Model based Alzheimer's Disease and Related Dementias Phenotypes Mining from Clinical Notes

Arxiv

0+阅读 · 6月22日

Unlocking In-Context Learning in Audio-Language Models from Decentralized Medical Audio

Arxiv

0+阅读 · 6月22日

Prompt, Plan, Extract: Zero-Shot Agentic LLMs Workflows for Lung Pathology Extraction from Clinical Narratives

Arxiv

0+阅读 · 6月18日

Configurable Clinical Information Extraction with Agentic RAG: What Works, What Breaks, and Why

Arxiv

0+阅读 · 6月17日

Assessing covariate-adjusted risk differences in small-sample clinical trials

Arxiv

0+阅读 · 6月16日

MedicalAgentsBench for Complex Medical Reasoning: Comparing Internalized Reasoning Models versus Externalized Agent-based Frameworks

Arxiv

0+阅读 · 6月16日

Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition

Arxiv

15+阅读 · 2021年4月12日

Graph-Evolving Meta-Learning for Low-Resource Medical Dialogue Generation

Arxiv

20+阅读 · 2020年12月22日

Semi-supervised Medical Image Segmentation through Dual-task Consistency

Arxiv

14+阅读 · 2020年9月9日

Consensus Based Medical Image Segmentation Using Semi-Supervised Learning And Graph Cuts

Arxiv

11+阅读 · 2018年5月21日

VIP会员

文章信息

相关主题

语言模型化

最新内容

无人机自主控制与人工智能：系统性综述

无人机自主控制与人工智能：系统性综述

专知会员服务

5+阅读 · 今天7:25

巡飞弹与反无人机系统——现代战场的两大支柱

巡飞弹与反无人机系统——现代战场的两大支柱

专知会员服务

2+阅读 · 今天6:54

《打造“黄金舰队”》57页报告

《打造“黄金舰队”》57页报告

专知会员服务

1+阅读 · 今天6:52

《北约数字教官网络发展路径》128页报告

《北约数字教官网络发展路径》128页报告

专知会员服务

1+阅读 · 今天6:33

ECCV 2026 | MIMFlow：MIM与归一化流统一图像生成

ECCV 2026 | MIMFlow：MIM与归一化流统一图像生成

专知会员服务

6+阅读 · 6月25日

超越自回归边界：扩散模型、世界模型与SSM如何重塑代码智能

超越自回归边界：扩散模型、世界模型与SSM如何重塑代码智能

专知会员服务

5+阅读 · 6月25日

重塑决策优势：美军作战艺术与多域作战中联盟联合全域指挥控制（CJADC2）体系的融合

重塑决策优势：美军作战艺术与多域作战中联盟联合全域指挥控制（CJADC2）体系的融合

专知会员服务

9+阅读 · 6月25日

网状网络及其在军事领域的运用

网状网络及其在军事领域的运用

专知会员服务

7+阅读 · 6月25日

《意识即战场——全球安全体系中认知战的演进：乌克兰构建认知作战体系的展望》

《意识即战场——全球安全体系中认知战的演进：乌克兰构建认知作战体系的展望》

专知会员服务

8+阅读 · 6月25日

无美国参与的欧洲战争方式（万字长文）

无美国参与的欧洲战争方式（万字长文）

专知会员服务

8+阅读 · 6月25日

重构“下一场战争”的制胜理论：超越兰彻斯特方程与现代系统

重构“下一场战争”的制胜理论：超越兰彻斯特方程与现代系统

专知会员服务

10+阅读 · 6月25日

《国防工业中基于模型定义的实施：产品定义数字化转型的战略路径》90页

《国防工业中基于模型定义的实施：产品定义数字化转型的战略路径》90页

专知会员服务

9+阅读 · 6月25日

《国防领域敏感性分析白皮书》

《国防领域敏感性分析白皮书》

专知会员服务

9+阅读 · 6月25日

综述 | 从问答到任务完成：Agent系统与Harness设计

综述 | 从问答到任务完成：Agent系统与Harness设计

专知会员服务

10+阅读 · 6月24日

Agentic RL：框架、实践与长程智能体训练

Agentic RL：框架、实践与长程智能体训练

专知会员服务

10+阅读 · 6月24日

相关VIP内容

Cancer Cell综述｜AI用于肿瘤学中的多模态数据集成

Cancer Cell综述｜AI用于肿瘤学中的多模态数据集成

专知会员服务

35+阅读 · 2022年10月13日

Nature Medicine | AI与临床相结合，最新DECIDE-AI指南助力临床人工智能从开发到实施

Nature Medicine | AI与临床相结合，最新DECIDE-AI指南助力临床人工智能从开发到实施

专知会员服务

30+阅读 · 2022年5月22日

【CVPR 2022】跨模态检索的协同双流视觉-语言前训练模型，COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval

【CVPR 2022】跨模态检索的协同双流视觉-语言前训练模型，COTS: Collaborative Two-Stream Vision-Language Pre-Training Model for Cross-Modal Retrieval

专知会员服务

13+阅读 · 2022年3月12日

【CVPR 2022】基于元内存传输的跨域少镜头语义分割，Remember the Difference: Cross-Domain Few-Shot Semantic Segmentation via Meta-Memory Transfer

【CVPR 2022】基于元内存传输的跨域少镜头语义分割，Remember the Difference: Cross-Domain Few-Shot Semantic Segmentation via Meta-Memory Transfer

专知会员服务

13+阅读 · 2022年3月12日

【斯坦福】从电子病历EHR构建知识图谱，Robustly Extracting Medical Knowledge from EHRs:A Case Study of Learning a Health Knowledge Graph

【斯坦福】从电子病历EHR构建知识图谱，Robustly Extracting Medical Knowledge from EHRs:A Case Study of Learning a Health Knowledge Graph

专知会员服务

56+阅读 · 2020年6月2日

【2020关键词提取】医学报告的关键词提取和结构化，Keyword extraction and structuralization of medical reports

【2020关键词提取】医学报告的关键词提取和结构化，Keyword extraction and structuralization of medical reports

专知会员服务

33+阅读 · 2020年5月2日

【论文推荐】用于低资源药物发现的元学习初始化，Meta-Learning Initializations for Low-Resource Drug Discovery

【论文推荐】用于低资源药物发现的元学习初始化，Meta-Learning Initializations for Low-Resource Drug Discovery

专知会员服务

27+阅读 · 2020年3月26日

【AAAI2020论文】概念结构化嵌入医疗文本表示（Learning Conceptual-Contextual Embeddings for Medical Text）

【AAAI2020论文】概念结构化嵌入医疗文本表示（Learning Conceptual-Contextual Embeddings for Medical Text）

专知会员服务

50+阅读 · 2019年11月15日

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

【CoRL2019最佳论文】模仿学习，A Divergence Minimization Perspective on Imitation Learning Methods

专知会员服务

24+阅读 · 2019年11月11日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

巡飞弹与反无人机系统——现代战场的两大支柱

《北约数字教官网络发展路径》128页报告

无人机自主控制与人工智能：系统性综述

《打造“黄金舰队”》57页报告

相关资讯

ICCV 2019 最佳论文《SinGAN：从单张自然图像学习生成式模型》中文全译

ICCV 2019 最佳论文《SinGAN：从单张自然图像学习生成式模型》中文全译

AI科技评论

11+阅读 · 2019年10月30日

【论文】Awesome Relation Extraction Paper（关系抽取）（PART IV）

【论文】Awesome Relation Extraction Paper（关系抽取）（PART IV）

AINLP

15+阅读 · 2019年8月26日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

文本分类又来了，用 Scikit-Learn 解决多类文本分类问题

文本分类又来了，用 Scikit-Learn 解决多类文本分类问题

AI研习社

14+阅读 · 2018年7月22日

Word2Vec —— 深度学习的一小步，自然语言处理的一大步

Word2Vec —— 深度学习的一小步，自然语言处理的一大步

AI研习社

21+阅读 · 2018年6月14日

STRCF for Visual Object Tracking

STRCF for Visual Object Tracking

统计学习与视觉计算组

15+阅读 · 2018年5月29日

论文浅尝 | Open world Knowledge Graph Completion

论文浅尝 | Open world Knowledge Graph Completion

开放知识图谱

19+阅读 · 2018年1月30日

论文浅尝 | Improved Neural Relation Detection for KBQA

论文浅尝 | Improved Neural Relation Detection for KBQA

开放知识图谱

13+阅读 · 2018年1月21日

相关论文

LLM-MINE: Large Language Model based Alzheimer's Disease and Related Dementias Phenotypes Mining from Clinical Notes

Arxiv

0+阅读 · 6月22日

Unlocking In-Context Learning in Audio-Language Models from Decentralized Medical Audio

Arxiv

0+阅读 · 6月22日

Prompt, Plan, Extract: Zero-Shot Agentic LLMs Workflows for Lung Pathology Extraction from Clinical Narratives

Arxiv

0+阅读 · 6月18日

Configurable Clinical Information Extraction with Agentic RAG: What Works, What Breaks, and Why

Arxiv

0+阅读 · 6月17日

Assessing covariate-adjusted risk differences in small-sample clinical trials

Arxiv

0+阅读 · 6月16日

MedicalAgentsBench for Complex Medical Reasoning: Comparing Internalized Reasoning Models versus Externalized Agent-based Frameworks

Arxiv

0+阅读 · 6月16日

Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition

Arxiv

15+阅读 · 2021年4月12日

Graph-Evolving Meta-Learning for Low-Resource Medical Dialogue Generation

Arxiv

20+阅读 · 2020年12月22日

Semi-supervised Medical Image Segmentation through Dual-task Consistency

Arxiv

14+阅读 · 2020年9月9日

Consensus Based Medical Image Segmentation Using Semi-Supervised Learning And Graph Cuts

Arxiv

11+阅读 · 2018年5月21日

相关基金

面向跨领域异构数据的患者相似性学习方法及应用

国家自然科学基金

23+阅读 · 2016年12月31日

基于微透析在线分析技术和代谢组学方法的人参远志配伍治疗阿尔茨海默病的药效物质基础和作用机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

人口老龄化背景下阿尔茨海默病患者卫生服务利用与经济保护模型研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于自噬系统mTOR信号通路探讨扶正祛邪中药小复方干预阿尔茨海默病模型的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于mTOR和Mnk1双靶点策略的苦参等5种中药抑制翻译起始活性成分研究

国家自然科学基金

0+阅读 · 2015年12月31日

林麝化脓性疾病与细菌识别相关TLR基因表达及多态性的关系研究

国家自然科学基金

0+阅读 · 2015年12月31日

基于纳米胶束的肿瘤治疗和成像一体化的多功能药物传递系统的构建与评价

国家自然科学基金

0+阅读 · 2015年12月31日

放射诱导内皮细胞源Microvesicles通过TGF-β/Smad信号通路介导口腔癌放疗抗性的研究

国家自然科学基金

0+阅读 · 2015年12月31日

外源性CO通过GC/cGMP/PKG信号通路调控脓毒症时EC/SMC互调作用的机制研究

国家自然科学基金

0+阅读 · 2014年12月31日

骨髓基质神经干细胞联合法舒地尔多视角治疗AD的探索

国家自然科学基金

0+阅读 · 2014年12月31日

微信扫码咨询专知VIP会员