Adapting language models to the clinical domain through continued pretraining and fine-tuning requires costly retraining for each new model generation. We propose Cross-Architecture Proxy Tuning (CAPT), a model-ensembling approach that enables training-free adaptation of state-of-the-art general-domain models using existing clinical models. CAPT supports models with disjoint vocabularies, leveraging contrastive decoding to selectively inject clinically relevant signals while preserving the general-domain model's reasoning and fluency. On six clinical classification and text-generation tasks, CAPT with a new-generation general-domain model and an older-generation clinical model consistently outperforms both models individually as well as state-of-the-art ensembling approaches (on average, +17.6% over UniTE and +41.4% over proxy tuning). Through token-level analysis and physician case studies, we demonstrate that CAPT amplifies clinically actionable language, reduces context errors, and increases clinical specificity.
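To make the contrastive-decoding idea concrete, the sketch below shows the logit arithmetic underlying standard proxy tuning, which CAPT generalizes: the general-domain model's next-token logits are shifted by the difference between a clinical expert and its untuned base. This is a minimal illustration under a shared-vocabulary assumption; CAPT's actual contribution is relaxing that assumption to disjoint vocabularies, which this toy omits, and all names and the `alpha` weight here are illustrative rather than the paper's implementation.

```python
import numpy as np

def proxy_tuned_logits(general_logits, clinical_logits, clinical_base_logits, alpha=1.0):
    """Shift the general model's next-token logits by the 'clinical delta':
    the difference between a clinical expert and its untuned base model.
    Assumes all three models share one vocabulary (CAPT relaxes this)."""
    return general_logits + alpha * (clinical_logits - clinical_base_logits)

# Toy example over a 5-token vocabulary with random logits.
rng = np.random.default_rng(0)
g = rng.normal(size=5)  # new-generation general-domain model
e = rng.normal(size=5)  # older-generation clinical expert
b = rng.normal(size=5)  # the clinical expert's general-domain base

combined = proxy_tuned_logits(g, e, b, alpha=0.5)
probs = np.exp(combined - combined.max())
probs /= probs.sum()  # softmax over the adjusted logits
print(probs)
```

Tuning `alpha` toward 0 recovers the general model's distribution, while larger values inject more of the clinical signal; the selectivity and fluency preservation claimed in the abstract come from this delta acting only where the clinical expert and its base disagree.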