We present Eir-8B, a large language model with 8 billion parameters, designed to improve accuracy on medical tasks in the Thai language. The model focuses on providing clear, easy-to-understand answers for both healthcare professionals and patients, thereby improving the efficiency of diagnosis and treatment. Human evaluation was conducted to ensure that the model adheres to standards of care and provides unbiased answers. To prioritize data security, the model is deployed within the hospital's internal network, ensuring both high security and fast processing. The internal API connection is secured with encryption and strict authentication to prevent data leaks and unauthorized access. We evaluated several open-source large language models with 8 billion parameters on four medical benchmarks: MedQA, MedMCQA, PubMedQA, and the medical subset of MMLU. The best-performing baselines were used to develop Eir-8B. Our evaluation employed multiple prompting strategies, including zero-shot, few-shot, chain-of-thought reasoning, and ensemble/self-consistency voting. Our model outperformed commercially available Thai-language large language models by more than 10%. In addition, we developed an enhanced evaluation tailored for clinical use in Thai, covering 18 clinical tasks, on which our model exceeded GPT-4o's performance by more than 11%.
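The self-consistency voting mentioned above is commonly implemented as a majority vote over answers extracted from multiple independently sampled chain-of-thought completions. The following is a minimal sketch of that aggregation step, not the authors' actual implementation; the function name and input format are illustrative assumptions.

```python
from collections import Counter

def self_consistency_vote(sampled_answers):
    """Return the majority answer among independently sampled model outputs.

    sampled_answers: a list of final answer labels (e.g. multiple-choice
    letters) extracted from several chain-of-thought samples for the
    same question. Ties resolve by first-seen order via Counter.
    """
    counts = Counter(sampled_answers)
    # most_common(1) returns [(label, count)]; take the top label.
    return counts.most_common(1)[0][0]

# Example: five sampled reasoning chains yield these final answers.
print(self_consistency_vote(["A", "B", "A", "A", "C"]))  # majority label: A
```

In practice, each element of `sampled_answers` would come from parsing one temperature-sampled generation, so the vote rewards answers the model reaches through multiple distinct reasoning paths.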