The rapid proliferation of generative AI has raised questions about the competitiveness of lower-parameter, locally tunable, open-weight models relative to high-parameter, API-guarded, closed-weight models in terms of performance, domain adaptation, cost, and generalization. Centering under-resourced yet risk-intolerant settings in government, research, and healthcare, we argue that for-profit closed-weight models are incompatible with requirements for transparency, privacy, adaptability, and standards of evidence. Yet the performance penalty of using open-weight models, especially in low-data and low-resource settings, remains unclear. We assess the feasibility of using smaller, open-weight models to replace GPT-4-Turbo in zero-shot, few-shot, and fine-tuned regimes, assuming access to only a single, low-cost GPU. On three additional tasks, we examine value-sensitive issues around bias, privacy, and abstention. We find that with relatively low effort, very low absolute monetary cost, and relatively little fine-tuning data, small open-weight models can achieve competitive performance on domain-adapted tasks without sacrificing generality. We then run experiments on practical issues of bias, privacy, and hallucination risk, finding that open models offer several benefits over closed models. We intend this work as a case study in the opportunity cost of choosing reproducibility and transparency over for-profit state-of-the-art zero-shot performance, and find this cost to be marginal under realistic settings.