We demonstrate a proof of concept of a large language model conducting corporate lobbying-related activities. An autoregressive large language model (OpenAI's text-davinci-003) determines whether proposed U.S. Congressional bills are relevant to specific public companies and provides explanations and confidence levels. For the bills the model deems relevant, it drafts a letter to the sponsor of the bill in an attempt to persuade the congressperson to make changes to the proposed legislation. We use hundreds of novel ground-truth labels of the relevance of a bill to a company to benchmark the performance of the model. The model outperforms the baseline of always predicting the most common outcome, irrelevance. We also benchmark the performance of the previous OpenAI GPT-3 model (text-davinci-002), which was the state-of-the-art model on many academic natural language tasks until text-davinci-003 was recently released. The performance of text-davinci-002 is worse than the simple baseline. In the longer term, if AI begins to influence law in a manner that is not a direct extension of human intentions, this threatens the critical role that law as information could play in aligning AI with humans. Initially, AI is being used simply to augment human lobbyists on a small portion of their daily tasks. However, firms have an incentive to use progressively less human oversight over automated assessments of policy ideas and over written communication to regulatory agencies and Congressional staffers. The core question raised is where to draw the line between human-driven and AI-driven policy influence.
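The comparison against the "most common outcome" baseline described in the abstract can be illustrated with a minimal sketch. The labels and predictions below are hypothetical toy values, not the paper's data, and the function names are illustrative assumptions; the sketch only shows how accuracy against a majority-class baseline might be computed.

```python
from collections import Counter


def accuracy(predictions, labels):
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("labels must be non-empty")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most common label."""
    most_common_label, _ = Counter(labels).most_common(1)[0]
    return accuracy([most_common_label] * len(labels), labels)


# Hypothetical toy data: True means "bill is relevant to the company".
labels = [False, False, False, True, False, True, False, False]
model_predictions = [False, False, False, True, False, True, True, False]

baseline_acc = majority_baseline_accuracy(labels)  # always predicts "irrelevant"
model_acc = accuracy(model_predictions, labels)

print(f"baseline accuracy: {baseline_acc:.3f}")
print(f"model accuracy:    {model_acc:.3f}")
```

Because irrelevance is the most common label, a classifier that always answers "irrelevant" can score well on accuracy, so a model is informative only if it exceeds that figure.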