Full Line Code Completion: Bringing AI to Desktop

Anton Semenkin, Vitaliy Bibaev, Yaroslav Sokolov, Kirill Krylov, Alexey Kalina, Anna Khannanova, Danila Savenkov, Darya Rovdo, Igor Davidenko, Kirill Karnaukhov, Maxim Vakhrushev, Mikhail Kostyukov, Mikhail Podvitskii, Petr Surkov, Yaroslav Golubev, Nikita Povarov, Timofey Bryksin

From arXiv, 12 pages, 4 figures

In recent years, several industrial solutions for multi-token code completion have appeared, each a significant advance in the area, but most of them rely on a cloud-based runtime and avoid running on the end user's device. In this work, we describe our approach to building a multi-token code completion feature for JetBrains' IntelliJ Platform, which we call Full Line Code Completion. The feature suggests only syntactically correct code and works fully locally, i.e., data querying and the generation of suggestions happen on the end user's machine. We share the important time and memory consumption restrictions, as well as the design principles, that a code completion engine should satisfy. Working entirely on the end user's device, our code completion engine enriches the user experience while being not only fast and compact but also secure. We share a number of useful techniques for meeting the stated development constraints and describe the offline and online evaluation pipelines that allowed us to make better decisions. Our online evaluation shows that using the tool leads to 1.5 times more code in the IDE being produced by code completion. The described solution was started with the help of researchers and was bundled into two JetBrains IDEs, PyCharm Pro and DataSpell, at the end of 2023, so we believe that this work is useful for bridging academia and industry, providing researchers with knowledge of what happens when complex research-based solutions are integrated into real products.
