Panza: Design and Analysis of a Fully-Local Personalized Text Writing Assistant

The availability of powerful open-source large language models (LLMs) opens up exciting use cases, such as automated personal assistants that adapt to the user's unique data and demands. Two key requirements for such assistants are personalization, in the sense that the assistant should reflect the user's own writing style, and privacy, in the sense that users may prefer to always store their personal data locally, on their own computing device. In this application paper, we present Panza, a new design and evaluation for such an automated assistant, for the specific use case of email generation. Specifically, Panza can be trained and deployed locally on commodity hardware, and is personalized to the user's writing style. Panza's personalization features are based on a combination of fine-tuning, using a variant of the Reverse Instructions technique, and Retrieval-Augmented Generation (RAG). We demonstrate that this combination allows us to fine-tune an LLM to better reflect a user's writing style using limited data, while executing on extremely limited resources, e.g., on a free Google Colab instance. Our key methodological contribution is what we believe to be the first detailed study of evaluation metrics for this personalized writing task, and of how different choices of system components, e.g., the use of RAG and of different fine-tuning approaches, impact the system's performance. We are releasing the full Panza code as well as a new "David" personalized email dataset licensed for research use, both available at https://github.com/IST-DASLab/PanzaMail.
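To make the two personalization ingredients concrete, the following is a minimal sketch and not the Panza implementation. It assumes that Reverse Instructions means pairing each of the user's existing emails with a synthesized instruction that could have produced it, and that RAG means prepending the user's most similar past emails to the prompt. All function names are hypothetical; `summarize_to_instruction` stands in for an LLM call with a trivial heuristic, and the bag-of-words retriever stands in for whatever embedding-based retrieval a real system would use.

```python
import math
import re
from collections import Counter


def summarize_to_instruction(email: str) -> str:
    """Hypothetical stand-in for an LLM that writes the instruction a user
    might have given to obtain `email`. Here: reuse the first sentence."""
    first_sentence = re.split(r"(?<=[.!?])\s+", email.strip())[0]
    return f"Write an email that says: {first_sentence}"


def build_reverse_instruction_pairs(emails):
    """Turn raw sent emails into (instruction, email) fine-tuning pairs."""
    return [(summarize_to_instruction(e), e) for e in emails]


def _bow(text):
    """Lowercased bag-of-words counts (assumed toy text representation)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def _cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, emails, k=2):
    """Return the k past emails most similar to the query."""
    q = _bow(query)
    ranked = sorted(emails, key=lambda e: _cosine(q, _bow(e)), reverse=True)
    return ranked[:k]


def build_prompt(instruction, emails, k=2):
    """Assemble a retrieval-augmented prompt for the fine-tuned model."""
    context = "\n---\n".join(retrieve(instruction, emails, k))
    return (
        f"Previous emails by this user:\n{context}\n\n"
        f"Instruction: {instruction}\nEmail:"
    )
```

In this sketch, the pairs returned by `build_reverse_instruction_pairs` would serve as supervised fine-tuning data, while `build_prompt` would be used at inference time so that the model sees both the instruction and a few of the user's own emails as stylistic context.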
