Retrieval-augmented text generation (RAG) addresses the common limitations of large language models (LLMs), such as hallucination, by retrieving information from an updatable external knowledge base. However, existing approaches often require dedicated backend servers for data storage and retrieval, thereby limiting their applicability in use cases that require strict data privacy, such as personal finance, education, and medicine. To address the pressing need for client-side dense retrieval, we introduce MeMemo, the first open-source JavaScript toolkit that adapts the state-of-the-art approximate nearest neighbor search technique HNSW to browser environments. Developed with modern and native Web technologies, such as IndexedDB and Web Workers, our toolkit leverages client-side hardware capabilities to enable researchers and developers to efficiently search through millions of high-dimensional vectors in the browser. MeMemo enables exciting new design and research opportunities, such as private and personalized content creation and interactive prototyping, as demonstrated in our example application RAG Playground. Reflecting on our work, we discuss the opportunities and challenges for on-device dense retrieval. MeMemo is available at https://github.com/poloclub/mememo.
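To make the retrieval step concrete, the following is a minimal sketch of exact (brute-force) dense retrieval over in-memory embeddings using cosine similarity. It is not MeMemo's API and it is not HNSW: it scores every stored vector for each query, which is the exact baseline that approximate nearest neighbor indexes such as HNSW are designed to avoid at large scale. The names `Doc`, `cosineSimilarity`, and `topK` are hypothetical and chosen only for illustration.

```typescript
// Illustrative sketch only: hypothetical names, not MeMemo's API.
// Exact top-k retrieval by cosine similarity over in-memory embeddings.

interface Doc {
  id: string;
  embedding: number[];
}

interface ScoredDoc {
  id: string;
  score: number;
}

// Cosine similarity between two vectors of equal dimension.
// Returns 0 if either vector has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force search: scores every document, then keeps the k best.
// Cost per query grows linearly with the number of documents.
function topK(query: number[], docs: Doc[], k: number): ScoredDoc[] {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(query, doc.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

In a RAG pipeline, the text associated with the returned identifiers would be added to the LLM prompt as context. An index such as HNSW replaces the linear scan in `topK` with a graph traversal that visits only a small fraction of the stored vectors, trading exactness for speed.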