Personalized virtual assistants powered by large language models (LLMs) on edge devices are attracting growing attention, with Retrieval-Augmented Generation (RAG) emerging as a key method for personalization: it retrieves relevant profile data and generates tailored responses. However, deploying RAG on edge devices faces efficiency hurdles because profile data, such as user-LLM interactions and recent updates, grows rapidly. Computing-in-Memory (CiM) architectures mitigate this bottleneck by performing operations in situ, which eliminates data movement between memory and processing units, but they are susceptible to environmental noise that can degrade retrieval precision. This is a critical issue in dynamic, multi-domain edge scenarios (e.g., travel, medicine, and law), where both accuracy and adaptability are paramount. To address these challenges, we propose Task-Oriented Noise-resilient Embedding Learning (TONEL), a framework that improves noise robustness and domain adaptability for RAG in noisy edge environments. TONEL employs a noise-aware projection model to learn task-specific embeddings compatible with CiM hardware constraints, enabling accurate retrieval under noisy conditions. Extensive experiments on personalization benchmarks demonstrate the effectiveness and practicality of our method relative to strong baselines, especially in task-specific noisy scenarios.
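To make the problem setting concrete, the following is a minimal sketch of how noise on stored embeddings can degrade top-1 retrieval. It is not TONEL and not the authors' implementation: the additive Gaussian noise model, the synthetic random profiles, the embedding size, and all function names (`add_device_noise`, `retrieve_top1`, `top1_accuracy`) are assumptions chosen purely for illustration.

```python
import numpy as np


def normalize(x):
    """Scale each row to unit L2 norm."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def add_device_noise(stored, sigma, rng):
    """Perturb stored embeddings with additive Gaussian noise.

    Assumption: a simple stand-in for CiM device noise, not a hardware model.
    """
    return stored + sigma * rng.standard_normal(stored.shape)


def retrieve_top1(queries, stored):
    """For each query, return the index of the stored row with the largest inner product."""
    return np.argmax(queries @ stored.T, axis=1)


def top1_accuracy(queries, stored, labels, sigma, rng):
    """Fraction of queries whose top-1 match is correct after noise is applied to `stored`."""
    noisy = add_device_noise(stored, sigma, rng)
    return float(np.mean(retrieve_top1(queries, noisy) == labels))


rng = np.random.default_rng(0)

# Synthetic "profile" embeddings (hypothetical data): 200 items, 64 dimensions.
profiles = normalize(rng.standard_normal((200, 64)))
labels = np.arange(200)

# Each query is a slightly perturbed copy of the profile it should retrieve.
queries = normalize(profiles + 0.1 * rng.standard_normal(profiles.shape))

# Report retrieval accuracy as the assumed noise level grows.
for sigma in (0.0, 0.1, 0.3):
    print(f"sigma={sigma}: top-1 accuracy = {top1_accuracy(queries, profiles, labels, sigma, rng):.3f}")
```

A noise-aware embedding approach would aim to keep this accuracy high as `sigma` increases, for example by exposing the embedding model to such perturbations during training; the sketch only measures the degradation and does not attempt that step.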