PrivGemo: Privacy-Preserving Dual-Tower Graph Retrieval for Empowering LLM Reasoning with Memory Augmentation

Knowledge graphs (KGs) provide structured evidence that can ground large language model (LLM) reasoning for knowledge-intensive question answering. However, many practical KGs are private, and sending retrieved triples or exploration traces to closed-source LLM APIs introduces leakage risk. Existing privacy treatments focus on masking entity names, but they still face four limitations: structural leakage under semantic masking, uncontrollable remote interaction, fragile multi-hop and multi-entity reasoning, and limited experience reuse for stability and efficiency. To address these issues, we propose PrivGemo, a privacy-preserving retrieval-augmented framework for KG-grounded reasoning with memory-guided exposure control. PrivGemo uses a dual-tower design to keep raw KG knowledge local while enabling remote reasoning over an anonymized view that goes beyond name masking to limit both semantic and structural exposure. PrivGemo supports multi-hop, multi-entity reasoning by retrieving anonymized long-hop paths that connect all topic entities, while keeping grounding and verification on the local KG. A hierarchical controller and a privacy-aware experience memory further reduce unnecessary exploration and remote interactions. Comprehensive experiments on six benchmarks show that PrivGemo achieves overall state-of-the-art results, outperforming the strongest baseline by up to 17.1%. Furthermore, PrivGemo enables smaller models (e.g., Qwen3-4B) to achieve reasoning performance comparable to that of GPT-4-Turbo.
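The local/remote split described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class `LocalAnonymizer`, its methods, and the example triples are hypothetical, and it covers only pseudonymization and local grounding. It applies plain name masking, so it does not model the structural-exposure control, the hierarchical controller, or the experience memory that the abstract mentions.

```python
import itertools


class LocalAnonymizer:
    """Hypothetical local-side helper: maps raw KG names to opaque tokens.

    The mapping tables never leave the local machine; only the masked
    triples would be sent to a remote LLM.
    """

    def __init__(self):
        self._forward = {}   # (kind, raw name) -> opaque token
        self._backward = {}  # opaque token -> raw name
        self._counters = {"E": itertools.count(1), "R": itertools.count(1)}

    def _mask(self, name, kind):
        # Reuse the same token for a name seen before, so path structure
        # stays consistent across triples.
        key = (kind, name)
        if key not in self._forward:
            token = f"{kind}{next(self._counters[kind])}"
            self._forward[key] = token
            self._backward[token] = name
        return self._forward[key]

    def anonymize_path(self, triples):
        """Return the masked view of a path of (head, relation, tail) triples."""
        return [
            (self._mask(h, "E"), self._mask(r, "R"), self._mask(t, "E"))
            for h, r, t in triples
        ]

    def ground(self, token):
        """Map an opaque token back to its raw name (local side only)."""
        return self._backward[token]


# Illustrative private path; the entities and relations are made up.
anonymizer = LocalAnonymizer()
private_path = [
    ("Alice Chen", "works_at", "Acme Labs"),
    ("Acme Labs", "located_in", "Zurich"),
]
remote_view = anonymizer.anonymize_path(private_path)

# Stand-in for the remote model's answer: it can only name an opaque token.
remote_answer = remote_view[-1][2]
local_answer = anonymizer.ground(remote_answer)
```

In this sketch the remote side would see only tokens such as `E1` and `R1`, while the final answer is resolved against the local mapping. The shape of the path is still visible in the masked view, which is the kind of structural leakage the abstract says name masking alone does not address.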
