Organizations that handle sensitive documents face a tension: cloud-based AI risks GDPR violations, while local systems typically require 18-32 GB of RAM. This paper presents CUBO, a systems-oriented retrieval-augmented generation (RAG) platform for consumer laptops with 16 GB of shared memory. CUBO's novelty lies in the engineering integration of streaming ingestion (O(1) buffer overhead), tiered hybrid retrieval, and hardware-aware orchestration, which together achieve competitive Recall@10 (0.48-0.97 across BEIR domains) within a hard 15.5 GB RAM ceiling. The 37,000-line codebase achieves a median (p50) retrieval latency of 185 ms on €1,300 laptops while maintaining data minimization through local-only processing, in line with GDPR Art. 5(1)(c). Evaluation on BEIR benchmarks supports practical deployability for small-to-medium professional archives. The codebase is publicly available at https://github.com/PaoloAstrino/CUBO.
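To illustrate what an O(1) buffer overhead means for streaming ingestion, here is a minimal sketch. It is not taken from the CUBO codebase: the function name `stream_batches` and the `batch_size` parameter are hypothetical. The idea is to hold only a bounded batch in memory at any time, regardless of corpus size.

```python
from typing import Iterable, Iterator, List


def stream_batches(lines: Iterable[str], batch_size: int = 4) -> Iterator[List[str]]:
    """Yield batches of non-empty lines, buffering at most batch_size items.

    Hypothetical helper for illustration only; not part of CUBO's API.
    """
    buffer: List[str] = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        buffer.append(text)
        if len(buffer) == batch_size:
            yield buffer
            buffer = []  # start a fresh batch so the old one can be freed
    if buffer:
        yield buffer  # flush the final partial batch


if __name__ == "__main__":
    corpus = ["doc one", "", "doc two", "doc three"]
    for batch in stream_batches(corpus, batch_size=2):
        print(batch)
```

Because `lines` can be any iterable, including an open file handle, the memory held by the buffer depends on `batch_size` and not on the total number of documents.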