In decentralized personal data ecosystems grounded in architectures such as Solid, users retain sovereignty over their data via personal online data stores (pods) hosted on Solid-compliant servers. In such environments, data remains under the control of pod owners, which complicates search: data is distributed across numerous pods and subject to user-specific access constraints. ESPRESSO is a decentralized framework for scalable keyword-based search across distributed Solid pods under user-defined visibility policies. It addresses key challenges of decentralized search by constructing WebID-scoped indexes within pods and employing privacy-aware metadata to enable efficient source selection and ranking across servers. This paper further introduces a formal threat model for ESPRESSO, analysing the security and privacy risks associated with the generation, aggregation, and use of indexes and metadata. These risks include unintended metadata leakage and the potential for adversaries to infer sensitive information about data that resides within personal data stores. The analysis identifies key design principles that limit metadata exposure and mitigate unauthorized inference. The proposed threat model provides a foundation for evaluating privacy-preserving decentralized search and informs the design of systems with stronger privacy guarantees.