The digital economy is powered by a continuous, massive exchange of personal data. Individuals provide data to platforms in return for services ranging from social networking and search to health monitoring, entertainment, and access to large language models (LLMs). This exchange has created immense value, but it has also established a fundamental asymmetry of power. Individuals have only coarse-grained control over data access, not fine-grained control over the purpose for which their data is used. This gap allows data to be repurposed for undisclosed uses, e.g., platforms selling it to data brokers, and results in a critical loss of personal data sovereignty. This paper reframes this socio-technical challenge as a dataflow management problem. We propose a bolt-on data escrow architecture based on delegated computation. In our model, instead of data flowing to platforms, platforms delegate their computation to a trustworthy escrow. This inversion gives individuals transparency into, and control over, their dataflows. We make four contributions: (1) a dataflow model that explicitly incorporates computational purpose as a first-class primitive; (2) a minimally invasive programming interface, run(access(), compute()), built on a unified relational interface that virtualizes on-device data sources and a computation offloading component; (3) a concrete implementation of our escrow within the Apple ecosystem, demonstrating its practicality; and (4) qualitative and quantitative evaluations showing that our solution is expressive enough to implement a wide range of dataflows from real-world applications while introducing minimal runtime overhead. In summary, our work is a stepping stone toward personal dataflow sovereignty.
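To make the run(access(), compute()) idea concrete, the following is a minimal sketch of how a purpose-aware escrow might behave. It is not the paper's implementation. The `Escrow` and `Access` classes, the `platform` and `purpose` parameters, the `heart_rate` table, and the in-memory SQLite store are all assumptions introduced here for illustration. The sketch shows a platform submitting a relational access description and a computation, the escrow checking the declared purpose against what the individual has approved, and only the computed result being returned.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import Any, Callable, List, Set, Tuple

Row = Tuple[Any, ...]


@dataclass(frozen=True)
class Access:
    """Declarative read over one virtualized relational source (hypothetical shape)."""
    table: str
    columns: Tuple[str, ...]


@dataclass
class Escrow:
    """Toy escrow: holds the individual's data and runs delegated computations."""
    db: sqlite3.Connection
    # (table, purpose) pairs the individual has approved (assumed policy format).
    allowed: Set[Tuple[str, str]]
    # Audit trail of (platform, table, purpose) for every dataflow that ran.
    log: List[Tuple[str, str, str]] = field(default_factory=list)

    def run(self, platform: str, purpose: str, access: Access,
            compute: Callable[[List[Row]], Any]) -> Any:
        # Purpose is checked together with the data source, not separately.
        if (access.table, purpose) not in self.allowed:
            raise PermissionError(
                f"{platform}: {access.table} not approved for {purpose!r}")
        # Identifiers are interpolated into SQL, so this toy accepts plain names only.
        names = (access.table, *access.columns)
        if not all(n.isidentifier() for n in names):
            raise ValueError("invalid table or column name")
        cols = ", ".join(access.columns)
        rows = self.db.execute(f"SELECT {cols} FROM {access.table}").fetchall()
        result = compute(rows)  # executed by the escrow, not by the platform
        self.log.append((platform, access.table, purpose))
        return result  # only the computed result goes back to the platform


def make_demo_escrow() -> Escrow:
    """Build an escrow with three heart-rate samples and one approved purpose."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE heart_rate (ts INTEGER, bpm INTEGER)")
    db.executemany("INSERT INTO heart_rate VALUES (?, ?)",
                   [(1, 58), (2, 60), (3, 62)])
    return Escrow(db=db, allowed={("heart_rate", "resting_hr_average")})


def average_bpm(rows: List[Row]) -> float:
    """Platform-supplied computation: mean of the single selected column."""
    return sum(r[0] for r in rows) / len(rows)


escrow = make_demo_escrow()
avg = escrow.run("fitness_app", "resting_hr_average",
                 Access("heart_rate", ("bpm",)), average_bpm)
print(avg)  # 60.0
```

Calling `escrow.run` with an unapproved purpose such as `"ad_targeting"` raises `PermissionError` and leaves the audit log unchanged. Note that this toy does not constrain what `compute` returns, so a computation that simply returned the raw rows would still leak them; a real escrow would need to address that separately.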