Various data-sharing platforms have emerged in response to growing public demand for open data and legislation mandating that certain data remain open. Most of these platforms remain opaque, raising many questions about data accuracy, provenance and lineage, privacy implications, consent management, and the lack of fair incentives for data providers. With their transparency, immutability, non-repudiation, and decentralization properties, blockchains are well suited to answer these questions and enhance trust in a data-sharing platform. However, blockchains are not good at handling the four Vs of big data (i.e., volume, variety, velocity, and veracity) because of their limited performance and scalability and their high cost. Given that many related works propose blockchain-based trustworthy data-sharing solutions, there is increasing confusion and difficulty in understanding and selecting these technologies and platforms in terms of their sharing mechanisms, sharing services, quality of services, and applications. In this paper, we conduct a comprehensive survey of blockchain-based data-sharing architectures and applications to fill this gap. First, we present the foundations of blockchains and discuss the challenges of current data-sharing techniques. Second, we focus on the convergence of blockchain and data sharing to give a clear picture of this landscape, and we propose a reference architecture for blockchain-based data sharing. Third, we discuss the industrial applications of blockchain-based data sharing, ranging from healthcare and smart grid to transportation and decarbonization. For each application, we provide lessons learned for the deployment of blockchain-based data sharing. Finally, we discuss research challenges and open research directions.