Large astronomical surveys generate massive image catalogs that require efficient and secure access, particularly during pre-publication periods, when data confidentiality and integrity are paramount. While the Findable, Accessible, Interoperable, and Reusable (FAIR) principles guide the eventual public dissemination of data, traditional security methods for restricted phases often lack granularity or incur prohibitive performance penalties. To address this, we present a framework that integrates a flexible policy engine for fine-grained access control with a novel GPU-accelerated implementation of the AES-GCM authenticated encryption scheme. The novelty of this work lies in adapting and optimizing a parallel tree-reduction strategy to overcome the main performance bottleneck of authenticated encryption on GPUs: the inherently sequential authentication hash (GHASH) of Galois/Counter Mode (GCM). We present both the algorithmic adaptation and its efficient execution on GPU architectures. Although similar parallelization techniques have been explored in cryptographic research, this is, to our knowledge, the first demonstration of their integration into a high-throughput encryption framework designed specifically for large-scale astronomical data. Our implementation transforms the sequential GHASH computation into a highly parallel reduction of logarithmic depth, achieving authenticated encryption throughput suitable for petabyte-scale image analysis. The framework gives data providers a robust mechanism for enforcing access policies, ensuring both confidentiality and integrity without hindering research workflows, and thereby supports a secure, managed transition of data to public FAIR archives.
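To make the tree-reduction idea concrete, the following is a minimal CPU reference sketch in Python; it is not the implementation described here, and the names `gf_mul`, `combine`, `ghash_sequential`, and `ghash_tree` are hypothetical. It assumes 128-bit blocks are held as big-endian integers and uses the bitwise GF(2^128) multiplication from the GCM specification. The sequential recurrence Y_i = (Y_{i-1} XOR X_i) · H is rewritten as an associative combination of (value, power-of-H) pairs, which is what allows the blocks to be reduced pairwise, level by level.

```python
# Illustrative sketch only: a CPU reference for tree-reduced GHASH,
# not the GPU kernel of the framework. Blocks are 128-bit integers
# (big-endian view of each 16-byte block).

R = 0xE1 << 120  # GCM reduction constant in the bit-reflected representation


def gf_mul(x, y):
    """Multiply two elements of GF(2^128) using the GCM bit ordering."""
    z, v = 0, y
    for i in range(128):
        if (x >> (127 - i)) & 1:        # i-th bit of x, leftmost first
            z ^= v
        v = (v >> 1) ^ R if v & 1 else v >> 1
    return z


def ghash_sequential(h, blocks):
    """Reference recurrence: Y_i = (Y_{i-1} XOR X_i) * H."""
    y = 0
    for x in blocks:
        y = gf_mul(y ^ x, h)
    return y


def combine(left, right):
    """Merge two adjacent segments, each summarized as (value, H^length)."""
    v_l, p_l = left
    v_r, p_r = right
    return (gf_mul(v_l, p_r) ^ v_r, gf_mul(p_l, p_r))


def ghash_tree(h, blocks):
    """GHASH via pairwise reduction; each level's combines are independent."""
    if not blocks:
        return 0
    level = [(gf_mul(x, h), h) for x in blocks]   # leaves: one block each
    while len(level) > 1:
        nxt = [combine(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:                        # odd node carries over
            nxt.append(level[-1])
        level = nxt
    return level[0][0]
```

Because `combine` is associative, all pairs within a level can be processed independently, so with one worker per pair the number of dependent steps grows with the logarithm of the block count rather than linearly. On a GPU each pair would map naturally to a thread, though the mapping, memory layout, and field-multiplication method used in practice are design choices this sketch does not attempt to reproduce.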