We evaluate an analytical workload in a confidential computing environment, combining DuckDB with two technologies: modular columnar encryption in Parquet files, which protects data at rest, and the newest version of the Intel SGX Trusted Execution Environment (TEE), which provides a hardware enclave where data in flight can be (more) securely decrypted and processed. Our first finding is that the "performance tax" of such confidential analytical processing is acceptable relative to running without these technologies: we ultimately run TPC-H SF30 with under 2x overhead compared to non-encrypted, non-enclave execution, and we show that columnar compression and encryption in particular are a good combination. Our second finding is a set of dos and don'ts for tuning DuckDB to work effectively in this environment. There are several performance hazards: potentially 5x higher cache miss costs due to memory encryption inside the enclave, NUMA penalties, and a highly elevated cost of swapping pages in and out of the enclave, which can also be triggered indirectly by using a non-SGX-aware malloc library.
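To give a rough sense of how a 5x cache miss cost relates to an end-to-end overhead below 2x, the following is a minimal back-of-the-envelope sketch. It is not a model taken from this work: it assumes that only the share of baseline runtime spent waiting on cache misses is slowed down, and it ignores decryption CPU cost, NUMA effects, and enclave paging. The names `overall_slowdown`, `max_miss_fraction`, and `miss_fraction` are hypothetical, introduced only for illustration.

```python
def overall_slowdown(miss_fraction: float, miss_penalty: float = 5.0) -> float:
    """Amdahl-style estimate of end-to-end slowdown.

    miss_fraction: assumed share of baseline runtime spent on cache misses.
    miss_penalty:  assumed factor by which each miss gets slower in the
                   enclave (5.0 mirrors the "potentially 5x" figure).
    """
    if not 0.0 <= miss_fraction <= 1.0:
        raise ValueError("miss_fraction must be between 0 and 1")
    # The unaffected share keeps its cost; the miss share is scaled up.
    return (1.0 - miss_fraction) + miss_fraction * miss_penalty


def max_miss_fraction(target_slowdown: float, miss_penalty: float = 5.0) -> float:
    """Largest miss share that keeps the estimate at or below target_slowdown."""
    if miss_penalty <= 1.0:
        raise ValueError("miss_penalty must be greater than 1")
    return (target_slowdown - 1.0) / (miss_penalty - 1.0)


if __name__ == "__main__":
    # Hypothetical miss shares, chosen only to show the trend.
    for share in (0.05, 0.10, 0.25, 0.50):
        print(f"miss share {share:.0%}: ~{overall_slowdown(share):.2f}x")
    print(f"miss share budget for 2x: {max_miss_fraction(2.0):.0%}")
```

Under these assumptions, the estimate stays within 2x only while cache misses account for no more than a quarter of baseline runtime, which is one way to read why cache-conscious tuning matters inside the enclave.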