This paper presents two methodological advancements in decentralized multi-task learning under privacy constraints, aiming to pave the way for future developments in next-generation blockchain platforms. First, we extend the collaborative dictionary learning (CollabDict) framework, previously limited to Gaussian mixture models, by incorporating deep variational autoencoders (VAEs), with a particular focus on anomaly detection. We demonstrate that the VAE-based anomaly score function shares the same mathematical structure as that of the non-deep model, and we provide a comprehensive qualitative comparison. Second, considering the widespread use of "pre-trained models," we provide a mathematical analysis of data privacy leakage when models trained with CollabDict are shared externally. We show that the CollabDict approach, when applied to Gaussian mixtures, satisfies a Rényi differential privacy criterion. Additionally, we propose a practical metric for monitoring internal privacy breaches during the learning process.