TAAC: A gate into Trustable Audio Affective Computing

With the emergence of AI techniques for depression diagnosis, the conflict between high demand and limited supply for depression screening has been significantly alleviated. Among the various data modalities, audio-based depression diagnosis has received increasing attention from both academia and industry, since audio is the most common carrier of emotion. Unfortunately, audio data also contains user-sensitive identity information (ID), which is highly vulnerable and may be maliciously exploited during the automated diagnosis process. In previous methods, separating depression features from sensitive features has always been a barrier. Solving the problem also requires a safe encryption methodology that encrypts only the sensitive features, together with a powerful classifier that can correctly diagnose depression. To tackle these challenges, we leverage adversarial loss-based subspace decomposition and propose TAAC, the first practical framework for Trustable Audio Affective Computing, which performs automated depression detection from audio within a trustable environment. The key enablers of TAAC are the Differentiating Features Subspace Decompositor (DFSD), the Flexible Noise Encryptor (FNE), and the Staged Training Paradigm, used for decomposition, ID encryption, and performance enhancement, respectively. Extensive comparisons against existing encryption methods demonstrate our framework's superior performance in depression detection, ID preservation, and audio reconstruction. Meanwhile, experiments across various settings demonstrate our model's stability under different encryption strengths, supporting our framework's strengths in Confidentiality, Accuracy, Traceability, and Adjustability.
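
To make the division of labour between decomposition and encryption concrete, the following is a minimal sketch of the general idea only, not the paper's DFSD or FNE. It assumes, purely for illustration, that a feature vector has already been split into a depression part and an identity part (here by a fixed index split standing in for a learned decomposition), and that encryption means adding key-seeded Gaussian noise of adjustable strength to the identity part alone. The names `split_features`, `encrypt_id`, and `decrypt_id`, and the use of a seeded random generator as the key, are all hypothetical.

```python
import random


def split_features(features, id_dim):
    """Hypothetical stand-in for a learned decomposition.

    Treats the last `id_dim` entries as the identity subspace and the
    rest as the depression subspace. Assumes 0 < id_dim < len(features).
    """
    if not 0 < id_dim < len(features):
        raise ValueError("id_dim must be between 1 and len(features) - 1")
    return features[:-id_dim], features[-id_dim:]


def _keyed_noise(key, n, strength):
    # The same key always regenerates the same noise sequence, which is
    # what makes the perturbation reversible for the key holder.
    rng = random.Random(key)
    return [strength * rng.gauss(0.0, 1.0) for _ in range(n)]


def encrypt_id(id_feats, key, strength):
    """Add key-seeded noise to the identity features only."""
    noise = _keyed_noise(key, len(id_feats), strength)
    return [x + e for x, e in zip(id_feats, noise)]


def decrypt_id(enc_feats, key, strength):
    """Subtract the same key-seeded noise to recover the identity features."""
    noise = _keyed_noise(key, len(enc_feats), strength)
    return [x - e for x, e in zip(enc_feats, noise)]


# Toy feature vector: four "depression" entries, two "identity" entries.
features = [0.2, -1.3, 0.7, 0.05, 1.1, -0.4]
dep, ident = split_features(features, id_dim=2)

enc = encrypt_id(ident, key=1234, strength=5.0)  # strength is adjustable
rec = decrypt_id(enc, key=1234, strength=5.0)    # key holder can trace back
```

In this toy setting the depression part `dep` is never touched, so a classifier operating on it is unaffected by the encryption strength, while the identity part can be recovered only with the key. A learned decomposition and encryptor would of course differ substantially from this fixed split and additive noise.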
