This paper presents a mapping strategy for interacting with the latent spaces of generative AI models. Our approach uses unsupervised feature learning to encode a human control space and maps it to the latent space of an audio synthesis model. To demonstrate how this mapping strategy can turn high-dimensional sensor data into control mechanisms for a deep generative model, we present a proof-of-concept system that uses visual sketches to control an audio synthesis model. We draw on emerging discourses in XAIxArts to discuss how this approach can contribute to XAI in artistic and creative contexts. We also discuss its current limitations and propose future research directions.