We present HAIFAI, a novel two-stage system in which humans and AI interact to tackle the challenging task of reconstructing a visual representation of a face that exists only in a person's mind. In the first stage, users iteratively rank the images presented by our reconstruction system according to how closely they resemble the mental image. From these rankings, the system extracts relevant image features, fuses them into a unified feature vector, and uses a generative model to produce an initial reconstruction of the mental image. The second stage leverages an existing face editing method that lets users manually refine this reconstruction through an easy-to-use slider interface for face shape manipulation. To avoid tedious human data collection for training the reconstruction system, we introduce a computational user model of human ranking behaviour. To build this model, we collected a small face ranking dataset from 275 participants in an online crowd-sourcing study. We evaluate HAIFAI and an ablated version in a 12-participant user study and demonstrate that our approach outperforms the previous state of the art in reconstruction quality, usability, perceived workload, and reconstruction speed. We further validate the reconstructions in a subsequent face ranking study with 18 participants and show that HAIFAI achieves a new state-of-the-art identification rate of 60.6%. These findings represent a significant advance towards interactive intelligent systems that can reliably and effortlessly reconstruct a user's mental image.
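The abstract does not specify how a ranking is turned into a unified feature vector. As a purely illustrative sketch of the first stage, the snippet below assumes a simple linear rank-weighted average of per-image feature vectors; this is a stand-in for, not a description of, the learned fusion that HAIFAI uses. The function names `rank_weights` and `fuse_features` and the weighting scheme are hypothetical.

```python
from typing import List, Sequence


def rank_weights(num_items: int) -> List[float]:
    """Assumed weighting: the top-ranked image gets the largest weight.

    Weights decrease linearly with rank position and sum to 1.
    """
    raw = [float(num_items - position) for position in range(num_items)]
    total = sum(raw)
    return [w / total for w in raw]


def fuse_features(
    features: Sequence[Sequence[float]], ranking: Sequence[int]
) -> List[float]:
    """Fuse per-image feature vectors into one vector using a user's ranking.

    features[i] is the feature vector of image i.
    ranking lists image indices from most to least similar to the mental image.
    """
    if sorted(ranking) != list(range(len(features))):
        raise ValueError("ranking must be a permutation of the image indices")
    weights = rank_weights(len(ranking))
    dim = len(features[0])
    fused = [0.0] * dim
    for weight, image_index in zip(weights, ranking):
        for d in range(dim):
            fused[d] += weight * features[image_index][d]
    return fused


# Three toy 2-D feature vectors; the user ranks image 2 first, then 0, then 1.
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_ranking = [2, 0, 1]
fused_vector = fuse_features(toy_features, toy_ranking)
```

In a full pipeline, a vector like `fused_vector` would be passed to a generative model to render a face, and the procedure would repeat over several ranking rounds; both steps are omitted here.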