Environmental sounds like footsteps, keyboard typing, or dog barking carry rich information and emotional context, making them valuable for designing haptics in user applications. Existing audio-to-vibration methods, however, rely on signal-processing rules tuned for music or games and often fail to generalize across diverse sounds. To address this, we first investigated user perception of four existing audio-to-haptic algorithms, then created a data-driven model for environmental sounds. In Study 1, 34 participants rated vibrations generated by the four algorithms for 1,000 sounds, revealing no consistent algorithm preferences. Using this dataset, we trained Sound2Hap, a CNN-based autoencoder, to generate perceptually meaningful vibrations from diverse sounds with low latency. In Study 2, 15 participants rated its output higher than signal-processing baselines on both audio-vibration match and Haptic Experience Index (HXI), finding it more harmonious with diverse sounds. This work demonstrates a perceptually validated approach to audio-haptic translation, broadening the reach of sound-driven haptics.
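The "signal-processing rules" the abstract refers to can be illustrated with a minimal amplitude-envelope baseline. This is a hypothetical sketch of the general technique, not one of the four algorithms actually studied: the audio's amplitude envelope modulates a fixed carrier near a typical actuator resonance (the `carrier_hz`, `win_ms`, and sample-rate values are illustrative assumptions).

```python
import numpy as np

def envelope_to_vibration(audio, sr=16000, carrier_hz=250.0, win_ms=20.0):
    """Illustrative envelope-based audio-to-vibration baseline (assumed
    parameters, not the paper's exact algorithms): extract the amplitude
    envelope and amplitude-modulate a sine carrier near a typical LRA
    resonant frequency (~250 Hz)."""
    # Rectify, then smooth with a moving average to estimate the envelope.
    win = max(1, int(sr * win_ms / 1000.0))
    kernel = np.ones(win) / win
    env = np.convolve(np.abs(audio), kernel, mode="same")
    env /= env.max() + 1e-9  # normalize envelope to [0, 1]
    # Modulate a fixed-frequency carrier with the envelope.
    t = np.arange(len(audio)) / sr
    return env * np.sin(2 * np.pi * carrier_hz * t)

# Example: a synthetic decaying-noise burst standing in for a footstep.
sr = 16000
t = np.arange(sr) / sr
audio = np.exp(-8.0 * t) * np.random.randn(sr)
vib = envelope_to_vibration(audio, sr)
```

Rules like this are cheap and low-latency but hand-tuned, which is why they can fail to generalize across diverse environmental sounds, motivating the data-driven model described above.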