Diagnostic errors remain a major cause of preventable mortality, particularly in resource-limited settings. Medical training simulators, including robopatients, help reduce such errors by replicating patient responses during procedures such as abdominal palpation. However, generating realistic multimodal feedback, especially auditory pain expressions, remains challenging because the relationship between applied palpation forces and perceived pain sounds is complex and nonlinear. The high dimensionality and perceptual variability of pain vocalizations further limit conventional modeling approaches. We propose a novel experimental paradigm for adaptive pain expressivity in robopatients that dynamically generates auditory pain responses to palpation forces using human-in-the-loop machine learning. Specifically, we employ Proximal Policy Optimization (PPO), a reinforcement learning algorithm suited to continuous control, to iteratively refine pain sound generation based on real-time human evaluative feedback. The system starts from randomized mappings between force inputs and sound outputs, and the learning agent progressively adjusts these mappings to align with human perceptual preferences. Results show that the framework adapts to individual palpation behaviors and subjective sound preferences while capturing a broad range of perceived pain intensities, from mild discomfort to acute distress. We also observe perceptual saturation at lower force ranges, with gender-specific thresholds in pain sound perception. This work demonstrates the feasibility of human-in-the-loop reinforcement learning for co-optimizing haptic input and auditory pain expression in medical simulators, and it highlights the potential of adaptive, immersive platforms to enhance palpation training and reduce diagnostic errors.
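The learning loop described in the abstract (a randomized initial force-to-sound mapping refined by PPO from evaluative feedback) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the pain sound is reduced to a single scalar intensity, the policy is a linear Gaussian map with parameters `theta`, the values of `CLIP_EPS` and `SIGMA` are arbitrary, and `simulated_rating` is a hypothetical stand-in for the real-time human rating used in the study.

```python
import math
import random

CLIP_EPS = 0.2  # PPO clipping range (assumed value)
SIGMA = 0.2     # fixed exploration noise of the Gaussian policy (assumed value)


def policy_mean(theta, force):
    """Mean sound intensity for a normalized force in [0, 1] (linear map)."""
    w, b = theta
    return w * force + b


def log_prob(theta, force, action):
    """Log-density of `action` under N(policy_mean(theta, force), SIGMA^2)."""
    z = (action - policy_mean(theta, force)) / SIGMA
    return -0.5 * z * z - math.log(SIGMA * math.sqrt(2.0 * math.pi))


def simulated_rating(force, action):
    """Hypothetical stand-in for a human rating: best when intensity tracks force."""
    return -abs(action - force)


def clipped_surrogate(ratio, advantage):
    """PPO clipped objective for a single sample."""
    clipped = max(1.0 - CLIP_EPS, min(1.0 + CLIP_EPS, ratio))
    return min(ratio * advantage, clipped * advantage)


def mapping_error(theta, steps=11):
    """Mean gap between the policy mean and the simulated rater's preferred intensity."""
    forces = [i / (steps - 1) for i in range(steps)]
    return sum(abs(policy_mean(theta, f) - f) for f in forces) / steps


def train(iterations=200, batch=32, epochs=4, lr=0.01, seed=0):
    """Return (initial_theta, trained_theta) after PPO-style updates."""
    rng = random.Random(seed)
    # Randomized initial mapping between force input and sound output.
    theta0 = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
    theta = list(theta0)
    for _ in range(iterations):
        # Collect one batch of (force, sound, rating, old log-prob) samples.
        samples = []
        for _ in range(batch):
            force = rng.random()
            action = rng.gauss(policy_mean(theta, force), SIGMA)
            samples.append((force, action, simulated_rating(force, action),
                            log_prob(theta, force, action)))
        baseline = sum(s[2] for s in samples) / batch  # batch-mean baseline
        for _ in range(epochs):
            grad_w = grad_b = 0.0
            for force, action, rating, old_lp in samples:
                adv = rating - baseline
                ratio = math.exp(log_prob(theta, force, action) - old_lp)
                # Gradient of clipped_surrogate is zero once the ratio has
                # left the clip range in the direction the advantage favors.
                if (adv > 0 and ratio > 1.0 + CLIP_EPS) or \
                   (adv < 0 and ratio < 1.0 - CLIP_EPS):
                    continue
                g_mu = (action - policy_mean(theta, force)) / SIGMA ** 2
                grad_w += adv * ratio * g_mu * force
                grad_b += adv * ratio * g_mu
            # Gradient ascent on the clipped surrogate objective.
            theta[0] += lr * grad_w / batch
            theta[1] += lr * grad_b / batch
    return theta0, theta


if __name__ == "__main__":
    start, end = train()
    print("mapping error before:", mapping_error(start))
    print("mapping error after: ", mapping_error(end))
```

In a human-in-the-loop setting, the call to `simulated_rating` would be replaced by a rating collected from the participant after each palpation, and the scalar intensity by whatever sound parameters the simulator exposes. The clipping step is what keeps each update close to the policy that produced the rated sounds, which matters when ratings are few and noisy.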