Recent advances in machine learning, particularly deep learning, have enabled autonomous systems to perceive and comprehend objects and their environments in a perceptual, subsymbolic manner. These systems can now perform object detection, sensor data fusion, and language understanding tasks. However, there is a growing need to enhance these systems so that they also understand objects and their environments conceptually and symbolically. To achieve this more powerful level of artificial intelligence, it is essential to consider both the explicit teaching provided by humans (e.g., describing a situation or explaining how to act) and the implicit teaching obtained by observing human behavior (e.g., through the system's sensors). The system must therefore be designed with multimodal input and output capabilities that support both implicit and explicit interaction models. In this position paper, we argue that both types of input, together with human-in-the-loop and incremental learning techniques, should be considered in order to advance the field of artificial intelligence and enable autonomous systems to learn like humans. We propose several hypotheses and design guidelines toward this goal and highlight a use case from related work.