Semantic parsing of user-generated instructional text, as a way of enabling end-users to program the Internet of Things (IoT), is an underexplored area. In this study, we provide a unique annotated corpus that supports the transformation of cooking recipe instructions into machine-understandable commands for IoT devices in the kitchen. Each command is a tuple capturing the semantics of an instruction involving a kitchen device in terms of "What", "Where", "Why" and "How". Based on this corpus, we developed machine learning-based sequence labelling methods, namely conditional random fields (CRF) and a neural network model, to parse recipe instructions and extract the tuples of interest. Our results show that while it is feasible to train semantic parsers on our annotations, most natural-language instructions are incomplete, and thus transforming them into a formal meaning representation is not straightforward.
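To make the tuple representation concrete, the following is a minimal sketch of how the output of a sequence labeller could be collected into a "What"/"Where"/"Why"/"How" command. It is not the annotation scheme or parser of this study: the BIO tag names (`B-WHAT`, `I-HOW`, and so on), the `Command` class, the `decode_bio` helper, and the example instruction with its labels are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical slot names mirroring the four tuple fields in the abstract.
SLOTS = ("what", "where", "why", "how")


@dataclass
class Command:
    """Illustrative command tuple; a slot stays None when the text omits it."""
    what: Optional[str] = None
    where: Optional[str] = None
    why: Optional[str] = None
    how: Optional[str] = None


def decode_bio(tokens: List[str], tags: List[str]) -> Command:
    """Collect BIO-tagged spans into a Command (first span per slot wins)."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    found = {}
    slot, words = None, []
    # A trailing sentinel "O" tag flushes a span that ends the sentence.
    for token, tag in list(zip(tokens, tags)) + [("", "O")]:
        continues = tag.startswith("I-") and tag[2:].lower() == slot
        if not continues:
            if slot is not None and slot not in found:
                found[slot] = " ".join(words)
            slot, words = None, []
            if tag.startswith("B-") and tag[2:].lower() in SLOTS:
                slot = tag[2:].lower()
        if slot is not None:
            words.append(token)
    return Command(**found)


# Assumed example instruction with hand-written (not model-predicted) tags.
tokens = "Preheat the oven to 350 degrees".split()
tags = ["B-WHAT", "O", "B-WHERE", "O", "B-HOW", "I-HOW"]
command = decode_bio(tokens, tags)
print(command)
```

In this assumed example the `why` slot remains empty, which illustrates the kind of incompleteness in natural-language instructions that makes the mapping to a formal meaning representation difficult.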