Multi-Intent Detection in User Provided Annotations for Programming by Examples Systems

Data mapping remains a fundamental part of integration development for enterprise applications, but it is time-consuming. A growing number of applications lack naming standards, and nested field structures add further complexity for integration developers. Once the mapping is done, data transformation is the next challenge, since each application expects data in a particular format. While building an integration flow, developers must also understand the formats of the source and target data fields and devise a transformation program that converts data from the source format to the target format. Automatically generating such a transformation program from a specification, known as program synthesis, has been studied since the early days of Artificial Intelligence (AI). Programming by Example (PBE) is one such technique: it automatically infers a computer program that accomplishes a format or string conversion task from user-provided input and output samples. Learning the correct intent requires a diverse set of samples from the user. If the user fails to provide sufficiently diverse samples, the input and output samples can admit multiple intents, that is, they become ambiguous, and the PBE system may fail to generate the program the user intended. In this paper, we propose a deep neural network based ambiguity prediction model that analyzes the input-output strings and maps them to a set of properties responsible for multiple intents. Users can inspect these properties and provide new samples or modify existing ones accordingly, which helps build a better PBE system for mapping enterprise applications.
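
To make the notion of multiple intents concrete, the following minimal sketch shows how a single input-output sample can be consistent with several candidate programs, and how a more diverse second sample removes the ambiguity. It is an illustration only and not the model proposed in this paper: it enumerates a tiny, hypothetical set of string programs by brute force rather than using a neural network, and the names `CANDIDATES`, `consistent_programs`, and `is_ambiguous` are assumptions introduced for this example.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]  # (input string, expected output string)

# Hypothetical candidate programs, standing in for the much larger
# program space that a real PBE system would search.
CANDIDATES: Dict[str, Callable[[str], str]] = {
    "first_character": lambda s: s[:1],
    "first_uppercase_letter": lambda s: next((c for c in s if c.isupper()), ""),
    "initial_of_first_word": lambda s: s.split()[0][:1] if s.split() else "",
    "initial_of_last_word": lambda s: s.split()[-1][:1] if s.split() else "",
}


def consistent_programs(examples: List[Example]) -> List[str]:
    """Return the names of all candidates that reproduce every example."""
    return [
        name
        for name, program in CANDIDATES.items()
        if all(program(inp) == out for inp, out in examples)
    ]


def is_ambiguous(examples: List[Example]) -> bool:
    """Treat the samples as ambiguous when more than one candidate fits."""
    return len(consistent_programs(examples)) > 1


if __name__ == "__main__":
    one_sample = [("John Smith", "J")]
    two_samples = one_sample + [("mary Jones", "J")]

    print(consistent_programs(one_sample), is_ambiguous(one_sample))
    print(consistent_programs(two_samples), is_ambiguous(two_samples))
```

With only the sample `("John Smith", "J")`, three of the four candidates agree with the data, so the intent is underdetermined. Adding `("mary Jones", "J")` leaves only `first_uppercase_letter`, which is the kind of corrective sample a user might supply once the source of ambiguity has been pointed out.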
