Query encoding (QE) has been proposed as a fast and robust solution to complex query answering (CQA). Most existing QE methods first parse the logical query into an executable computational directed acyclic graph (DAG), then parameterize the operators with neural networks, and finally execute these neuralized operators recursively. However, this parameterization-and-execution paradigm may be over-complicated, since it can be structurally simplified to a single neural network encoder. Meanwhile, sequence encoders such as LSTM and Transformer have proved effective for encoding semantic graphs in related tasks. Motivated by this, we propose sequential query encoding (SQE) as an alternative way to encode queries for CQA. Instead of parameterizing and executing the computational graph, SQE first uses a search-based algorithm to linearize the computational graph into a sequence of tokens and then uses a sequence encoder to compute its vector representation. This vector representation is then used as a query embedding to retrieve answers from the embedding space according to similarity scores. Despite its simplicity, SQE demonstrates state-of-the-art neural query encoding performance on FB15k, FB15k-237, and NELL on an extended benchmark that includes twenty-nine types of in-distribution queries. Further experiments show that SQE also demonstrates comparable knowledge inference capability on out-of-distribution queries, whose query types are not observed during training.
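The linearization step can be illustrated with a minimal sketch. The abstract says only that a search-based algorithm turns the computational graph into a token sequence; the sketch below assumes a depth-first, pre-order traversal with explicit bracket tokens, which is one plausible choice and not necessarily the exact scheme used by SQE. The `Node` class, the operator codes (`e`, `p`, `i`) and the labels (`r1`, `a`, ...) are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of a query computational graph (hypothetical structure)."""
    op: str                     # assumed codes: "e" anchor entity, "p" projection, "i" intersection
    label: str = ""             # entity or relation name, used by "e" and "p" nodes
    children: List["Node"] = field(default_factory=list)


def linearize(node: Node) -> List[str]:
    """Depth-first, pre-order linearization with explicit brackets.

    Assumption: a node reachable through several parents is simply
    emitted once per path, so the output is a tree-shaped sequence.
    """
    tokens = ["(", node.op]
    if node.label:
        tokens.append(node.label)
    for child in node.children:
        tokens.extend(linearize(child))
    tokens.append(")")
    return tokens


# Example query: entities reachable from "a" via r1 AND from "b" via r2.
query = Node("i", children=[
    Node("p", "r1", [Node("e", "a")]),
    Node("p", "r2", [Node("e", "b")]),
])

tokens = linearize(query)
print(" ".join(tokens))
# prints: ( i ( p r1 ( e a ) ) ( p r2 ( e b ) ) )
```

In a full pipeline, a token sequence like this would be mapped to token ids and passed to a sequence encoder (for example an LSTM or a Transformer), and the resulting vector would be scored against entity embeddings to rank candidate answers. Those steps are omitted here.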