We propose a general method for breaking down a complex main task into a set of easier intermediate sub-tasks, formulated in natural language as binary questions related to the final target task. Our method represents each example by a vector consisting of the answers to these questions. We call this representation Natural Language Learned Features (NLLF). NLLF is generated by a small transformer language model (e.g., BERT) that has been trained in a Natural Language Inference (NLI) fashion, using weak labels automatically obtained from a Large Language Model (LLM). We show that the LLM normally struggles with the main task when using in-context learning, but can handle these easier sub-tasks and produce useful weak labels to train a BERT model. The NLI-like training of the BERT model enables zero-shot inference on any binary question, not only those seen during training. We show that the NLLF vector not only improves performance by enhancing any classifier, but can also serve as input to an easy-to-interpret machine learning model such as a decision tree. This decision tree is interpretable yet achieves high performance, surpassing that of a pre-trained transformer in some cases. We have successfully applied this method to two completely different tasks: detecting incoherence in students' answers to open-ended mathematics exam questions, and screening abstracts for a systematic literature review of scientific papers on climate change and agroecology.
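As a rough illustration of the representation described above, the following minimal Python sketch builds a binary answer vector per example and fits an interpretable model on it. Everything in it is an assumption made for illustration: the two questions, the example texts and labels, and the function names are hypothetical; `toy_answerer` is a keyword heuristic standing in for the NLI-trained BERT model; and a one-level decision stump stands in for the decision tree.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical binary questions (assumption: the paper's real questions differ).
QUESTIONS = [
    "Does the answer contain a numerical result?",
    "Does the answer contradict itself?",
]


def toy_answerer(text: str, question: str) -> float:
    """Stand-in for the NLI-style BERT: returns a pseudo-probability of 'yes'."""
    lowered = text.lower()
    if "numerical" in question:
        return 1.0 if any(ch.isdigit() for ch in lowered) else 0.0
    if "contradict" in question:
        return 1.0 if " but " in f" {lowered} " else 0.0
    return 0.5  # unknown question: no evidence either way


def nllf_vector(
    text: str,
    questions: Sequence[str],
    answerer: Callable[[str, str], float],
    threshold: float = 0.5,
) -> List[int]:
    """Represent one example as the vector of binarised answers to each question."""
    return [int(answerer(text, q) >= threshold) for q in questions]


def fit_stump(X: Sequence[Sequence[int]], y: Sequence[int]) -> Tuple[int, int]:
    """Pick (feature index, label predicted when the feature is 1) with best training accuracy."""
    best, best_correct = (0, 1), -1
    for j in range(len(X[0])):
        for label_if_yes in (0, 1):
            correct = sum(
                (label_if_yes if row[j] == 1 else 1 - label_if_yes) == target
                for row, target in zip(X, y)
            )
            if correct > best_correct:
                best, best_correct = (j, label_if_yes), correct
    return best


def predict_stump(stump: Tuple[int, int], row: Sequence[int]) -> int:
    """Apply the learned one-question rule to a single NLLF vector."""
    j, label_if_yes = stump
    return label_if_yes if row[j] == 1 else 1 - label_if_yes


# Toy data (invented): label 1 = incoherent answer, 0 = coherent answer.
texts = [
    "The result is 12 because 3 times 4 is 12.",
    "It is larger but it is also smaller.",
    "x equals 5 but x equals 7.",
    "The sum is 9.",
]
labels = [0, 1, 1, 0]

X = [nllf_vector(t, QUESTIONS, toy_answerer) for t in texts]
stump = fit_stump(X, labels)
print("NLLF vectors:", X)
print("Rule: predict", stump[1], "when the answer is yes to:", QUESTIONS[stump[0]])
```

Because every feature is the answer to a natural-language question, the learned rule can be read directly as a statement about the input, which is the interpretability property the abstract attributes to the decision tree. In a real setting the answerer would be the fine-tuned model and the stump would be replaced by a full decision tree.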