Target similarity tuning (TST) is a method for selecting relevant examples in natural language (NL) to code generation with large language models (LLMs) in order to improve performance. Its goal is to adapt a sentence embedding model so that the similarity between two NL inputs matches the similarity between their associated code outputs. In this paper, we propose several methods for applying and improving TST in the real world. First, we replace the sentence transformer with embeddings from a larger model, which reduces sensitivity to the language distribution and thus provides more flexibility in the synthetic generation of examples. We then train a tiny model that transforms these embeddings into a space where embedding similarity matches code similarity; this allows the larger model to remain a black box and requires only a few matrix multiplications at inference time. Second, we show how to efficiently select a smaller number of training examples to train the TST model. Third, we introduce a ranking-based evaluation for TST that does not require end-to-end code generation experiments, which can be expensive to perform.
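To make the idea of a tiny transformation model concrete, the following is a minimal sketch rather than the method used in the paper. It assumes a single linear map `W` applied to frozen embeddings, a dot product as the similarity, and a squared-error loss between transformed-embedding similarity and code similarity. The toy embeddings `X`, the code-similarity targets `S`, and all function names are hypothetical and chosen only for illustration.

```python
# Sketch only: the linear map, dot-product similarity, squared-error loss,
# and all data below are assumptions, not details taken from the paper.

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def transform(W, x):
    # One matrix-vector product: the only work needed at inference time.
    return [dot(row, x) for row in W]

def loss_and_grad(W, X, S):
    # Mean squared error between transformed similarity and code similarity.
    n, rows, cols = len(X), len(W), len(W[0])
    U = [transform(W, x) for x in X]
    grad = [[0.0] * cols for _ in range(rows)]
    loss = 0.0
    for i in range(n):
        for j in range(n):
            err = dot(U[i], U[j]) - S[i][j]
            loss += err * err / (n * n)
            for r in range(rows):
                for c in range(cols):
                    grad[r][c] += 2 * err * (U[i][r] * X[j][c] + U[j][r] * X[i][c]) / (n * n)
    return loss, grad

def train(W, X, S, lr=0.01, steps=2000):
    # Plain gradient descent; the frozen embeddings X are never modified.
    for _ in range(steps):
        _, grad = loss_and_grad(W, X, S)
        W = [[w - lr * g for w, g in zip(w_row, g_row)] for w_row, g_row in zip(W, grad)]
    return W

def rank_candidates(query, candidates):
    # Return candidate ids sorted from most to least similar to the query.
    scores = {idx: dot(query, vec) for idx, vec in candidates.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical frozen embeddings: dims 0-1 encode surface wording,
# dims 2-3 encode the intent that determines the code output.
X = [
    [1.0, 0.0, 0.6, 0.0],  # wording A, intent P
    [1.0, 0.0, 0.0, 0.6],  # wording A, intent Q
    [0.0, 1.0, 0.6, 0.0],  # wording B, intent P
    [0.0, 1.0, 0.0, 0.6],  # wording B, intent Q
]
# Hypothetical code similarity: 1 when two inputs share an intent, else 0.
S = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]
W0 = [[0.1, 0.0, 0.1, 0.0], [0.0, 0.1, 0.0, 0.1]]  # small non-zero start

initial_loss, _ = loss_and_grad(W0, X, S)
W = train(W0, X, S)
final_loss, _ = loss_and_grad(W, X, S)

# Ranking of examples 1-3 for query 0, before and after the transformation.
raw_ranking = rank_candidates(X[0], {k: X[k] for k in (1, 2, 3)})
tuned_ranking = rank_candidates(
    transform(W, X[0]), {k: transform(W, X[k]) for k in (1, 2, 3)}
)
print(initial_loss, final_loss, raw_ranking, tuned_ranking)
```

In this toy setup the raw embeddings favor the example that shares surface wording with the query, whereas the code-similarity targets favor the example that shares its intent. Comparing `raw_ranking` with `tuned_ranking` against the ordering implied by `S` is one simple way to inspect a learned transformation by ranking alone, without running any code generation.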