E-commerce product catalogs contain billions of items. Most products have lengthy titles because sellers pack them with product attributes to improve retrieval and to highlight key product aspects. This creates a gap between such unnatural product titles and the way customers refer to the products. It also limits how e-commerce stores can use these seller-provided titles for recommendation, QA, or review summarization. Inspired by recent work on instruction-tuned LLMs, we present InstructPTS, a controllable approach for the task of Product Title Summarization (PTS). Trained using a novel instruction fine-tuning strategy, our approach can summarize product titles according to various criteria (e.g., the number of words in a summary or the inclusion of specific phrases). Extensive evaluation on a real-world e-commerce catalog shows that, compared to simple fine-tuning of LLMs, our proposed approach generates more accurate product name summaries, with improvements of over 14 BLEU points and 8 ROUGE points.
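To make the notion of controllable summarization concrete, the following is a minimal sketch of how control criteria such as a word budget or required phrases might be turned into an instruction prompt, and how a generated summary could be checked against them. The names `SummaryConstraints`, `build_instruction`, and `satisfies`, the prompt wording, and the reading of the word-count criterion as an upper bound are all assumptions made for illustration; they are not the prompt templates or code used by InstructPTS.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class SummaryConstraints:
    """Hypothetical container for the control criteria named in the abstract."""
    max_words: Optional[int] = None          # assumed to be an upper bound
    required_phrases: Sequence[str] = ()     # phrases the summary must contain


def build_instruction(title: str, c: SummaryConstraints) -> str:
    """Compose an illustrative instruction prompt (wording is an assumption)."""
    parts = ["Summarize the product title"]
    if c.max_words is not None:
        parts.append(f"in at most {c.max_words} words")
    if c.required_phrases:
        quoted = ", ".join(f'"{p}"' for p in c.required_phrases)
        parts.append(f"including the phrases {quoted}")
    return " ".join(parts) + f".\nTitle: {title}\nSummary:"


def satisfies(summary: str, c: SummaryConstraints) -> bool:
    """Check whether a summary respects the word budget and required phrases."""
    if c.max_words is not None and len(summary.split()) > c.max_words:
        return False
    lowered = summary.lower()
    # Case-insensitive substring match for each required phrase.
    return all(p.lower() in lowered for p in c.required_phrases)
```

A prompt built this way could be paired with a reference summary to form one instruction fine-tuning example, while `satisfies` could serve as a simple post-hoc check that a model output follows the requested criteria.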