Given a product's long textual description and its image, Multi-modal Product Summarization (MPS) aims to increase customers' desire to purchase by highlighting product characteristics in a short textual summary. Existing MPS methods produce promising results, but they still lack 1) end-to-end product summarization, 2) multi-grained multi-modal modeling, and 3) multi-modal attribute modeling. To improve MPS, we propose an end-to-end multi-grained multi-modal attribute-aware product summarization method (MMAPS) for generating high-quality product summaries in e-commerce. MMAPS jointly models product attributes and generates product summaries. We design several multi-grained multi-modal tasks to better guide the multi-modal learning of MMAPS. Furthermore, we model product attributes based on both the text and image modalities so that multi-modal product characteristics are reflected in the generated summaries. Extensive experiments on a real-world, large-scale Chinese e-commerce dataset demonstrate that our model outperforms state-of-the-art product summarization methods on several summarization metrics. Our code is publicly available at: https://github.com/KDEGroup/MMAPS.