Over the past decade, image search for e-commerce applications has advanced significantly. Traditional image-to-image retrieval models, which focus solely on image details such as texture, tend to overlook useful semantic information contained in the images. As a result, the retrieved products may share similar image details yet fail to fulfil the user's search goals. Moreover, applying image-to-image retrieval models to products with multiple images incurs significant online storage overhead for product features and requires complex mapping implementations. In this paper, we report the design and deployment of the Multi-modal Item Embedding Model (MIEM), which addresses these limitations. MIEM uses both the textual information and the multiple images of a product to construct meaningful product features. By leveraging semantic information from images, MIEM effectively supplements the image search process and improves the overall accuracy of retrieval results. MIEM has become an integral part of the Shopee image search platform. Since its deployment in March 2023, it has achieved a 9.90% increase in clicks per user and a 4.23% increase in orders per user for the image search feature on the Shopee e-commerce platform.