Generating realistic and user-preferred advertisements is a key challenge in e-commerce. Existing approaches use multiple independent models, driven by click-through rate (CTR), to controllably create attractive image or text advertisements. However, these pipelines lack cross-modal perception and rely on CTR, which reflects only average preferences. We therefore explore jointly generating personalized image-text advertisements from historical click behaviors. We first design a Unified Advertisement Generative model (Uni-AdGen) that employs a single autoregressive framework to produce both advertising images and texts. By incorporating a foreground perception module and instruction tuning, Uni-AdGen enhances the realism of the generated content. To further personalize advertisements, we equip Uni-AdGen with a coarse-to-fine preference understanding module that captures user interests from noisy multimodal historical behaviors and uses them to drive personalized generation. Additionally, we construct the first large-scale Personalized Advertising image-text dataset (PAd1M) and introduce a Product Background Similarity (PBS) metric to facilitate training and evaluation. Extensive experiments show that our method outperforms baselines in both general and personalized advertisement generation. Our project is available at https://github.com/JD-GenX/Uni-AdGen.