Posts, as important containers of user-generated content on social media, carry tremendous social influence and commercial value. As an integral component of a post, the headline contributes decisively to the post's popularity. However, the current mainstream method for headline generation is still manual writing, which yields inconsistent results and requires extensive human effort. This drives us to explore a novel research question: Can we automate the generation of popular headlines on social media? We collect more than 1 million posts by 42,447 celebrities from public data on Xiaohongshu, a well-known social media platform in China. We then carefully examine the headlines of these posts. Our observations show that trends and personal styles are widespread in social media headlines and contribute significantly to posts' popularity. Motivated by these insights, we present MEBART, which combines Multiple preference-Extractors with Bidirectional and Auto-Regressive Transformers (BART), capturing trends and personal styles to generate popular headlines on social media. We perform extensive experiments on real-world datasets and achieve state-of-the-art performance compared with several advanced baselines. In addition, ablation and case studies demonstrate that MEBART excels at capturing trends and personal styles.