Current methods for generating attractive headlines often learn directly from data, defining attractiveness by the number of user clicks and views. Although clicks and views do reflect user interest, they can fail to reveal how much of that interest is raised by the writing style and how much is due to the event or topic itself. Such approaches can also lead to harmful inventions that exaggerate the content, aggravating the spread of false information. In this work, we propose HonestBait, a novel framework that addresses these issues from another angle: generating headlines using forward references (FRs), a writing technique often used for clickbait. A self-verification process is included during training to avoid spurious inventions. We begin with a preliminary user study to understand how FRs affect user interest, after which we present PANCO, an innovative dataset containing pairs of fake news with verified news for attractive but faithful news headline generation. Automatic metrics and human evaluations show that our framework yields more attractive results (+11.25% compared to human-written verified news headlines) while maintaining high veracity, which helps promote real information to fight against fake news.