Given the serious adverse impacts of fake news, it is often assumed that only people with malicious intent propagate it. Social science studies, however, suggest that this is not necessarily true. Distinguishing the types of fake news spreaders by their intent is critical because it guides which intervention approach is appropriate for mitigating the spread of fake news. To this end, we propose an intent classification framework that identifies the intent behind spreading fake news. We leverage deep reinforcement learning (DRL) to optimize the structural representation of each tweet: an actor appended to a long short-term memory (LSTM) intent classifier removes noisy words from the input sequence. A policy gradient DRL model (e.g., REINFORCE) leads the actor toward a higher delayed reward. We also devise a new uncertainty-aware immediate reward based on a subjective opinion, which can explicitly deal with multidimensional uncertainty for effective decision-making. Using 600K training episodes on a dataset of fake news tweets annotated with intent classes, we evaluate the performance of the uncertainty-aware reward in DRL. Evaluation results demonstrate that our proposed framework efficiently reduces the number of selected words while maintaining a high multi-class accuracy of 95%.
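To make the two ingredients named above more concrete, the following is a minimal sketch, not the authors' implementation. It assumes (i) a multinomial subjective opinion derived from per-class evidence in the usual subjective-logic way, with belief masses and an uncertainty mass that sum to one, and (ii) a Bernoulli keep/drop policy per word updated with a plain REINFORCE step. The function names, the reward formula (belief in the true class minus uncertainty), and the learning rate are hypothetical choices for illustration only; the abstract does not specify them.

```python
import math
import random


def opinion_from_evidence(evidence, prior_weight=None):
    """Map non-negative per-class evidence to a multinomial subjective opinion.

    Returns (beliefs, uncertainty) such that sum(beliefs) + uncertainty == 1.
    Assumption: the prior weight defaults to the number of classes.
    """
    w = float(len(evidence)) if prior_weight is None else float(prior_weight)
    total = sum(evidence) + w
    beliefs = [e / total for e in evidence]
    uncertainty = w / total
    return beliefs, uncertainty


def immediate_reward(evidence, true_class):
    """Hypothetical uncertainty-aware reward: belief in the true class
    minus the uncertainty mass (an assumption, not the paper's formula)."""
    beliefs, uncertainty = opinion_from_evidence(evidence)
    return beliefs[true_class] - uncertainty


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_actions(logits, rng):
    """Sample keep (1) or drop (0) for each word from a Bernoulli policy."""
    probs = [sigmoid(z) for z in logits]
    actions = [1 if rng.random() < p else 0 for p in probs]
    return actions, probs


def reinforce_step(logits, actions, probs, ret, lr=0.1):
    """One REINFORCE update on the per-word logits.

    For a Bernoulli policy with p = sigmoid(z), the gradient of
    log pi(a) with respect to z is (a - p).
    """
    return [z + lr * ret * (a - p) for z, a, p in zip(logits, actions, probs)]


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [0.0, 0.0, 0.0, 0.0]      # one logit per word in a toy tweet
    actions, probs = sample_actions(logits, rng)
    # Toy evidence for 3 intent classes; in practice it would come from
    # the classifier evaluated on the words the actor kept.
    reward = immediate_reward([8.0, 1.0, 1.0], true_class=0)
    logits = reinforce_step(logits, actions, probs, reward)
    print(actions, reward, logits)
```

In this sketch the reward grows as evidence concentrates on the annotated intent class and shrinks when little evidence is available, which is one simple way an uncertainty term could enter the actor's learning signal.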