Telegram has become one of the leading platforms for spreading misinformation. However, many existing pipelines still classify a message's credibility from the reputation of the domains it links to or from its lexical features. Such methods work well on traditional long-form news articles published by well-known sources, but high-risk posts on Telegram are short and URL-sparse, which causes link-based and standard TF-IDF models to fail. To address this, we propose TAG2CRED, a pipeline designed for such short, convoluted messages. Our model scores each post directly from the tags assigned to its text. We design a concise label system that covers four dimensions: theme, claim type, call to action, and evidence. A fine-tuned large language model (LLM) assigns tags to each message, and an L2-regularized logistic regression then maps these tags to calibrated risk scores in the [0,1] interval. We evaluate on 87,936 Telegram messages associated with Media Bias/Fact Check (MBFC), using URL masking and domain-disjoint splits. TAG2CRED reaches a ROC-AUC of 0.871, a macro-F1 of 0.787, and a Brier score of 0.167, outperforming the TF-IDF baseline (macro-F1 0.737, Brier score 0.248) while using far fewer features and generalizing better to infrequent domains. A stacked ensemble (TF-IDF + TAG2CRED + SBERT) improves further over the SBERT baseline, reaching a ROC-AUC of 0.901 and a macro-F1 of 0.813 (Brier score 0.114). This indicates that style labels and lexical features may capture different but complementary dimensions of information risk.
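The tag-to-score step and the domain-disjoint evaluation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the tag names, the toy records, the hyperparameters, and the helper functions (`multi_hot`, `fit_logreg_l2`, `domain_disjoint_split`) are all assumptions made for illustration, and the LLM tagging stage is replaced by hand-written tags. The sketch encodes each message's tags as a multi-hot vector, fits an L2-regularized logistic regression by plain gradient descent, and reports the Brier score on messages from held-out domains.

```python
import math
import random

# Hypothetical tag vocabulary; the actual label set is not given in the abstract.
TAGS = ["theme:health", "theme:politics", "claim:conspiracy",
        "claim:factual", "cta:share", "evidence:none", "evidence:cited"]
TAG_INDEX = {t: i for i, t in enumerate(TAGS)}


def multi_hot(tags):
    """Encode a list of tags as a 0/1 feature vector over TAGS."""
    vec = [0.0] * len(TAGS)
    for t in tags:
        vec[TAG_INDEX[t]] = 1.0
    return vec


def predict(weights, bias, x):
    """Logistic model: risk score in (0, 1)."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logreg_l2(X, y, l2=0.1, lr=0.5, epochs=2000):
    """Batch gradient descent on mean log-loss + (l2 / 2) * ||w||^2.

    The bias term is left unpenalized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, yi in zip(X, y):
            err = predict(w, b, x) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def brier_score(probs, labels):
    """Mean squared difference between predicted probability and 0/1 label."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


def domain_disjoint_split(records, test_fraction=0.33, seed=0):
    """Hold out whole domains so that no domain appears in both splits."""
    domains = sorted({r["domain"] for r in records})
    random.Random(seed).shuffle(domains)
    n_test = max(1, round(len(domains) * test_fraction))
    test_domains = set(domains[:n_test])
    train = [r for r in records if r["domain"] not in test_domains]
    test = [r for r in records if r["domain"] in test_domains]
    return train, test


# Toy records (invented): label 1 = high-risk source, 0 = credible source.
RISKY = ["theme:health", "claim:conspiracy", "cta:share", "evidence:none"]
SAFE = ["theme:politics", "claim:factual", "evidence:cited"]
records = []
for k in range(3):
    records.append({"domain": f"risky{k}.example", "tags": RISKY[k:], "label": 1})
    records.append({"domain": f"risky{k}.example", "tags": RISKY[:3], "label": 1})
    records.append({"domain": f"safe{k}.example", "tags": SAFE[:2 + k % 2], "label": 0})
    records.append({"domain": f"safe{k}.example", "tags": SAFE, "label": 0})

train, test = domain_disjoint_split(records)
w, b = fit_logreg_l2([multi_hot(r["tags"]) for r in train],
                     [r["label"] for r in train])
scores = [predict(w, b, multi_hot(r["tags"])) for r in test]
test_brier = brier_score(scores, [r["label"] for r in test])
print(f"held-out domains: {sorted({r['domain'] for r in test})}")
print(f"Brier score on held-out domains: {test_brier:.3f}")
```

Because the feature vector has one entry per tag, the fitted weights can be read directly as per-tag contributions to the risk score, which is one plausible reading of why a tag-based model needs far fewer features than a TF-IDF vocabulary. In practice a library implementation of regularized logistic regression would replace the hand-written loop.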