Identifying which Wikipedia articles are related to science fiction, fantasy, or their hybrids is challenging because genre boundaries are porous and frequently overlap. Wikipedia nonetheless offers machine-readable structure beyond text, including categories, internal links (wikilinks), and statements in the corresponding Wikidata items. However, each of these signals reflects community conventions and can be biased or incomplete. This study examines structural and semantic features of Wikipedia articles that can be used to identify content related to science fiction and fantasy (SF/F).