Cultural Analytics for Good: Building Inclusive Evaluation Frameworks for Historical IR

Suchana Datta, Dwaipayan Roy, Derek Greene, Gerardine Meaney, Karen Wade, Philipp Mayr

This work bridges the fields of information retrieval and cultural analytics to support equitable access to historical knowledge. Using the British Library BL19 digital collection (more than 35,000 works from 1700-1899), we construct a benchmark for studying changes in language, terminology, and retrieval in 19th-century fiction and non-fiction. Our approach combines expert-driven query design, paragraph-level relevance annotation, and Large Language Model (LLM) assistance to create a scalable evaluation framework grounded in human expertise. We focus on knowledge transfer from fiction to non-fiction, investigating how narrative understanding and semantic richness in fiction can improve retrieval for scholarly and factual materials. This interdisciplinary framework not only improves retrieval accuracy but also fosters interpretability, transparency, and cultural inclusivity in digital archives. Our work provides both practical evaluation resources and a methodological paradigm for developing retrieval systems that support richer, historically aware engagement with digital archives, ultimately working towards more emancipatory knowledge infrastructures.

