Digital libraries in the scientific domain give users access to a wide range of information to satisfy their diverse information needs. Here, the ranking of results plays a crucial role in user satisfaction. Exploiting bibliometric metadata, e.g., publications' citation counts or bibliometric indicators in general, to automatically identify the most relevant results can boost retrieval performance. This work proposes bibliometric data fusion, which enriches existing systems' results by incorporating bibliometric metadata such as citations or altmetrics. Our results on three biomedical retrieval benchmarks from TREC Precision Medicine (TREC-PM) show that bibliometric data fusion is a promising approach to improving retrieval performance in terms of normalized Discounted Cumulated Gain (nDCG) and Average Precision (AP), at the cost of lower Precision at 10 (P@10). Patient users in particular benefit from this lightweight, data-sparse technique, which applies to any digital library.
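To make the idea of fusing a system's results with bibliometric metadata concrete, the following is a minimal sketch, not the method evaluated in this work. The abstract does not name a fusion method, so the sketch assumes reciprocal rank fusion (RRF) as a stand-in: the system ranking is fused with a second ranking of the same documents ordered by citation count. The function names, the document ids, the citation counts, and the constant `k=60` are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse rankings (lists of doc ids, best first) via reciprocal rank fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def bibliometric_fusion(system_ranking, citation_counts, k=60):
    """Fuse a system ranking with a citation-based ranking of the same results.

    Assumption: documents without a citation count are treated as uncited.
    """
    by_citations = sorted(
        system_ranking,
        key=lambda d: -citation_counts.get(d, 0),
    )
    return reciprocal_rank_fusion([system_ranking, by_citations], k=k)


# Hypothetical example data: a four-document result list and citation counts.
system_ranking = ["d1", "d2", "d3", "d4"]
citation_counts = {"d1": 3, "d2": 250, "d3": 40, "d4": 0}

fused = bibliometric_fusion(system_ranking, citation_counts)
print(fused)  # ['d2', 'd1', 'd3', 'd4']
```

In this toy case the heavily cited `d2` moves ahead of `d1`, while the uncited `d4` stays last. Because only a per-document count is required, the same pattern would work with altmetrics in place of citations.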