Site recommendation is a representative information retrieval task that aims to predict, in an automatic, data-driven way, the optimal sites for a brand or institution to open new branches, which makes it valuable for brand development in modern business. However, no publicly available dataset exists so far, and most existing approaches cover only an extremely small set of brands, which seriously hinders research on site recommendation. We therefore collect, construct, and release OpenSiteRec, an open and comprehensive dataset, to facilitate and promote research on this task. Specifically, OpenSiteRec uses a heterogeneous graph schema to represent various types of real-world entities and relations in four international metropolises. To evaluate how existing general methods perform on site recommendation, we benchmark several representative recommendation models on OpenSiteRec. We also highlight potential application directions to demonstrate the dataset's wide applicability. We believe OpenSiteRec is a significant resource and anticipate that it will encourage the development of advanced methods for site recommendation. OpenSiteRec is available online at https://OpenSiteRec.github.io/.