Determining a winner among a set of items through active pairwise comparisons under a limited budget is a challenging problem in preference-based learning. The goal of this study is to implement and evaluate the PARWiS algorithm, which combines spectral ranking with disruptive pair selection to identify the best item under shoestring budgets. This work extends PARWiS with a contextual variant (Contextual PARWiS) and a reinforcement learning-based variant (RL PARWiS), comparing them against baselines including Double Thompson Sampling and a random selection strategy. The evaluation spans synthetic and real-world datasets (Jester and MovieLens), using budgets of 40, 60, and 80 comparisons for 20 items. Performance is measured through recovery fraction, true rank of the reported winner, reported rank of the true winner, and cumulative regret, alongside the separation metric \(Δ_{1,2}\). Results show that PARWiS and RL PARWiS outperform the baselines across all datasets, particularly on the Jester dataset with its larger \(Δ_{1,2}\), while performance gaps narrow on the more challenging MovieLens dataset with a smaller \(Δ_{1,2}\). Contextual PARWiS performs comparably to PARWiS, indicating that contextual features may require further tuning to provide significant benefits.
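To make the evaluation metrics concrete, the following is a minimal sketch of how the recovery fraction and the separation metric might be computed. It assumes \(Δ_{1,2}\) is the normalized gap between the two highest underlying item scores (a common convention; the paper's exact definition may differ), and the score values and run outcomes below are purely illustrative, not taken from the study's data.

```python
import numpy as np

# Hypothetical true item scores for 20 items, sorted in descending order
# (illustrative only; not the paper's synthetic or real-world data).
rng = np.random.default_rng(0)
true_scores = np.sort(rng.uniform(0.1, 1.0, size=20))[::-1]

# Separation metric Delta_{1,2}: normalized gap between the top two scores
# (assumed convention; check the paper for the exact definition).
delta_12 = (true_scores[0] - true_scores[1]) / true_scores[0]

# Recovery fraction: fraction of independent runs in which the reported
# winner matches the true winner (item index 0 after sorting).
reported_winners = [0, 0, 1, 0, 0]  # illustrative outcomes of 5 runs
recovery_fraction = np.mean([w == 0 for w in reported_winners])

print(f"Delta_12 = {delta_12:.3f}, recovery fraction = {recovery_fraction:.2f}")
```

A larger \(Δ_{1,2}\) means the best item is more clearly separated from the runner-up, which is why recovery is easier on datasets such as Jester than on MovieLens under the same comparison budget.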