Statistical Models of Top-$k$ Partial Orders

In many contexts involving ranked preferences, agents submit partial orders over the available alternatives. Statistical models often treat these partial orders as marginal events in the space of total orders, but this approach overlooks the information contained in the list length itself. In this work, we introduce and taxonomize approaches for jointly modeling distributions over top-$k$ partial orders and list lengths $k$. We consider two classes of approaches: composite models, which view a partial order as a truncation of a total order, and augmented ranking models, which model the construction of the list as a sequence of choice decisions, including the decision to stop. For composite models, we consider three dependency structures for jointly modeling order and truncation length. For augmented ranking models, we consider different assumptions about how the stop-token choice is modeled. Using partial rankings from San Francisco school choice and San Francisco ranked choice elections, we evaluate how well the models predict observed data and generate realistic synthetic datasets. We find that composite models, which explicitly model length as a categorical variable, produce synthetic datasets with accurate length distributions, and that an augmented model with position-dependent item utilities best models length and preferences jointly in the training data, as measured by negative log loss. Methods from this work have significant implications for the simulation and evaluation of real-world social systems that solicit ranked preferences.
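The two model classes named above can be illustrated with a minimal sketch. The abstract does not give the paper's parameterization, so everything here is an assumption made for illustration: the base ranking model is taken to be Plackett-Luce, the composite model uses the simplest dependency structure (length independent of order), and the augmented model uses a single constant stop utility rather than the position-dependent utilities mentioned above. The function names and parameters are hypothetical.

```python
import math


def pl_topk_logprob(ranking, utilities):
    """Log-probability of a top-k list under Plackett-Luce (assumed base model).

    This is the marginal probability, over total orders, of the first k positions.
    """
    remaining = dict(utilities)
    logp = 0.0
    for item in ranking:
        log_denom = math.log(sum(math.exp(u) for u in remaining.values()))
        logp += remaining[item] - log_denom
        del remaining[item]
    return logp


def composite_logprob(ranking, utilities, length_probs):
    """Composite sketch: P(list) = P(k) * P(top-k order), with k independent of order.

    length_probs is a categorical distribution over list lengths (an assumption).
    """
    return math.log(length_probs[len(ranking)]) + pl_topk_logprob(ranking, utilities)


def augmented_logprob(ranking, utilities, stop_utility):
    """Augmented sketch: a stop token competes with the remaining items at each step."""
    remaining = dict(utilities)
    stop_weight = math.exp(stop_utility)
    logp = 0.0
    for item in ranking:
        log_denom = math.log(stop_weight + sum(math.exp(u) for u in remaining.values()))
        logp += remaining[item] - log_denom
        del remaining[item]
    if remaining:
        # The list ended early, so the agent chose the stop token at this step.
        log_denom = math.log(stop_weight + sum(math.exp(u) for u in remaining.values()))
        logp += stop_utility - log_denom
    # If every item was ranked, stopping is forced and contributes probability 1.
    return logp


# Hypothetical utilities and length distribution for three alternatives.
utilities = {"a": 1.0, "b": 0.5, "c": -0.2}
length_probs = {1: 0.5, 2: 0.3, 3: 0.2}
print(composite_logprob(["a", "b"], utilities, length_probs))
print(augmented_logprob(["a", "b"], utilities, stop_utility=0.3))
```

In the composite sketch the length distribution is fit separately from the order, which is consistent with the abstract's remark that such models reproduce length distributions well. In the augmented sketch the list length emerges from the same sequence of choices that produces the order, and this particular version also assigns probability to the empty list.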

