Coalition formation concerns the strategic collaboration of selfish agents who form coalitions based on their preferences. It is often assumed that coalitions are disjoint and preferences are fully known, which may not hold in practice. In this paper, we therefore present a new model of coalition formation with possibly overlapping coalitions under partial information, in which selfish agents may belong to multiple coalitions simultaneously and their full preferences are initially unknown. Instead, information about past interactions and the associated utility feedback is stored in a fixed offline dataset, and we aim to efficiently infer the agents' preferences from this dataset. We analyze the impact of different dataset information constraints by studying two types of utility feedback that may be stored in the dataset: agent-level and coalition-level utility feedback. For both feedback models, we identify assumptions under which the dataset covers sufficient information for an offline learning algorithm to infer preferences and use them to recover a partition that is (approximately) Nash stable, i.e., one in which no agent can improve her utility by unilaterally deviating. A further goal is to devise algorithms with low sample complexity, requiring only a small dataset to achieve a desired approximation to Nash stability. Under agent-level feedback, we provide a sample-efficient algorithm that provably obtains an approximately Nash stable partition under an assumption on the information covered by the dataset that is both necessary and sufficient. Under coalition-level feedback, in contrast, we show that sample-efficient learning is possible only under a strictly stronger assumption. Still, in several cases our algorithms' sample complexity bounds are optimal up to logarithmic factors. Finally, extensive experiments show that our algorithms converge to a small approximation error to Nash stability across diverse settings.