We study coalition structure generation (CSG) when coalition values are not given but must be learned from episodic observations. We model each episode as a sparse linear regression problem, where the realised payoff \(Y_t\) is a noisy linear combination of a small number of coalition contributions. This yields a probabilistic CSG framework in which the planner first estimates a sparse value function from \(T\) episodes, then runs a CSG solver on the inferred coalition set. We analyse two estimation schemes. The first, Bayesian Greedy Coalition Pursuit (BGCP), is a greedy procedure that mimics orthogonal matching pursuit. Under a coherence condition and a minimum signal assumption, BGCP recovers the true set of profitable coalitions with high probability once \(T \gtrsim K \log m\), and hence yields welfare-optimal structures. The second scheme uses an \(\ell_1\)-penalised estimator; under a restricted eigenvalue condition, we derive \(\ell_1\) and prediction error bounds and translate them into welfare gap guarantees. We compare both methods to probabilistic baselines and identify regimes where sparse probabilistic CSG is superior, as well as dense regimes where classical least-squares approaches are competitive.
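To make the greedy estimation step concrete, the following is a minimal sketch of plain orthogonal matching pursuit, the procedure the abstract says BGCP mimics. It is not the BGCP algorithm itself: the Bayesian component is not specified in the abstract and is omitted here. The function name `greedy_pursuit`, the Gaussian design, the noise level, and the chosen support and coefficients are all illustrative assumptions, with `m` taken to be the number of candidate coalitions and `K` the number of profitable ones.

```python
import numpy as np


def greedy_pursuit(X, y, k):
    """Plain orthogonal matching pursuit (illustrative, not BGCP itself).

    X : (T, m) design matrix, one row per episode, one column per
        candidate coalition (assumed layout).
    y : (T,) realised payoffs.
    k : number of coalitions to select.
    """
    T, m = X.shape
    support = []
    coef = np.zeros(0)
    residual = y.copy()
    for _ in range(k):
        # Score each candidate by its absolute correlation with the residual.
        scores = np.abs(X.T @ residual)
        if support:
            scores[support] = -np.inf  # never reselect a chosen column
        support.append(int(np.argmax(scores)))
        # Refit by least squares on all selected columns, then update residual.
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        residual = y - X[:, support] @ coef
    beta = np.zeros(m)
    beta[support] = coef
    return sorted(support), beta


# Synthetic example under assumed parameters: T = 200 episodes,
# m = 50 candidate coalitions, K = 3 profitable ones.
rng = np.random.default_rng(0)
T, m = 200, 50
true_support = [3, 17, 41]
beta_true = np.zeros(m)
beta_true[true_support] = [2.0, -1.5, 1.0]

X = rng.standard_normal((T, m))
y = X @ beta_true + 0.05 * rng.standard_normal(T)

support, beta_hat = greedy_pursuit(X, y, k=len(true_support))
print("selected coalitions:", support)
```

In the pipeline described above, the recovered support would then be handed to a CSG solver, which is outside the scope of this sketch.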