This paper considers the sample efficiency of preference learning, which models and predicts human choices from comparative judgments. The minimax optimal estimation rate $\Theta(d/n)$ in traditional estimation theory requires the number of samples $n$ to scale linearly with the dimensionality $d$ of the feature space. However, the high dimensionality of the feature space and the high cost of collecting human-annotated data challenge the efficiency of traditional estimation methods. To remedy this, we leverage sparsity in the preference model and establish sharp estimation rates. We show that under the sparse random utility model, where the parameter of the reward function is $k$-sparse, the minimax optimal rate can be reduced to $\Theta\left(\frac{k}{n}\log\frac{d}{k}\right)$. Furthermore, we analyze the $\ell_{1}$-regularized estimator and show that it achieves a near-optimal rate under mild assumptions on the Gram matrix. Experiments on synthetic data and LLM alignment data validate our theoretical findings, showing that sparsity-aware methods significantly reduce sample complexity and improve prediction accuracy.
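As a minimal illustration of the sparsity-aware estimation described above, the sketch below simulates pairwise comparisons under a Bradley–Terry-style random utility model with a $k$-sparse reward parameter and fits an $\ell_1$-regularized logistic maximum-likelihood estimator by proximal gradient descent (ISTA). The problem sizes, the regularization level $\lambda \asymp \sqrt{\log d / n}$, and the solver are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: ambient dimension d, sparsity k, number of comparisons n.
d, k, n = 200, 5, 500

# k-sparse ground-truth reward parameter theta*.
theta_star = np.zeros(d)
theta_star[rng.choice(d, size=k, replace=False)] = rng.normal(0.0, 1.0, k)

# Each comparison contributes the feature difference z_i = x_i - x_i';
# the first item is preferred with probability sigmoid(z_i . theta*).
Z = rng.normal(0.0, 1.0, (n, d))
p = 1.0 / (1.0 + np.exp(-Z @ theta_star))
y = (rng.random(n) < p).astype(float)

def soft_threshold(v, t):
    """Proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# l1-regularized logistic MLE via ISTA.
lam = np.sqrt(np.log(d) / n)                  # regularization ~ sqrt(log d / n)
step = 4.0 * n / np.linalg.norm(Z, 2) ** 2    # 1/L for the averaged logistic loss
theta = np.zeros(d)
for _ in range(2000):
    grad = Z.T @ (1.0 / (1.0 + np.exp(-Z @ theta)) - y) / n
    theta = soft_threshold(theta - step * grad, step * lam)

print("l2 error:", np.linalg.norm(theta - theta_star))
print("nonzeros:", np.count_nonzero(theta), "of", d)
```

With $n \ll d \log d / k$-style budgets, the $\ell_1$ penalty drives most coordinates exactly to zero, which is the mechanism behind the improved $\frac{k}{n}\log\frac{d}{k}$ rate relative to the dense $d/n$ rate.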