We study revenue-optimal pricing in data markets with rational, budget-constrained buyers. Such a market offers multiple datasets for sale, and buyers aim to improve the accuracy of their prediction tasks by acquiring data bundles. The market's objective is to price datasets to maximize total revenue, considering that buyers with quasi-linear utilities choose their bundles optimally under budget constraints. We allow the buyers to purchase fractions of datasets, and the amount they pay is proportional to the fraction they receive. Although competitive equilibrium gives revenue-optimal pricing in rivalrous markets with quasi-linear buyers, we show that revenue maximization in data markets is APX-hard. Despite the hardness, we design a 2-approximation algorithm when datasets arrive online, and a $(1-1/e)^{-1}$-approximation algorithm for the offline setting.
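The buyer model described above (quasi-linear utility, a budget cap, fractional purchases paid for pro rata) can be illustrated with a small sketch. It rests on a simplifying assumption that the abstract does not state: each buyer's valuation is additive and linear in the fraction received, so the best response reduces to a fractional knapsack. The names `best_response` and `revenue` are hypothetical, and this is not the paper's approximation algorithm; it only shows how revenue follows from posted prices when data is non-rivalrous, so every buyer can buy the same dataset.

```python
from typing import List, Tuple


def best_response(values: List[float], prices: List[float], budget: float) -> List[float]:
    """Fractions x_j in [0, 1] maximizing sum((v_j - p_j) * x_j)
    subject to sum(p_j * x_j) <= budget.

    Assumption (not from the source): valuations are linear in x_j.
    """
    x = [0.0] * len(values)
    # Only datasets with strictly positive surplus are worth buying.
    candidates = [j for j in range(len(values)) if values[j] > prices[j]]

    def surplus_per_unit_spend(j: int) -> float:
        # Free datasets cost nothing, so they rank first.
        if prices[j] == 0:
            return float("inf")
        return (values[j] - prices[j]) / prices[j]

    # Greedy order is optimal for a fractional knapsack.
    candidates.sort(key=surplus_per_unit_spend, reverse=True)
    remaining = budget
    for j in candidates:
        if prices[j] == 0:
            x[j] = 1.0
            continue
        if remaining <= 0:
            break
        # Buy as much as the budget allows; payment is proportional to the fraction.
        x[j] = min(1.0, remaining / prices[j])
        remaining -= x[j] * prices[j]
    return x


def revenue(buyers: List[Tuple[List[float], float]], prices: List[float]) -> float:
    """Total payment collected when every buyer best-responds to `prices`.

    Data is non-rivalrous, so buyers do not compete for supply.
    """
    total = 0.0
    for values, budget in buyers:
        fractions = best_response(values, prices, budget)
        total += sum(p * f for p, f in zip(prices, fractions))
    return total


if __name__ == "__main__":
    # Two hypothetical buyers: (per-dataset values, budget).
    buyers = [([10.0, 4.0], 6.0), ([3.0, 8.0], 5.0)]
    print(best_response([10.0, 6.0], [4.0, 4.0], 6.0))
    print(revenue(buyers, [4.0, 4.0]))
```

Under this linear assumption, raising a price raises the payment per unit but can push a dataset out of a buyer's budget or below their value, which is the tension that revenue-optimal pricing has to balance.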