Online Learning for Equilibrium Pricing in Markets under Incomplete Information

The study of market equilibria is central to economic theory, particularly for the efficient allocation of scarce resources. However, computing equilibrium prices, at which the supply of goods matches their demand, typically requires complete information on agents' private attributes, e.g., suppliers' cost functions, which is often unavailable in practice. Motivated by this practical consideration, we consider the problem of setting equilibrium prices in an incomplete information setting in which a market operator seeks to satisfy the customer demand for a commodity by purchasing the required amount from competing suppliers whose cost functions are private and unknown to the operator. In this setting, we study the online learning problem of learning equilibrium prices over a horizon of $T$ periods while jointly optimizing three performance metrics pertinent to equilibrium pricing: unmet demand, cost regret, and payment regret. We first consider the setting in which suppliers' cost functions are fixed and develop algorithms that achieve a regret of $O(\log \log T)$ when the customer demand is constant over time, or $O(\sqrt{T} \log \log T)$ when the demand varies over time. Next, we consider the setting in which suppliers' cost functions can vary over time and show that no online algorithm can achieve sublinear regret on all three metrics when the market operator has no information about how the cost functions change. We therefore consider an augmented setting in which the operator has access to hints (contexts) that reflect the variation in the cost functions over time without revealing their complete specification, and we propose an algorithm with sublinear regret in this augmented setting.
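To make the fixed-cost, constant-demand setting concrete, the following is a minimal sketch of how an operator might learn a market-clearing price from observed aggregate supply alone. It is not the algorithm proposed in this work and does not attempt to reproduce the $O(\log \log T)$ guarantee; it uses plain bisection on the price. The quadratic cost form, the function names `total_supply` and `learn_price`, and the price bounds are all illustrative assumptions.

```python
def total_supply(price, cost_coeffs):
    """Aggregate supply at a posted price (hypothetical supplier model).

    Assumption: supplier i has private cost c_i(x) = 0.5 * a_i * x**2,
    so its profit-maximizing supply at price p is p / a_i.
    """
    return sum(price / a for a in cost_coeffs)


def learn_price(demand, cost_coeffs, horizon, p_lo=0.0, p_hi=100.0):
    """Bisection on the posted price over `horizon` periods.

    The operator never reads `cost_coeffs` directly; it only observes
    the aggregate supply that results from each posted price.
    Returns the final price estimate and the cumulative unmet demand.
    """
    unmet = 0.0
    for _ in range(horizon):
        price = 0.5 * (p_lo + p_hi)
        supply = total_supply(price, cost_coeffs)  # observed market response
        unmet += max(demand - supply, 0.0)
        if supply < demand:
            p_lo = price  # price too low: suppliers under-produce
        else:
            p_hi = price  # price high enough: supply covers demand
    return 0.5 * (p_lo + p_hi), unmet


if __name__ == "__main__":
    coeffs = [1.0, 2.0, 4.0]  # hidden from the operator in this sketch
    price, unmet = learn_price(demand=7.0, cost_coeffs=coeffs, horizon=40)
    print(price, unmet)
```

With these assumed coefficients, aggregate supply is $1.75p$, so a demand of 7 clears at $p = 4$, and the bisection estimate converges to that value as the horizon grows.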
