Online bipartite matching is a fundamental problem in online algorithms. The goal is to match two sets of vertices so as to maximize the sum of the matched edge weights, where the vertices of one set arrive in a sequence, each together with its edge weights. Currently, in practical recommendation systems and search engines, the weights are given by the inner product between the deep representation of a user and the deep representation of an item. Standard online matching spends $nd$ time on a linear scan over all $n$ items to compute the weights (assuming each representation vector has length $d$), and then decides the matching based on those weights. In practice, however, $n$ can be very large, e.g., on online e-commerce platforms. Reducing the time needed to compute the weights is therefore a problem of practical significance. In this work, we provide a theoretical foundation for computing the weights approximately. We show that, with our proposed randomized data structures, the weights can be computed in sublinear time while still preserving the competitive ratio of the matching algorithm.
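To make the $nd$ cost concrete, the following is a minimal sketch of the linear-scan baseline only, not of the proposed randomized data structures. It assumes a simple greedy rule (each arriving user is matched to the unmatched item of largest weight); the greedy rule, the function names `inner_product` and `greedy_online_matching`, and the toy vectors are illustrative assumptions and do not come from this work.

```python
from typing import List, Optional, Sequence


def inner_product(u: Sequence[float], v: Sequence[float]) -> float:
    """Edge weight: inner product of two length-d representation vectors."""
    return sum(a * b for a, b in zip(u, v))


def greedy_online_matching(
    users: Sequence[Sequence[float]],
    items: Sequence[Sequence[float]],
) -> List[Optional[int]]:
    """Match each arriving user to the unmatched item of largest weight.

    Hypothetical baseline: every arrival scans all n items and computes
    up to n inner products of length d, i.e. on the order of n * d work.
    Returns, for each user, the index of its matched item or None.
    """
    matched = [False] * len(items)
    assignment: List[Optional[int]] = []
    for u in users:  # users arrive one at a time
        best_item: Optional[int] = None
        best_weight = float("-inf")
        for j, v in enumerate(items):  # linear scan over all n items
            if matched[j]:
                continue
            w = inner_product(u, v)
            if w > best_weight:
                best_item, best_weight = j, w
        if best_item is not None:
            matched[best_item] = True  # each item is matched at most once
        assignment.append(best_item)
    return assignment


# Toy data (assumed for illustration): n = 3 items, d = 2, four arrivals.
items = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
users = [[2.0, 1.0], [1.0, 3.0], [1.0, 0.0], [0.5, 0.5]]
assignment = greedy_online_matching(users, items)
```

The inner loop is the step that a sublinear-time approximate weight computation would aim to replace; the matching decision itself is left unchanged in this sketch.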