Optimal Communication Complexity of Chained Index

We study the CHAIN communication problem introduced by Cormode et al. [ICALP 2019], a generalization of the well-studied INDEX problem. For $k\geq 1$, an instance of CHAIN$_{n,k}$ consists of $k$ instances of INDEX, all with the same answer, shared among $k+1$ players as follows. Player 1 has the first string $X^1 \in \{0,1\}^n$, player 2 has the first index $\sigma^1 \in [n]$ and the second string $X^2 \in \{0,1\}^n$, player 3 has the second index $\sigma^2 \in [n]$ along with the third string $X^3 \in \{0,1\}^n$, and so on; player $k+1$ has only the last index $\sigma^k \in [n]$. Communication is one-way from each player to the next: player 1 sends a message to player 2, then player 2 to player 3, and so on. After receiving the message from player $k$, player $k+1$ must output a single bit, the common answer to all $k$ instances of INDEX. Cormode et al. proved that CHAIN$_{n,k}$ requires $\Omega(n/k^2)$ communication and used this to prove streaming lower bounds for approximating maximum independent sets. Feldman et al. [STOC 2020] subsequently used it to prove lower bounds for streaming submodular maximization. However, these works do not obtain optimal bounds on the communication complexity of CHAIN$_{n,k}$; in fact, Cormode et al. conjectured that $\Omega(n)$ bits are necessary for any $k$. As our main result, we prove the optimal lower bound of $\Omega(n)$ for CHAIN$_{n,k}$, settling the conjecture of Cormode et al. in the affirmative. The key technique is to use information-theoretic tools to analyze protocols with respect to the Jensen-Shannon divergence, rather than total variation distance. As a corollary, we get an improved lower bound for approximating maximum independent set in vertex-arrival streams via a direct reduction from CHAIN.
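To make the problem definition concrete, the following is a minimal Python sketch of a CHAIN$_{n,k}$ instance together with the naive protocol in which player 1 sends its entire string. It is an illustration only and is not taken from the paper: the function names `make_chain_instance` and `naive_protocol` are hypothetical, indices are 0-based rather than drawn from $[n]$, and the uniform sampling is just one convenient way to generate an input satisfying the promise that all $k$ INDEX instances share one answer.

```python
import random


def make_chain_instance(n, k, rng):
    """Sample k INDEX instances (X^i, sigma^i) that all share one answer bit.

    Hypothetical helper: uniform sampling is an assumption made for
    illustration, not a distribution taken from the paper.
    """
    answer = rng.randrange(2)
    strings, indices = [], []
    for _ in range(k):
        x = [rng.randrange(2) for _ in range(n)]
        sigma = rng.randrange(n)  # 0-based position in X^i
        x[sigma] = answer         # enforce the promise X^i[sigma^i] = answer
        strings.append(x)
        indices.append(sigma)
    return strings, indices, answer


def naive_protocol(strings, indices):
    """Naive one-way protocol: player 1 sends X^1 in full.

    Player 2 holds sigma^1, so it can read off X^1[sigma^1], which by the
    promise is the answer to every instance. Each later message just
    forwards that single bit. Returns the output bit and the length in
    bits of each of the k messages.
    """
    message = list(strings[0])        # player 1 -> player 2: n bits
    bits_sent = [len(message)]
    bit = message[indices[0]]         # computed by player 2
    for _ in range(1, len(indices)):  # players 2..k each forward 1 bit
        bits_sent.append(1)
    return bit, bits_sent


if __name__ == "__main__":
    rng = random.Random(0)
    strings, indices, answer = make_chain_instance(n=8, k=3, rng=rng)
    output, bits_sent = naive_protocol(strings, indices)
    print(output == answer, bits_sent)
```

In this sketch the first message has $n$ bits and every later message has one bit, so the $\Omega(n)$ lower bound says that, up to constant factors, no protocol can do substantially better than sending the first string outright.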