A Wasserstein index of dependence for random measures

Optimal transport and Wasserstein distances are flourishing in many scientific fields as a means for comparing and connecting random structures. Here we pioneer the use of an optimal transport distance between Lévy measures to solve a statistical problem. Dependent Bayesian nonparametric models provide flexible inference on distinct, yet related, groups of observations. Each component of a vector of random measures models a group of exchangeable observations, while their dependence regulates the borrowing of information across groups. We derive the first statistical index of dependence in $[0,1]$ for (completely) random measures that accounts for their whole infinite-dimensional distribution, which is assumed to be equal across different groups. This is accomplished by using the geometric properties of the Wasserstein distance to solve a max-min problem at the level of the underlying Lévy measures. The Wasserstein index of dependence sheds light on the models' deep structure and has desirable properties: (i) it is $0$ if and only if the random measures are independent; (ii) it is $1$ if and only if the random measures are completely dependent; (iii) it simultaneously quantifies the dependence of $d \ge 2$ random measures, avoiding the need for pairwise comparisons; (iv) it can be evaluated numerically. Moreover, the index allows for informed prior specifications and fair model comparisons for Bayesian nonparametric models.
