In this work, we analyze alternative effective sample size (ESS) metrics for importance sampling algorithms and discuss a possible extended range of applications. We show the relationship between the ESS expressions used in the literature and two entropy families, the Rényi and Tsallis entropies. The Rényi entropy is connected to the Huggins-Roy ESS family introduced in \cite{Huggins15}. We prove that all the ESS functions in the Huggins-Roy family fulfill all the desirable theoretical conditions. We also analyze and remark on the connections with several other fields, such as the Hill numbers introduced in ecology, the Gini inequality coefficient employed in economics, and the Gini impurity index used mainly in machine learning, to name a few. Finally, by numerical simulations, we study how well different ESS expressions from these families approximate the theoretical ESS definition, and show the application of ESS formulas in a variable selection problem.
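The connection between the Rényi entropy and the Huggins-Roy family can be illustrated with a minimal sketch. The abstract does not state the formulas, so the code below assumes the commonly used forms: for normalized weights $\bar{w}_n$, the Rényi entropy of order $\beta$ is $H_\beta = \frac{1}{1-\beta}\log\sum_n \bar{w}_n^\beta$, and the corresponding ESS is $\exp(H_\beta) = \left(\sum_n \bar{w}_n^\beta\right)^{1/(1-\beta)}$, which is also the form of a Hill number. The function names `renyi_entropy` and `huggins_roy_ess` are illustrative choices and are not taken from the paper.

```python
import math


def normalize(weights):
    """Return the weights rescaled to sum to one."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def renyi_entropy(weights, beta):
    """Renyi entropy of order beta of the normalized weights (assumed form)."""
    # Zero weights are dropped so that beta = 0 counts the nonzero weights
    # and the Shannon limit avoids log(0).
    w = [x for x in normalize(weights) if x > 0]
    if beta == math.inf:
        # Limit beta -> infinity: min-entropy, -log(max weight).
        return -math.log(max(w))
    if beta == 1:
        # Limit beta -> 1: Shannon entropy.
        return -sum(x * math.log(x) for x in w)
    return math.log(sum(x ** beta for x in w)) / (1.0 - beta)


def huggins_roy_ess(weights, beta):
    """ESS of order beta, taken here as exp of the Renyi entropy (assumption)."""
    return math.exp(renyi_entropy(weights, beta))


if __name__ == "__main__":
    example = [0.5, 0.3, 0.2]
    for beta in (0, 0.5, 1, 2, math.inf):
        print(beta, huggins_roy_ess(example, beta))
```

Under these assumptions, $\beta = 2$ recovers the familiar $1/\sum_n \bar{w}_n^2$, $\beta \to \infty$ gives $1/\max_n \bar{w}_n$, and $\beta \to 1$ gives the perplexity. For $N$ equal weights every order returns $N$, and for a single nonzero weight every order returns 1.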