Dimension reduction transforms data from a high-dimensional space into a lower-dimensional one while retaining as much of the original information as possible. It is crucial in many disciplines, such as engineering, biology, astronomy, and economics. In this paper, we consider the following instance of dimension reduction: given an n-dimensional probability distribution p and an integer m < n, we aim to find the m-dimensional probability distribution q that is closest to p, using the Kullback-Leibler divergence as the measure of closeness. We prove that the problem is strongly NP-hard, and we present an approximation algorithm for it.
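To make the closeness measure concrete, the following minimal sketch computes the standard Kullback-Leibler divergence D(p || q) = sum_i p_i log(p_i / q_i) in nats. It assumes two distributions given as lists of the same length; the abstract does not specify how an m-dimensional q is compared with an n-dimensional p, so that step is deliberately left out. The function name `kl_divergence` and the example values are illustrative, not taken from the paper.

```python
import math


def kl_divergence(p, q):
    """Return D(p || q) = sum_i p_i * log(p_i / q_i), measured in nats.

    Assumption: p and q are probability vectors of equal length.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # usual convention: 0 * log(0 / q_i) = 0
        if qi == 0.0:
            return math.inf  # p places mass where q has none
        total += pi * math.log(pi / qi)
    return total


# Illustrative values only (not from the paper).
p = [0.5, 0.5]
q = [0.25, 0.75]
print(kl_divergence(p, q))  # strictly positive, since p differs from q
print(kl_divergence(p, p))  # zero, since the distributions coincide
```

Note that the divergence is not symmetric: `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ, so the order of the arguments matters when defining "closest".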