Interest in summarizing complex, multidimensional phenomena, often tied to one or more specific sectors (social, economic, environmental, political, etc.), so that they are easily understandable even to non-experts is far from waning. A widely adopted approach is the composite index, a statistical measure that aggregates multiple indicators into a single comprehensive value. In this paper, we present AutoSynth, a novel methodology designed to condense potentially extensive datasets into a single synthetic index or a hierarchy of such indices. AutoSynth uses an autoencoder, a type of neural network, to represent a matrix of features in a lower-dimensional space. Although the approach is not tied to any particular composite index and can be applied across sectors, this work is motivated by a real-world need: assessing the vulnerability of the Italian city of Florence at the suburban level across three dimensions (economic, demographic, and social). To demonstrate its effectiveness, the methodology is also applied to estimate a vulnerability index from a rich, publicly available dataset on U.S. counties, and it is validated through a simulation study.
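To make the core idea concrete, the following is a minimal sketch of how an autoencoder with a one-unit bottleneck can turn a matrix of indicators into a single score per unit. It is not the AutoSynth architecture, which the abstract does not specify. The function name `fit_autoencoder_index`, the linear layers, the learning rate, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def fit_autoencoder_index(X, n_epochs=500, lr=0.05, seed=0):
    """Train a tiny linear autoencoder with a one-unit bottleneck.

    Hypothetical helper, not the paper's implementation. Returns the
    bottleneck activations (one score per row of X) and the per-epoch
    reconstruction loss. Assumes no column of X is constant.
    """
    rng = np.random.default_rng(seed)
    # Standardize the indicators so that no single one dominates the loss.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    n, p = Z.shape
    W_enc = rng.normal(scale=0.1, size=(p, 1))  # encoder: p indicators -> 1 code
    W_dec = rng.normal(scale=0.1, size=(1, p))  # decoder: 1 code -> p indicators
    losses = []
    for _ in range(n_epochs):
        code = Z @ W_enc        # shape (n, 1): the candidate index
        recon = code @ W_dec    # shape (n, p): reconstructed indicators
        err = recon - Z
        losses.append(float(np.mean(err ** 2)))
        # Gradients of the mean squared reconstruction error.
        grad_dec = code.T @ err * (2.0 / (n * p))
        grad_enc = Z.T @ (err @ W_dec.T) * (2.0 / (n * p))
        W_dec -= lr * grad_dec
        W_enc -= lr * grad_enc
    return (Z @ W_enc).ravel(), losses


# Toy data (an assumption): four indicators driven by one hidden factor.
rng = np.random.default_rng(1)
latent = rng.normal(size=200)
X = np.outer(latent, [1.0, 0.8, -0.6, 1.2]) + 0.5 * rng.normal(size=(200, 4))
index, losses = fit_autoencoder_index(X)
```

The sign and scale of such an index are arbitrary, so in practice it would need to be oriented and rescaled before interpretation. A purely linear autoencoder like this one recovers essentially the first principal component direction; nonlinear activations and deeper layers would be needed to capture anything beyond that.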