While classical scaling, just like principal component analysis, is parameter-free, other methods for embedding multivariate data require the selection of one or several tuning parameters. This tuning can be difficult due to the unsupervised nature of the situation. We propose a simple, almost obvious, approach to supervise the choice of tuning parameter(s): minimize a notion of stress. We apply this approach to the selection of the patch size in a prototypical patch-stitching embedding method, both in the multidimensional scaling (aka network localization) setting and in the dimensionality reduction (aka manifold learning) setting. In our study, we uncover a new bias-variance tradeoff phenomenon.
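The selection rule described in the abstract (run the embedding method for each candidate value of the tuning parameter and keep the value whose output has the smallest stress) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes raw stress (the sum of squared gaps between the given dissimilarities and the embedded distances) as the "notion of stress", and the names `raw_stress`, `select_by_stress`, and `toy_embed` are hypothetical. In particular, `toy_embed` is a placeholder with a single scale parameter, standing in for a real method such as patch stitching with its patch size.

```python
import math


def raw_stress(dissim, points):
    """Sum of squared gaps between dissimilarities and embedded distances.

    dissim maps index pairs (i, j) to a dissimilarity; only the listed
    pairs contribute, so partially observed data is allowed.
    Assumption: raw stress is used as the notion of stress.
    """
    return sum(
        (d - math.dist(points[i], points[j])) ** 2
        for (i, j), d in dissim.items()
    )


def select_by_stress(dissim, embed, candidates):
    """Embed once per candidate parameter and keep the lowest-stress one.

    embed(dissim, param) must return a list of coordinate tuples.
    Returns (best_param, best_stress, stress_by_param).
    """
    stress_by_param = {
        param: raw_stress(dissim, embed(dissim, param))
        for param in candidates
    }
    best_param = min(stress_by_param, key=stress_by_param.get)
    return best_param, stress_by_param[best_param], stress_by_param


# Toy data: four points on a line and all their pairwise distances.
true_positions = [0.0, 1.0, 3.0, 6.0]
dissim = {
    (i, j): abs(true_positions[i] - true_positions[j])
    for i in range(len(true_positions))
    for j in range(i + 1, len(true_positions))
}


def toy_embed(dissim, scale):
    """Hypothetical stand-in for an embedding method with one parameter.

    It ignores dissim and rescales the true positions, so the stress
    is zero exactly when scale == 1.0.
    """
    return [(scale * x,) for x in true_positions]


best, best_stress, table = select_by_stress(dissim, toy_embed, [0.5, 1.0, 1.5])
```

Passing the embedding method as a callable keeps the rule independent of the method being tuned: any procedure that maps dissimilarities and a parameter value to coordinates can be plugged in, and `table` keeps the full stress profile across candidates for inspection.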