Many machine learning techniques incorporate identity-preserving transformations into their models to generalize their performance to previously unseen data. These transformations are typically selected from a set of functions that are known to maintain the identity of an input when applied (e.g., rotation, translation, flipping, and scaling). However, there are many natural variations that cannot be labeled for supervision or defined through examination of the data. As suggested by the manifold hypothesis, many of these natural variations live on or near a low-dimensional, nonlinear manifold. Several techniques represent manifold variations through a set of learned Lie group operators that define directions of motion on the manifold. However, these approaches are limited because they require transformation labels when training their models and they lack a method for determining which regions of the manifold are appropriate for applying each specific operator. We address these limitations by introducing a learning strategy that does not require transformation labels and developing a method that learns the local regions where each operator is likely to be used while preserving the identity of inputs. Experiments on MNIST and Fashion MNIST highlight our model's ability to learn identity-preserving transformations on multi-class datasets. Additionally, we train on CelebA to showcase our model's ability to learn semantically meaningful transformations on complex datasets in an unsupervised manner.
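To make the idea of a Lie group operator defining a direction of motion on a manifold concrete, the following is a minimal sketch, not the model described above. It assumes one common parameterization in which a point `z` is moved by the matrix exponential of a weighted sum of operators, `z' = exp(sum_m c_m * Psi_m) z`. The names `matrix_exp`, `transport`, and `PSI_ROTATION` are hypothetical, and the single operator is a hand-picked planar rotation generator rather than a learned one.

```python
import numpy as np


def matrix_exp(A, terms=30):
    """Truncated Taylor series for exp(A); adequate for small, well-scaled matrices."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k  # term now holds A^k / k!
        result = result + term
    return result


# Hypothetical operator: the generator of rotations in the plane.
# In a learned model this matrix would be estimated from data.
PSI_ROTATION = np.array([[0.0, -1.0],
                         [1.0, 0.0]])


def transport(z, operators, coefficients):
    """Assumed form: z' = exp(sum_m c_m * Psi_m) @ z."""
    generator = sum(c * psi for c, psi in zip(coefficients, operators))
    return matrix_exp(generator) @ z


z0 = np.array([1.0, 0.0])
# A coefficient of pi/2 moves z0 a quarter turn along the unit circle.
z1 = transport(z0, [PSI_ROTATION], [np.pi / 2])
```

In this toy case the manifold is the unit circle: varying the coefficient slides the point along the circle without leaving it, which is the sense in which an operator encodes a direction of motion. Which operators are appropriate in which region of the manifold is the question the abstract raises, and this sketch does not address it.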