We investigate methods that use linear projections to remove information about a concept from a language representation, and we ask what happens to a dataset transformed by such a method. A theoretical analysis and experiments on real-world and synthetic data show that these methods inject strong statistical dependencies into the transformed datasets. After such a method is applied, the representation space is highly structured: an instance tends to be located near instances of the opposite label. As a consequence, the original labeling can in some cases be reconstructed by applying an anti-clustering method.
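The neighborhood effect described in the abstract can be illustrated with a small synthetic experiment. The sketch below is not the paper's experimental setup. It assumes one simple projection-based removal method (projecting out the direction between the two class means), a high-dimensional Gaussian dataset whose labels are unrelated to the features, and hypothetical helper names `mean_projection` and `opposite_label_nn_rate`. It compares how often an instance's nearest neighbor carries the opposite label before and after the projection.

```python
import numpy as np


def mean_projection(X, y):
    """Remove the direction between the two class means.

    Assumption: this stands in for a projection-based concept removal
    method; it is one simple choice, not necessarily the one studied.
    """
    w = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    w = w / np.linalg.norm(w)
    # Subtract each row's component along the unit vector w.
    return X - np.outer(X @ w, w)


def opposite_label_nn_rate(X, y):
    """Fraction of instances whose nearest neighbor has the other label."""
    sq = (X ** 2).sum(axis=1)
    # Pairwise squared Euclidean distances.
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # an instance is not its own neighbor
    nn = d2.argmin(axis=1)
    return float((y[nn] != y).mean())


# Synthetic data (assumption): few instances, many dimensions,
# balanced labels that carry no real signal.
rng = np.random.default_rng(0)
n, d = 20, 10_000
X = rng.standard_normal((n, d))
y = np.array([0, 1] * (n // 2))

X_proj = mean_projection(X, y)

rate_before = opposite_label_nn_rate(X, y)
rate_after = opposite_label_nn_rate(X_proj, y)
print(f"opposite-label nearest neighbors before: {rate_before:.2f}")
print(f"opposite-label nearest neighbors after:  {rate_after:.2f}")
```

In this setting the projection direction is estimated from the same instances it is applied to, so each instance is shifted away from the instances that share its label. That is one way to see how a dependency between labels and geometry can be injected. The sketch stops at measuring the neighborhood structure and does not implement an anti-clustering method.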