On Manifold Learning in Plato's Cave: Remarks on Manifold Learning and Physical Phenomena

Many techniques in machine learning attempt explicitly or implicitly to infer a low-dimensional manifold structure of an underlying physical phenomenon from measurements without an explicit model of the phenomenon or the measurement apparatus. This paper presents a cautionary tale regarding the discrepancy between the geometry of measurements and the geometry of the underlying phenomenon in a benign setting. The deformation in the metric illustrated in this paper is mathematically straightforward and unavoidable in the general case, and it is only one of several similar effects. While this is not always problematic, we provide an example of an arguably standard and harmless data processing procedure where this effect leads to an incorrect answer to a seemingly simple question. Although we focus on manifold learning, these issues apply broadly to dimensionality reduction and unsupervised learning.

翻译：许多机器学习技术试图在缺乏现象或测量设备的显式模型情况下，从测量数据中显式或隐式地推断潜在物理现象的低维流形结构。本文提供了一个警示性案例，说明在良性设定下测量几何与潜在现象几何之间的差异。本文阐述的度量变形在数学上具有直接性，且在一般情况下不可避免，而这仅是若干类似效应之一。尽管这并非总会引发问题，但我们提供了一个看似标准且无害的数据处理流程实例，其中该效应导致对一个看似简单的问题给出了错误答案。虽然我们聚焦于流形学习，但这些问题广泛适用于降维和无监督学习。

相关内容

流形学习

关注 345

流形学习，全称流形学习方法(Manifold Learning)，自2000年在著名的科学杂志《Science》被首次提出以来，已成为信息科学领域的研究热点。在理论和应用上，流形学习方法都具有重要的研究意义。假设数据是均匀采样于一个高维欧氏空间中的低维流形，流形学习就是从高维采样数据中恢复低维流形结构，即找到高维空间中的低维流形，并求出相应的嵌入映射，以实现维数约简或者数据可视化。它是从观测到的现象中去寻找事物的本质，找到产生数据的内在规律。

UCM《机器学习导论笔记》，80页pdf CSE176 Introduction to Machine Learning

专知会员服务

32+阅读 · 2021年9月29日

【亚马逊-WWW2020】不解析,生成!用于面向任务的语义分析的序列到序列体系结构，Don't Parse, Generate! A Sequence to Sequence Architecture for Task-Oriented Semantic Parsing

专知会员服务

15+阅读 · 2020年2月1日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日