Language models and specialized table embedding models have recently demonstrated strong performance on many tasks over tabular data. Researchers and practitioners are keen to apply these models in many new contexts. However, the strengths and weaknesses of these models, and of the table representations they generate, are poorly understood, so finding a suitable model for a given task still relies on trial and error. A comprehensive understanding of these models is urgently needed to minimize inefficiency and failures in downstream usage. To address this need, we propose Observatory, a formal framework to systematically analyze embedding representations of relational tables. Motivated both by invariants of the relational data model and by statistical considerations regarding data distributions, we define eight primitive properties and corresponding measures that quantitatively characterize table embeddings with respect to these properties. Based on these properties, we define an extensible framework to evaluate language and table embedding models. We collect and synthesize a suite of datasets and use Observatory to analyze nine such models. Our analysis provides insights into the strengths and weaknesses of learned representations over tables. We find, for example, that some models are sensitive to table structure such as column order, that functional dependencies are rarely reflected in embeddings, and that specialized table embedding models have relatively lower sample fidelity. Such insights help researchers and practitioners better anticipate model behaviors and select appropriate models for their downstream tasks, and they guide the development of new models.
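To make the column-order finding concrete, the following is a minimal sketch of how sensitivity to column order could be probed: embed each column of a table, re-embed it under every non-identity column permutation, and average the cosine similarity to the original embedding. This is not Observatory's implementation. The function names, the two toy embedders, and the choice of mean cosine similarity as the measure are all assumptions made for illustration.

```python
import itertools
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def column_order_scores(embed_columns, table, max_perms=20):
    """Mean cosine similarity of each column's embedding to its baseline
    embedding across non-identity column permutations (1.0 = unaffected).

    embed_columns: hypothetical callable taking [(name, values), ...] and
    returning one vector per column, in the same order.
    """
    names = list(table)
    identity = tuple(range(len(names)))
    baseline = embed_columns([(n, table[n]) for n in names])
    totals = {n: 0.0 for n in names}
    count = 0
    for perm in itertools.permutations(identity):
        if perm == identity:
            continue  # the original order is the baseline itself
        if count >= max_perms:
            break
        shuffled = [(names[i], table[names[i]]) for i in perm]
        vectors = embed_columns(shuffled)
        # perm[pos] is the original index of the column now at position pos.
        for pos, i in enumerate(perm):
            totals[names[i]] += cosine(baseline[i], vectors[pos])
        count += 1
    if count == 0:  # single-column table: nothing to permute
        return {n: 1.0 for n in names}
    return {n: totals[n] / count for n in names}


def content_features(values):
    """Toy stand-in for a learned embedding: simple content statistics."""
    texts = [str(v) for v in values]
    return [float(len(texts)),
            float(sum(len(t) for t in texts)),
            float(len(set(texts)))]


def order_invariant_embedder(columns):
    # Depends only on each column's content.
    return [content_features(values) for _, values in columns]


def order_sensitive_embedder(columns):
    # Leaks the column position into the vector, so reordering changes it.
    return [content_features(values) + [float(pos)]
            for pos, (_, values) in enumerate(columns)]


table = {
    "city": ["Oslo", "Lima", "Rome"],
    "country": ["Norway", "Peru", "Italy"],
    "population_m": [0.7, 10.1, 2.8],
}
invariant_scores = column_order_scores(order_invariant_embedder, table)
sensitive_scores = column_order_scores(order_sensitive_embedder, table)
```

In this sketch, the content-only embedder scores 1.0 for every column, while the position-leaking embedder scores below 1.0. A real evaluation would replace the toy embedders with a model's column embeddings and would likely sample permutations rather than enumerate them.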