Denoising diffusion and flow matching have driven major advances in generative modeling, yet their application to tabular data remains limited despite the ubiquity of such data in real-world applications. To address this gap, we develop TabbyFlow, a Variational Flow Matching (VFM) method for tabular data generation. To apply VFM to data with mixed continuous and discrete features, we introduce Exponential Family Variational Flow Matching (EF-VFM), which represents heterogeneous data types using a general exponential family distribution. This yields an efficient, data-driven objective based on moment matching, enabling principled learning of probability paths over mixed continuous and discrete variables. We also establish a connection between variational flow matching and generalized flow matching objectives based on Bregman divergences. Evaluation on tabular data benchmarks demonstrates state-of-the-art performance compared to baselines.
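To make the moment-matching idea concrete, the following is a minimal sketch, not the TabbyFlow implementation. It assumes a factorized variational distribution over a data row: a unit-variance Gaussian for continuous features and a categorical for one discrete feature. The function names (`mixed_nll`, `moment_residuals`) and the toy values are hypothetical. The sketch relies only on a standard exponential-family fact: the gradient of the negative log-likelihood with respect to the natural parameters equals the predicted expected sufficient statistics minus the observed ones.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def mixed_nll(mu, logits, x_cont, x_cat):
    """Negative log-likelihood (up to a constant) of a data row under an
    assumed factorized distribution: unit-variance Gaussian x categorical."""
    gauss = 0.5 * np.sum((x_cont - mu) ** 2, axis=-1)
    cat = -np.take_along_axis(log_softmax(logits), x_cat[:, None], axis=-1)[:, 0]
    return gauss + cat

def moment_residuals(mu, logits, x_cont, x_cat):
    """Predicted moments minus observed sufficient statistics.
    For these two families this is the gradient of mixed_nll with respect
    to the natural parameters (mu for the Gaussian, logits for the categorical)."""
    onehot = np.eye(logits.shape[-1])[x_cat]
    return mu - x_cont, np.exp(log_softmax(logits)) - onehot

# Toy row (hypothetical values): two continuous features, one 3-class feature.
mu = np.array([[0.3, -1.2]])          # predicted Gaussian means
logits = np.array([[0.5, -0.1, 1.0]])  # predicted categorical logits
x_cont = np.array([[1.0, 0.0]])       # observed continuous values
x_cat = np.array([2])                  # observed class index

loss = mixed_nll(mu, logits, x_cont, x_cat)
r_cont, r_cat = moment_residuals(mu, logits, x_cont, x_cat)
print(loss, r_cont, r_cat)
```

In this sketch the loss is minimized when the residuals vanish, that is, when the predicted mean matches the observed value and the predicted class probabilities match the one-hot target. In an actual flow matching setup the parameters `mu` and `logits` would be produced by a network conditioned on the time and the noisy sample; that part is omitted here.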