The one-hot vector has long been widely used in machine learning as a simple, generic way to represent discrete data. However, its dimensionality grows linearly with the number of categories to be represented, which is problematic for space complexity in deep learning, where large amounts of data are required. Recently, Analog Bits, a method that represents discrete data as a sequence of bits, was proposed, building on the high expressiveness of diffusion models. However, because the number of categories in a generation task is not necessarily a power of two, the range of values that Analog Bits can represent does not match the range of valid category values. When a value outside the valid range is generated, the original category value cannot be restored. To address this issue, we propose Residual Bit Vector (ResBit), a hierarchical bit representation. Although ResBit is a general-purpose representation method, in this paper we treat it as numerical data and show that it can be used as an extension of Analog Bits through Table Residual Bit Diffusion (TRBD), which is incorporated into TabDDPM, a tabular data generation method. We experimentally confirm that TRBD generates diverse, high-quality data faster than TabDDPM, on datasets ranging from small-scale tables to tables containing diverse category values. Furthermore, we show that ResBit can also serve as an alternative to the one-hot vector by using it for conditioning in GANs and as a label representation in image classification.
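The mismatch described above can be illustrated with a minimal sketch. It is not the ResBit or Analog Bits implementation; the helper names (`one_hot`, `to_bits`, `from_bits`) and the choice of five categories are assumptions made purely for illustration. The sketch compares the length of a one-hot vector with that of a plain binary code and lists the bit patterns that decode to no valid category.

```python
def one_hot(index, num_categories):
    """Return a one-hot list of length num_categories (hypothetical helper)."""
    vec = [0] * num_categories
    vec[index] = 1
    return vec


def to_bits(index, num_bits):
    """Encode index as a list of bits, most significant bit first."""
    return [(index >> shift) & 1 for shift in reversed(range(num_bits))]


def from_bits(bits):
    """Decode a list of bits (most significant bit first) back to an integer."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


# Assumed example: 5 categories, which is not a power of two.
num_categories = 5
# Smallest number of bits that can index every category.
num_bits = max(1, (num_categories - 1).bit_length())

# One-hot needs one dimension per category; the bit code needs only num_bits.
one_hot_dim = len(one_hot(3, num_categories))
bit_dim = len(to_bits(3, num_bits))

# Bit patterns a generator could emit that decode to no valid category.
invalid_codes = [
    code for code in range(2 ** num_bits) if code >= num_categories
]

print(one_hot_dim, bit_dim, invalid_codes)
```

With five categories, three bits are enough, but three bits span eight patterns, so three of them correspond to no category. This is the gap between the representable range and the valid category range that the abstract describes.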