The description complexity of a model is the length of the shortest formula that defines the model. We study the description complexity of unary structures in first-order logic (FO), and we relate it to semantic complexity in the form of entropy. The class of unary structures provides a simple way to represent tabular Boolean data sets as relational structures. We define structures with FO-formulas whose size is strictly linear in the size of the model, as opposed to the naive quadratic formulas, and we use arguments based on formula size games to obtain related lower bounds for description complexity. We also obtain a precise asymptotic result on the expected description complexity of a randomly selected structure. We then give bounds on the relationship between Shannon entropy and description complexity, and we extend this relationship to Boltzmann entropy by establishing an asymptotic match between the two entropies. Despite the simplicity of unary structures, our arguments require formula size games, Stirling's approximation and Chernoff bounds.
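To make the two entropy notions concrete, the following minimal Python sketch treats a Boolean table as a unary structure: each row is a domain element and each column a unary predicate. The definitions used here are assumptions for illustration, not taken from the text above: Shannon entropy is computed over the empirical distribution of row types, and Boltzmann entropy is taken as the base-2 logarithm of the number of ways to assign the same type counts to the domain (a multinomial coefficient). The function names `type_counts`, `shannon_entropy` and `boltzmann_entropy` are hypothetical.

```python
import math
from collections import Counter


def type_counts(rows):
    """Count how many domain elements (rows) realize each type (tuple of 0/1 values)."""
    return Counter(tuple(r) for r in rows)


def shannon_entropy(counts):
    """Shannon entropy in bits of the empirical type distribution (assumed definition)."""
    n = sum(counts.values())
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def boltzmann_entropy(counts):
    """log2 of the multinomial coefficient n! / (n_1! * ... * n_k!) (assumed definition)."""
    n = sum(counts.values())
    # lgamma(m + 1) equals ln(m!), which avoids computing huge factorials directly.
    ln_multinomial = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts.values())
    return ln_multinomial / math.log(2)


# A table with 4 rows and 2 Boolean columns, i.e. a unary structure with 2 predicates.
rows = [(1, 0), (1, 0), (0, 1), (1, 1)]
counts = type_counts(rows)
n = sum(counts.values())
print("Shannon entropy per element:", shannon_entropy(counts))
print("Boltzmann entropy:", boltzmann_entropy(counts))
print("n * Shannon entropy:", n * shannon_entropy(counts))
```

Under these assumed definitions, Stirling's approximation suggests that the Boltzmann entropy grows like n times the Shannon entropy when the type proportions are held fixed and n grows, which is one way to read the asymptotic match mentioned in the abstract.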