We consider the relationship between learnability of a ``base class'' of functions on a set X and learnability of a class of statistical functions derived from the base class. For example, we refine results showing that learnability of a family of functions implies learnability of the family of functions mapping a function in the class to its expectation under a distribution. We study both Probably Approximately Correct (PAC) learning, where example inputs and outputs are chosen at random, and online learning, where the examples are chosen adversarially. We establish improved bounds on the sample complexity of learning for statistical classes, stated in terms of combinatorial dimensions of the base class. We do this by adapting techniques introduced in model theory for ``randomizing a structure''. We give particular attention to classes derived from logical formulas, and relate learnability of the statistical classes to properties of the formula. Finally, we provide bounds on the complexity of learning the statistical classes built on top of a logic-based hypothesis class.