Cooking is a cultural expression of human creativity that transcends geography and time through the orchestration of ingredients and techniques, much like languages do through words and syntax. Yet, beneath the apparent diversity of culinary traditions, whether recipes obey statistical laws comparable to those of other symbolic systems remains unknown. Here we analyze a large corpus of traditional recipes spanning global cuisines, annotated using a state-of-the-art named entity recognition algorithm into ingredients, cooking techniques, utensils, and other culinary attributes. We find that ingredient usage exhibits Zipf-like rank-frequency scaling, that culinary diversity grows sublinearly with corpus size in accordance with Heaps' law, and that recipe complexity follows Menzerath-Altmann-type relations between the number and average information of constituent units. Consistent with observations in packaged foods, macronutrient concentrations across recipes also display a log-normal signature. Minimal generative models based on preferential reuse, constrained sampling, and incremental modification recapitulate these regularities, suggesting generic processes that shape recipe architecture across cultures. Together, these findings establish recipes as a compositional symbolic system in which complex structure emerges from simple, constrained generative processes.
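As a rough illustration of how preferential reuse under a simple constraint can give rise to a skewed rank-frequency curve and sublinear vocabulary growth, the sketch below simulates a toy Simon-type process. It is not the authors' model: the function names, the parameters `p_new` and `recipe_size`, and the rule that a recipe may not repeat an ingredient are all assumptions made for this example. In the usual notation, Zipf-like scaling means frequency falls off roughly as a power of rank, f(r) ∝ r^(-α), and Heaps' law means the number of distinct ingredients grows as V(N) ∝ N^β with β < 1.

```python
import random
from collections import Counter


def simulate_recipes(n_recipes=2000, recipe_size=8, p_new=0.05, seed=0):
    """Toy preferential-reuse model (hypothetical, for illustration only).

    Each recipe is filled one ingredient at a time. With probability p_new a
    brand-new ingredient is invented; otherwise an ingredient is copied from
    the history of past uses, so popular ingredients are reused more often.
    The constraint is that a recipe cannot contain the same ingredient twice.
    """
    rng = random.Random(seed)
    history = []   # one entry per past ingredient use (frequency-weighted pool)
    next_id = 0    # integer label for the next new ingredient
    recipes = []
    for _ in range(n_recipes):
        recipe = set()
        while len(recipe) < recipe_size:
            if not history or rng.random() < p_new:
                ingredient = next_id
                next_id += 1
            else:
                ingredient = rng.choice(history)
            recipe.add(ingredient)  # duplicates are silently rejected by the set
        history.extend(recipe)
        recipes.append(sorted(recipe))
    return recipes


def rank_frequency(recipes):
    """Ingredient usage counts sorted from most to least frequent (Zipf plot data)."""
    counts = Counter(ing for recipe in recipes for ing in recipe)
    return sorted(counts.values(), reverse=True)


def heaps_curve(recipes):
    """Number of distinct ingredients seen after each successive recipe."""
    seen = set()
    curve = []
    for recipe in recipes:
        seen.update(recipe)
        curve.append(len(seen))
    return curve


if __name__ == "__main__":
    recipes = simulate_recipes()
    freqs = rank_frequency(recipes)
    curve = heaps_curve(recipes)
    print("top-5 ingredient frequencies:", freqs[:5])
    print("distinct ingredients after all recipes:", curve[-1])
```

Plotting `rank_frequency` and `heaps_curve` on log-log axes would be the natural next step for checking whether the toy process produces approximately straight lines; the exponents it yields depend on the assumed parameters and should not be read as estimates for real cuisines.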