On the Origin of Synthetic Information by Means of Steganographic Inheritance

The origin of species has been the mystery of mysteries in natural science. By analogy, the origin of synthetic information, we suggest, is the mystery of mysteries in information science. The question carries a moral weight that a technical account can neither fully resolve nor responsibly ignore, as its impact on truth, trust, and human intellect extends deep into the broader economy and society. The very power of artificial intelligence makes the evolutionary lineage of synthetic information ever harder to trace, for a sufficiently capable model may generate offspring that bear little resemblance, at either the structural or the signal level, to the parent source from which they were derived. As in genetics, two individuals may share the same phenotype, mirroring each other in outward appearance, yet differ fundamentally in their genotype. We propose, by means of steganography, a mechanism analogous to heredity. At the moment an offspring is produced, a projector derives a trait from the parent, and a steganographic encoder invisibly hides it within the offspring. This trait persists throughout the offspring's life cycle in a cyber ecosystem. When parentage is queried, a steganographic decoder extracts the trait from the offspring and compares it against the traits of candidate parents in a reference pool, thereby nominating the most likely one. A theoretical analysis characterises phylogenetic accuracy as a function of projector and stegosystem properties, whilst empirical evaluations across multiple projectors and stegosystems demonstrate the viability of the proposed methodology under a broad spectrum of processing operations and semantic modifications. We envision a cyber ecosystem in which synthetic information, endowed with hidden yet traceable lineage traits, branches from a simple beginning into endless forms that have been, and are being, evolved.
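
The inheritance-and-query loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the method evaluated in this work: the projector is a truncated SHA-256 hash (a real projector would need to be robust to semantic change, which a cryptographic hash is not), the stegosystem is naive least-significant-bit embedding in a list of integer samples (which would not survive most processing operations), and the names `project`, `embed`, `extract`, and `nominate_parent` are hypothetical.

```python
import hashlib

TRAIT_BITS = 64  # assumed trait length; must not exceed 256 for SHA-256


def project(parent: bytes, n_bits: int = TRAIT_BITS) -> list:
    """Toy projector: derive an n_bits binary trait from the parent."""
    digest = hashlib.sha256(parent).digest()
    return [(digest[i // 8] >> (i % 8)) & 1 for i in range(n_bits)]


def embed(offspring: list, trait: list) -> list:
    """Toy stego encoder: write the trait into the LSBs of the first samples."""
    if len(offspring) < len(trait):
        raise ValueError("offspring too short to carry the trait")
    stego = list(offspring)
    for i, bit in enumerate(trait):
        stego[i] = (stego[i] & ~1) | bit  # clear the LSB, then set it to the trait bit
    return stego


def extract(stego: list, n_bits: int = TRAIT_BITS) -> list:
    """Toy stego decoder: read the trait back from the LSBs."""
    return [sample & 1 for sample in stego[:n_bits]]


def hamming(a: list, b: list) -> int:
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def nominate_parent(stego: list, pool: dict, n_bits: int = TRAIT_BITS) -> str:
    """Nominate the candidate whose projected trait is closest to the extracted one."""
    recovered = extract(stego, n_bits)
    return min(pool, key=lambda name: hamming(recovered, project(pool[name], n_bits)))


# Usage: a reference pool of candidate parents and one offspring derived from "B".
pool = {
    "A": b"first candidate parent",
    "B": b"second candidate parent",
    "C": b"third candidate parent",
}
offspring = list(range(100, 200))              # stand-in for generated content
stego = embed(offspring, project(pool["B"]))   # trait hidden at generation time

noisy = list(stego)
for i in (0, 7, 19):                           # simulate mild processing: flip three trait bits
    noisy[i] ^= 1

nominated_clean = nominate_parent(stego, pool)
nominated_noisy = nominate_parent(noisy, pool)
```

Nearest-trait matching under Hamming distance is used here so that a few corrupted trait bits need not change the nomination; how accuracy actually depends on the projector and the stegosystem is the subject of the theoretical analysis, not of this sketch.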
