Building on work by Alfonseca et al. (2021), we study the conditions necessary for it to be logically possible to prove that an arbitrary artificially intelligent machine will exhibit certain behavior. To do this, we develop a formalism similar to, but mathematically distinct from, the theory of formal languages and their properties. Our formalism affords a precise means not only of describing the traits we desire of machines (such as being intelligent, contained, moral, and so forth), but also of detailing the conditions necessary for it to be logically possible to decide whether a given arbitrary machine possesses such a trait. Contrary to Alfonseca et al.'s (2021) results, we find that Rice's theorem from computability theory cannot in general be used to determine whether an arbitrary machine possesses a given trait. Therefore, deciding whether an arbitrary machine is intelligent, contained, moral, and so forth is not necessarily logically impossible.