Prevailing methods for assessing and comparing generative AIs incentivize responses that serve a hypothetical representative individual. Evaluating models in these terms presumes homogeneous preferences across the population and leads to the selection of agglomerative AIs, which fail to represent the diverse interests of individuals. We propose an alternative evaluation method that instead prioritizes inclusive AIs, which provably retain the knowledge required both for subsequent customization of responses to particular segments of the population and for utility-maximizing decisions.