Generative artificial intelligence (GenAI) is increasingly being integrated into complex business workflows, fundamentally shifting the boundaries of managerial decision-making. However, the reliability of its strategic advice in ambiguous business contexts remains a critical knowledge gap. To address this gap, this study compares multiple GenAI models in their ability to detect ambiguity, examines whether a systematic ambiguity-resolution process improves response quality, and investigates their susceptibility to sycophantic behavior when confronted with flawed managerial directives. Using a novel four-dimensional business ambiguity taxonomy, we conducted a human-in-the-loop experiment across strategic, tactical, and operational scenarios. The resulting decisions were assessed through a human-validated automated evaluation framework based on agreement, actionability, justification quality, and constraint adherence. The results show that our approach not only distinguishes different types of ambiguity, but also reveals how ambiguity resolution systematically changes model behavior. In particular, resolving ambiguities improved decision quality across all managerial levels, with the strongest gains observed in constraint adherence. The analysis further showed that sycophantic behavior is not uniform across models: some models challenged flawed assumptions, whereas others tended to comply with them. This study contributes to the bounded rationality literature by positioning GenAI as a cognitive scaffold that can detect and resolve ambiguities managers might overlook, while demonstrating that its artificial limitations require human oversight to ensure its reliability as a strategic partner.