Generative Artificial Intelligence (GenAI) is reshaping higher education and raising pressing concerns about the integrity and validity of assessment. While assessment redesign is increasingly seen as a necessity, little of the literature details what such redesign may entail. In this paper, we introduce assessment twins as an accessible approach to redesigning assessment tasks to enhance validity. We use Messick's unified validity framework to systematically map the ways in which GenAI threatens content, structural, consequential, generalisability, and external validity. We then define assessment twins as two deliberately linked components that address the same learning outcomes through different modes of evidence, scheduled closely together to allow for cross-verification and assurance of learning. We argue that the twin approach helps mitigate validity threats by triangulating evidence across complementary formats, such as pairing essays with oral defences, group discussions, or practical demonstrations. We highlight several advantages: preservation of established assessment formats, reduced reliance on surveillance technologies, and flexible use across cohort sizes. To guide implementation, we propose a four-step design process: identifying vulnerabilities, aligning outcomes, selecting complementary tasks, and developing interdependent marking schemes. We also acknowledge the challenges, including resource intensity, equity concerns, and the need for empirical validation. Nonetheless, we contend that assessment twins represent a validity-focused response to GenAI that prioritises pedagogy while supporting meaningful student learning outcomes.