Keystroke Verification Challenge (KVC): Biometric and Fairness Benchmark Evaluation

Keystroke dynamics (KD) offer several advantages for biometric verification: they are among the most discriminative behavioral traits; keyboards are among the most common human-computer interfaces and the primary means of entering text; acquisition requires no additional hardware and processing is relatively lightweight; and subjects can be recognized transparently. However, the heterogeneity of experimental protocols and metrics, together with the limited size of the databases adopted in the literature, impedes direct comparison between systems and hinders progress in keystroke biometrics. To address this, we present a new experimental framework for benchmarking KD-based biometric verification performance and fairness. It is based on tweet-long sequences of variable transcript text from over 185,000 subjects, acquired through desktop and mobile keyboards and extracted from the Aalto Keystroke Databases. The framework runs on CodaLab in the form of the Keystroke Verification Challenge (KVC). We also introduce a novel fairness metric, the Skewed Impostor Ratio (SIR), to capture inter- and intra-demographic group bias patterns in the verification scores. We demonstrate the usefulness of the proposed framework by using two state-of-the-art keystroke verification systems, TypeNet and TypeFormer, to compare different sets of input features. Discarding the analysis of text content (the ASCII codes of the keys pressed) in favor of extended time-domain features yields a less privacy-invasive system, and our experiments show that this approach maintains satisfactory performance.
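To make the idea of content-free, time-domain input features concrete, the following is a minimal sketch that derives four timing features commonly used in keystroke dynamics (hold time, inter-key latency, press latency, release latency) from press and release timestamps alone. The `KeyEvent` type, the `timing_features` function, and the choice of these four features are illustrative assumptions, not the feature set defined by the KVC framework or by TypeNet and TypeFormer as evaluated here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class KeyEvent:
    """One keystroke, described by timing only (hypothetical type).

    The key code is deliberately left out, mirroring the idea of
    discarding text content from the input features.
    """
    press_ms: float
    release_ms: float


def timing_features(events: List[KeyEvent]) -> List[Dict[str, float]]:
    """Compute timing features for each pair of consecutive keystrokes."""
    features = []
    for cur, nxt in zip(events, events[1:]):
        features.append({
            # How long the current key was held down.
            "hold": cur.release_ms - cur.press_ms,
            # Gap between releasing this key and pressing the next one
            # (negative when the keystrokes overlap).
            "inter_key": nxt.press_ms - cur.release_ms,
            # Time between two consecutive presses.
            "press": nxt.press_ms - cur.press_ms,
            # Time between two consecutive releases.
            "release": nxt.release_ms - cur.release_ms,
        })
    return features


# Usage: three keystrokes with made-up timestamps in milliseconds.
sample = [KeyEvent(0, 80), KeyEvent(120, 190), KeyEvent(300, 360)]
sample_features = timing_features(sample)
```

A sequence of n keystrokes yields n - 1 feature vectors in this sketch, since every feature except the hold time is defined over a pair of consecutive keys.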
