This paper empirically examines the practical validity of the official evaluation criteria underpinning the Research Productivity (PQ) Grant framework, as governed by the Brazilian National Council for Scientific and Technological Development (CNPq). By operationalizing regulatory dimensions (including bibliographic output, human resource training, and scientific recognition) as measurable variables extracted from CVs and OpenAlex bibliometric data, we treat policy-defined indicators as testable hypotheses rather than a priori assumptions. Using a block-based adaptation of the Boruta feature selection algorithm across several machine learning classifiers, we evaluate the statistical contribution of each dimension in distinguishing grant levels, with a focus on identifying top-tier (Level 1A) researchers. Our models achieve high predictive performance, with mean AUC reaching 0.96, indicating that PQ levels carry a robust and structured statistical signal. However, explanatory power is heavily concentrated in a limited subset of features: bibliographic production, graduate-level supervision, and institutional management roles. Conversely, several criteria explicitly emphasized in the regulations show no detectable statistical contribution to classification outcomes. These findings reveal a potential misalignment between the formal regulatory framework and the signals that effectively drive evaluation outcomes, suggesting that the practical evaluative signal is substantially more compact than officially stated. They also provide evidence-based insights for refining research assessment policies and making them more transparent.
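The block-based selection idea mentioned above can be illustrated with a minimal sketch. The abstract does not specify how the Boruta adaptation works, so everything here is an assumption: features are grouped into named blocks, each block is shuffled as a unit to build a "shadow" copy, and a block is kept only if it beats the best shadow block in most rounds. The function names (`block_score`, `block_boruta`), the `accept` threshold, the toy mean-gap score (standing in for a classifier's feature importance), and the synthetic data are all hypothetical; standard Boruta uses random forest importances and a statistical test rather than a fixed hit fraction.

```python
import random
from statistics import mean

def block_score(rows, labels, cols):
    """Toy importance of a feature block: the largest absolute gap between
    class means over the block's columns (a stand-in for model importance)."""
    score = 0.0
    for c in cols:
        pos = [r[c] for r, y in zip(rows, labels) if y == 1]
        neg = [r[c] for r, y in zip(rows, labels) if y == 0]
        score = max(score, abs(mean(pos) - mean(neg)))
    return score

def block_boruta(rows, labels, blocks, n_rounds=50, accept=0.95, seed=0):
    """Return {block name: bool}; True if the block beats the best shadow
    block in at least `accept` of the rounds (hypothetical decision rule)."""
    rng = random.Random(seed)
    real = {name: block_score(rows, labels, cols) for name, cols in blocks.items()}
    hits = {name: 0 for name in blocks}
    for _ in range(n_rounds):
        shadow_best = 0.0
        for cols in blocks.values():
            # Shadow block: reorder whole rows so the block's columns move
            # together, breaking the link to the labels while keeping
            # within-block structure.
            order = list(range(len(rows)))
            rng.shuffle(order)
            shadow_rows = [rows[i] for i in order]
            shadow_best = max(shadow_best, block_score(shadow_rows, labels, cols))
        for name in blocks:
            if real[name] > shadow_best:
                hits[name] += 1
    return {name: hits[name] / n_rounds >= accept for name in blocks}

# Synthetic data only: column 0 tracks the label, columns 1 and 2 do not.
labels = [i % 2 for i in range(200)]
rows = [[2.0 * (i % 2) + 0.1 * ((i // 2) % 5), (i // 2) % 5, (i // 2) % 3]
        for i in range(200)]
blocks = {"informative_block": [0], "uninformative_block": [1, 2]}
selected = block_boruta(rows, labels, blocks)
```

On this toy data, `selected` marks only `informative_block` as retained. Shuffling a block as a unit, rather than column by column, is one plausible way to test a whole regulatory dimension at once; the paper's actual procedure may differ.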