Diagnostics for categorical response models based on quantile residuals and distance measures

Polytomous categorical data are frequent in studies and can be obtained with an individual or grouped structure. In both structures, the generalized logit model is commonly used to relate the covariates to the response variable. After fitting a model, one of the challenges is defining an appropriate residual and choosing diagnostic techniques. Since the polytomous response is multivariate, raw, Pearson, and deviance residuals are vectors whose asymptotic distribution is generally unknown, which makes graphical visualization and interpretation difficult. Defining appropriate residuals and choosing suitable diagnostic tools is therefore important, especially for nominal data, for which few methods are available. This paper proposes the use of randomized quantile residuals for individual and grouped nominal data, together with Euclidean and Mahalanobis distance measures as a way to reduce the dimension of the residuals. We carried out simulation studies for both data structures, using half-normal plots with simulation envelopes to assess model performance. These studies showed that the quantile residuals performed well, and that the distance measures made the graphical techniques easier to interpret. We illustrate the proposed procedures with two applications to real data.
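
As a rough illustration of the dimension-reduction idea, the sketch below collapses each vector of raw residuals from a nominal model into a single Euclidean or Mahalanobis distance. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the simulated probabilities, the choice to drop the last category, and the use of the sample covariance of the residuals are all hypothetical choices made for this example.

```python
import numpy as np


def raw_residuals(y_onehot, probs):
    """Raw residual vectors y - p, dropping the last category.

    Dropping one category is an assumption made here so that the
    residual covariance matrix is not singular (probabilities sum to 1).
    """
    return (y_onehot - probs)[:, :-1]


def euclidean_distance(res):
    """Euclidean norm of each residual vector (one scalar per row)."""
    return np.sqrt(np.sum(res ** 2, axis=1))


def mahalanobis_distance(res):
    """Mahalanobis distance of each residual vector from the sample mean,
    using the sample covariance of the residuals (an assumption)."""
    centered = res - res.mean(axis=0)
    cov = np.atleast_2d(np.cov(res, rowvar=False))
    cov_inv = np.linalg.inv(cov)
    d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return np.sqrt(d2)


# Simulated stand-in for fitted probabilities from a generalized logit model.
rng = np.random.default_rng(0)
n, J = 200, 3
logits = rng.normal(size=(n, J))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Draw one nominal response per individual and one-hot encode it.
labels = np.array([rng.choice(J, p=p) for p in probs])
y_onehot = np.eye(J)[labels]

res = raw_residuals(y_onehot, probs)
d_euc = euclidean_distance(res)
d_mah = mahalanobis_distance(res)
```

Each individual now has one non-negative number per distance measure, which could then be displayed in a half-normal plot with a simulation envelope instead of plotting the residual vectors component by component.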

