Two linearly uncorrelated binary variables must also be independent because non-linear dependence cannot manifest with only two possible states. This inherent linearity is the atom of dependency constituting any complex form of relationship. Inspired by this observation, we develop a framework called binary expansion linear effect (BELIEF) for understanding arbitrary relationships with a binary outcome. Models from the BELIEF framework are easily interpretable because they describe the association of binary variables in the language of linear models, yielding convenient theoretical insight and striking Gaussian parallels. With BELIEF, one may study generalized linear models (GLM) through transparent linear models, providing insight into how the choice of link affects modeling. For example, setting a GLM interaction coefficient to zero does not necessarily lead to the kind of no-interaction model assumption as understood under its linear model counterpart. Furthermore, for a binary response, maximum likelihood estimation for GLMs paradoxically fails under complete separation, when the data are most discriminative, whereas BELIEF estimation automatically reveals the perfect predictor in the data that is responsible for complete separation. We explore these phenomena and provide related theoretical results. We also provide a preliminary empirical demonstration of some theoretical results.
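
The opening claim can be checked numerically. The sketch below is only an illustration and is not part of the BELIEF framework; the helper names (`covariance`, `is_independent`, `binary_grid_pmfs`) and the grid denominator of 12 are assumptions made for this example. It enumerates joint distributions of two {0, 1}-valued variables on a rational grid and confirms that every zero-covariance case factorizes into its marginals, then contrasts this with a three-state variable, where zero covariance does not imply independence.

```python
from fractions import Fraction
from itertools import product


def covariance(pmf):
    """Cov(X, Y) for a joint pmf given as {(x, y): probability}."""
    ex = sum(p * x for (x, _), p in pmf.items())
    ey = sum(p * y for (_, y), p in pmf.items())
    exy = sum(p * x * y for (x, y), p in pmf.items())
    return exy - ex * ey


def is_independent(pmf):
    """True if every joint cell equals the product of its marginals."""
    xs = {x for x, _ in pmf}
    ys = {y for _, y in pmf}
    px = {x: sum(p for (a, _), p in pmf.items() if a == x) for x in xs}
    py = {y: sum(p for (_, b), p in pmf.items() if b == y) for y in ys}
    return all(pmf.get((x, y), 0) == px[x] * py[y] for x, y in product(xs, ys))


def binary_grid_pmfs(denominator):
    """Yield all joint pmfs on {0,1} x {0,1} with cell masses k/denominator."""
    for a, b, c in product(range(denominator + 1), repeat=3):
        d = denominator - a - b - c
        if d < 0:
            continue
        masses = [Fraction(k, denominator) for k in (a, b, c, d)]
        yield dict(zip([(0, 0), (0, 1), (1, 0), (1, 1)], masses))


# Binary case: every uncorrelated pmf on the grid is also independent.
uncorrelated = [p for p in binary_grid_pmfs(12) if covariance(p) == 0]
binary_claim_holds = all(is_independent(p) for p in uncorrelated)

# Three-state contrast: X uniform on {-1, 0, 1} and Y = X**2 are
# uncorrelated, yet Y is a deterministic function of X.
third = Fraction(1, 3)
ternary = {(-1, 1): third, (0, 0): third, (1, 1): third}
ternary_uncorrelated = covariance(ternary) == 0
ternary_independent = is_independent(ternary)
```

Exact `Fraction` arithmetic is used so that the zero-covariance test is an equality check rather than a floating-point tolerance. The grid search is a finite check, not a proof; the general argument is that with two states, Cov(X, Y) = P(X=1, Y=1) − P(X=1)P(Y=1), so zero covariance fixes one cell of the 2×2 table and the marginals then determine the other three.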