We propose two novel approaches to a critical problem in reach measurement across multiple media: how to estimate the reach of an unobserved subset of buying groups (BGs) from the observed reach of other subsets of BGs. The first approach is model-free and the second is model-based. The model-free approach provides a coarse estimate for the reach of any subset by exploiting the consistency that must hold among the reach of different subsets. These consistency constraints are encoded as a linear program, whose solution yields an upper and a lower bound on the reach of any subset. The model-based approach provides a point estimate for the reach of any subset. Its key idea is to exploit the conditional independence model: the groups of the model are created by assuming that each BG has either a high or a low reach probability in a given group, and the weight of each group is determined by solving a non-negative least squares (NNLS) problem. We also provide a framework that gives both confidence intervals and point estimates by integrating the two approaches with training point selection and parameter fine-tuning through cross-validation. Finally, we evaluate the two approaches through experiments on synthetic data.
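To make the two ideas concrete, the following is a minimal sketch in Python, not the implementation evaluated in the paper. It assumes three BGs, observes the reach of every single BG and every pair, and estimates the reach of all three together. Several modelling choices are assumptions made only for illustration: reach consistency is encoded as the existence of a probability distribution over exposure patterns, all BGs share one low and one high reach probability (`p_low`, `p_high`), and the toy distribution `q` is invented. The helper names (`reach_row`, `lp_bounds`, `ci_reach`, `fit_weights`, `predict`) are hypothetical.

```python
import itertools
import numpy as np
from scipy.optimize import linprog, nnls

N_BG = 3  # number of buying groups (illustrative)
# Every non-empty subset of BGs, encoded as a tuple of BG indices.
SUBSETS = [s for r in range(1, N_BG + 1)
           for s in itertools.combinations(range(N_BG), r)]
# Exposure patterns: which BGs reached a person (all 2^n patterns).
PATTERNS = list(itertools.product([0, 1], repeat=N_BG))


def reach_row(subset):
    """1.0 for each exposure pattern that hits at least one BG in `subset`."""
    return np.array([float(any(p[i] for i in subset)) for p in PATTERNS])


def lp_bounds(observed, target):
    """Model-free bounds: min and max reach of `target` over all pattern
    distributions that reproduce the observed reach values."""
    A_eq = np.vstack([reach_row(s) for s in observed]
                     + [np.ones(len(PATTERNS))])   # last row: masses sum to 1
    b_eq = np.array(list(observed.values()) + [1.0])
    c = reach_row(target)
    lo = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
    hi = linprog(-c, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
    return lo.fun, -hi.fun


def ci_reach(probs, subset):
    """Reach of `subset` within one group when BGs reach people
    independently: 1 - prod_i (1 - p_i)."""
    return 1.0 - np.prod([1.0 - probs[i] for i in subset])


def fit_weights(observed, p_low, p_high):
    """Build 2^n groups (each BG low or high) and fit group weights by NNLS.
    Assumption: the weights are not forced to sum to one in this sketch."""
    groups = [[p_high if bit else p_low for bit in g] for g in PATTERNS]
    A = np.array([[ci_reach(g, s) for g in groups] for s in observed])
    w, _ = nnls(A, np.array(list(observed.values())))
    return groups, w


def predict(groups, w, subset):
    """Point estimate: weighted sum of per-group reach."""
    return float(sum(wi * ci_reach(g, subset) for wi, g in zip(w, groups)))


# Toy ground truth (invented): a distribution over the 8 exposure patterns.
q = np.array([0.40, 0.10, 0.10, 0.05, 0.15, 0.05, 0.05, 0.10])
observed = {s: float(reach_row(s) @ q) for s in SUBSETS if len(s) < N_BG}
target = tuple(range(N_BG))
truth = float(reach_row(target) @ q)

lower, upper = lp_bounds(observed, target)
groups, weights = fit_weights(observed, p_low=0.05, p_high=0.6)
estimate = predict(groups, weights, target)

print(f"true reach      : {truth:.3f}")
print(f"LP bounds       : [{lower:.3f}, {upper:.3f}]")
print(f"NNLS estimate   : {estimate:.3f}")
```

In this sketch the LP interval always contains the true reach, because the true pattern distribution is itself feasible. The NNLS point estimate carries no such guarantee, which is one reason a framework that combines the two outputs and tunes parameters such as `p_low` and `p_high` by cross-validation is useful.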