Tukey's boxplot is widely used for outlier detection; however, its classic fixed-fence rule tends to flag an excessive number of outliers as the sample size grows. To address this, we introduce two new R packages, ChauBoxplot and AdaptiveBoxplot, which implement more robust and statistically principled outlier detection methods. We illustrate their advantages and practical implications through comprehensive simulation studies and a real-world analysis of provincial university admission rates from China's National College Entrance Examination. Based on these findings, we provide practical guidance to help practitioners select appropriate boxplot methods, achieving a balance between interpretability and statistical reliability.
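The sample-size effect described above can be illustrated with a minimal simulation sketch. It is not the method implemented in ChauBoxplot or AdaptiveBoxplot, and it is written in Python rather than R purely for illustration. It assumes standard normal data, the conventional fence multiplier k = 1.5, and hypothetical helper names (`tukey_fences`, `count_flagged`, `mean_flagged`). Because roughly 0.7% of a normal population lies outside the classic fences, the expected number of flagged points grows roughly in proportion to the sample size, even when no true outliers are present.

```python
import random
import statistics


def tukey_fences(values, k=1.5):
    """Return the classic fixed fences (Q1 - k*IQR, Q3 + k*IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def count_flagged(values, k=1.5):
    """Count the observations that fall outside the fixed fences."""
    lower, upper = tukey_fences(values, k)
    return sum(1 for v in values if v < lower or v > upper)


def mean_flagged(n, reps=200, seed=0):
    """Average number of flagged points over `reps` clean normal samples of size n."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    total = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += count_flagged(sample)
    return total / reps


if __name__ == "__main__":
    # Assumed sample sizes, chosen only to show the trend.
    for n in (50, 500, 5000):
        print(n, mean_flagged(n))
```

Running the script prints the average count of flagged observations for each sample size. Since the simulated data contain no contamination, every flagged point is a false alarm, which is the behavior that motivates fences that adapt to the sample size.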