Strong experimental papers in electrical and computer engineering and computer science (ECE/CS), especially in systems, networking, and applied machine learning, rest on more than a single impressive number. They rest on a chain of design, measurement, analysis, and validation choices that, taken together, make a result believable. This tutorial is a compact, example-driven guide to that chain for beginning researchers. We organize it as an evaluation workflow: claim, hypothesis, unit of analysis, baseline, regime sweep, uncertainty estimate, validation check, and reporting. Within that workflow we cover the classical statistical foundations (descriptive statistics, the central limit theorem, normal- and $t$-based confidence intervals, Student's $t$-test, ANOVA, the chi-squared test, Pearson correlation, and linear regression) alongside the modern, distribution-free techniques (the bootstrap, Wilcoxon and Mann--Whitney tests, Cliff's delta) that are usually preferred for ECE/CS data. We also discuss factorial design, randomization and blocking, multiple-comparison correction, latency-specific pitfalls, simulation verification and validation, equivalence-style claims, and reproducibility. A running example, a comparison of two job-scheduling algorithms on simulated workloads with truncated heavy-tailed job sizes, threads through the tutorial, with Python snippets the reader can paste and adapt. The paper closes with a pre-submission checklist; companion student-facing material (project-type translation tables, an evaluation-plan worksheet, exercises, and a worked ``bad evaluation autopsy'') is collected in a separate workbook released alongside this paper.