Resource sharing in multi-tenant cloud environments enables cost efficiency but introduces the Noisy Neighbor problem, i.e., co-located workloads that unpredictably degrade each other's performance. Despite extensive research on detecting such effects, there are no explainable methodologies for quantifying the severity of impact and establishing causal relationships among tenants. We propose an analytical that combines controlled experimentation with multi-stage causal inference and validates it across 10 independent rounds in a Kubernetes testbed. Our methodology not only quantifies severe performance degradations (e.g., up to 67\% in I/O-bound workloads under combined stress) but also statistically establishes causality through Granger causality analysis, revealing a 75\% increase in causal links when the noisy neighbor activates. Furthermore, we identify unique "degradation signatures" for each resource contention vector (i.e., CPU, memory, disk, network), enabling diagnostic capabilities that go beyond anomaly detection. This work transforms the Noisy Neighbor from an elusive problem into a quantifiable, diagnosable phenomenon, providing cloud operators with actionable insights for SLA management and smart resource allocation.
翻译:暂无翻译