OXN -- Automated Observability Assessments for Cloud-Native Applications

Observability is important for ensuring the reliability of microservice applications. These applications are often prone to failures because they consist of many independent services deployed on heterogeneous environments. When employed "correctly", observability can help developers identify and troubleshoot faults quickly. However, instrumenting and configuring the observability of a microservice application is not trivial: it is tool-dependent and incurs costs. Practitioners need to understand observability-related trade-offs in order to weigh different observability design alternatives against each other. Still, these architectural design decisions are not supported by systematic methods and typically rely only on "professional intuition". To assess observability design trade-offs with concrete evidence, we advocate conducting experiments that compare design alternatives. Achieving a systematic and repeatable experiment process requires automation. We present a proof-of-concept implementation of an experiment tool, the Observability eXperiment eNgine (OXN). Similar to Chaos Engineering tools, OXN can inject arbitrary faults into an application, but it can also modify the observability configuration, which allows for the straightforward assessment of design decisions that were previously left unexplored.
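The experiment idea described above (inject a fault, vary the observability configuration, compare what becomes visible and at what cost) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not OXN's actual interface: `ObservabilityConfig`, `simulate_latencies`, `observe`, and `run_experiment` are hypothetical names, the "application" is a synthetic latency generator, the "fault" is a fixed added delay, and trace sampling rate stands in for an observability design alternative.

```python
import random
import statistics
from dataclasses import dataclass


@dataclass(frozen=True)
class ObservabilityConfig:
    """Hypothetical design alternative: what fraction of requests is traced."""
    name: str
    sampling_rate: float  # between 0.0 and 1.0


def simulate_latencies(n, fault_delay_ms, rng):
    """Synthetic request latencies (ms); stands in for a real system under test."""
    return [rng.gauss(100.0, 5.0) + fault_delay_ms for _ in range(n)]


def observe(latencies, config, rng):
    """Keep only the requests that the sampling configuration would record."""
    return [x for x in latencies if rng.random() < config.sampling_rate]


def run_experiment(config, fault_delay_ms=50.0, n=2000, seed=0):
    """Run a baseline phase and a fault phase under one configuration."""
    rng = random.Random(seed)  # fixed seed keeps the experiment repeatable
    baseline = observe(simulate_latencies(n, 0.0, rng), config, rng)
    faulty = observe(simulate_latencies(n, fault_delay_ms, rng), config, rng)
    return {
        "config": config.name,
        # Number of recorded data points: a crude proxy for observability cost.
        "samples": len(baseline) + len(faulty),
        # How strongly the injected fault shows up in the recorded data.
        "observed_shift_ms": statistics.mean(faulty) - statistics.mean(baseline),
    }


if __name__ == "__main__":
    for cfg in (ObservabilityConfig("full", 1.0), ObservabilityConfig("sampled", 0.1)):
        print(run_experiment(cfg))
```

Running the same fault against several configurations yields comparable rows, so the trade-off between data volume and fault visibility is judged from measurements rather than intuition.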

