Fail-operational systems are a prerequisite for autonomous driving. Without a driver who can act as a fallback in a critical failure scenario, the system has to mitigate failures on its own and keep critical applications operational. To reduce the cost of redundancy, graceful degradation can be applied by repurposing hardware resources at run-time: critical applications are kept operational by starting passive backups and shutting down non-critical applications to free sufficient resources. To design such systems efficiently, the effects of degradation on reliability and cost savings have to be analyzed. In this paper, we present our approach to formally analyze the impact of graceful degradation on the reliability of critical and non-critical applications. We then quantify this impact in distributed automotive systems and compare the achieved cost reduction with conventional redundancy approaches. In our experiments, our graceful degradation approach reduced redundancy overhead by 80% compared to active redundancy in a scenario with a balanced mix of critical and non-critical applications. Overall, we present a detailed reliability and cost analysis of graceful degradation in distributed automotive systems. Our findings confirm that graceful degradation can substantially reduce cost compared to conventional redundancy approaches, with no negative impact on the redundancy of critical applications, provided that a reduction in the reliability of non-critical applications can be accepted. Our results show that a trade-off has to be made between cost reduction and the impact of degradation on the reliability of non-critical applications.