Memory as a Wasting Asset: Pricing Flash Endurance for Embodied Agents, and the Limits of Doing So

A robot's flash endurance is a non-renewable stock: every persisted write spends one of a few thousand program/erase cycles and never refills, yet no fielded robot memory system prices which memories are worth an erase cycle. We treat embodied memory as depreciating capital and price that stock with a single endurance shadow price $η$, which makes cost-minimizing placement across a RAM / on-board NVM / cloud hierarchy a threshold in a wear-augmented per-byte index. The index is cost-optimal whatever the sign of the value-write association $χ$; only when $χ > 0$ does the optimum turn non-monotone, sending a robot's most valuable memories off its flash. The pivot is thus empirical, and we measure $χ$ on real robot logs at a pre-specified gate: its sign is a property of the deployment regime, being positive on recurrent long-horizon manipulation ($\hat{χ} \approx +1.0 \times 10^{-3}$, replicated at full power), null on a shorter-horizon suite, and negative on non-recurrent teleoperation. Two boundaries scope the result. The endurance budget is dormant on premium 3,000-P/E TLC at datasheet prices and binding on the commodity QLC/eMMC ($\sim$1,000 P/E) that cheaper edge robots run. And where it binds, a learned wear-aware controller only ties price-based routing on task value, because realized value is tier-invariant across RAM, NVM, and cloud: the rent governs device lifetime and cost, not task performance. Whether wear-aware placement improves task value remains open: $χ$ is measured against a value proxy, and the non-monotone optimum, while proven, is not yet observed in data.
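
To make the threshold reading of the placement rule concrete, the following is a minimal sketch and not the paper's index. It assumes a linear per-byte cost in which only the on-board NVM tier pays the endurance rent, `eta` per expected rewrite. The `Tier` class, the tier prices, and the function names are all hypothetical, and the sketch does not model memory value or $χ$.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    price_per_byte: float  # hypothetical per-byte holding/transfer cost
    wears_flash: bool      # True only for the on-board NVM tier


def wear_augmented_cost(tier: Tier, writes_per_byte: float, eta: float) -> float:
    """Plain per-byte price plus an endurance rent of eta per expected rewrite.

    Assumption: the rent is linear in rewrites and charged only on flash.
    """
    rent = eta * writes_per_byte if tier.wears_flash else 0.0
    return tier.price_per_byte + rent


def place(tiers: list[Tier], writes_per_byte: float, eta: float) -> Tier:
    """Route a memory to the tier with the lowest wear-augmented cost."""
    return min(tiers, key=lambda t: wear_augmented_cost(t, writes_per_byte, eta))


# Illustrative prices only; they are not taken from any datasheet.
TIERS = [
    Tier("ram", 5.0, wears_flash=False),
    Tier("nvm", 1.0, wears_flash=True),
    Tier("cloud", 3.0, wears_flash=False),
]
```

Under these assumed prices, a memory stays on NVM while `eta * writes_per_byte` is below the 2.0 price gap to cloud and moves off flash once it exceeds that gap. Setting `eta = 0` corresponds to a dormant endurance budget, where write-heavy memories remain on flash.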
