Interpreting Anticipatory Deep Reinforcement Learning for Proactive Mobile Network Control
Date
2026-05

Abstract
Deep Reinforcement Learning (DRL) is widely used
for adaptive control in mobile networks, yet most agents remain
reactive. This limitation is particularly problematic for exogenous
Key Performance Indicators (KPIs), whose dynamics evolve
independently and cannot be directly controlled by the agent's
actions. Anticipatory DRL addresses this issue by augmenting the
state with short-horizon KPI forecasts, but it remains unclear
whether such information truly influences decisions. We use SIA,
a symbolic interpretability tool, to explain whether and how
anticipatory information is actually exploited by the policy,
enabling principled redesign of forecast inputs and performance
improvements. Using policy graphs and Mutual Information (MI)
over symbolic temporal features, SIA distinguishes proactive
from reactive behaviors. For a standard Pensieve ABR agent
augmented with throughput forecasts, experiments on real-world
5G traces show a 3% average reward improvement, with
anticipatory policies spending more time at high bitrates while
reducing unnecessary oscillations.
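The abstract does not detail how SIA computes MI over symbolic features. As a minimal sketch of the underlying idea (not SIA's actual implementation), one can estimate the empirical mutual information between a discretized forecast symbol and the policy's chosen action: a proactive policy whose actions track the forecast yields high MI, while a reactive policy that ignores it yields MI near zero. The feature and action labels below are illustrative placeholders.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in bits between two
    aligned sequences of categorical symbols."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of the feature symbols
    py = Counter(ys)            # marginal counts of the actions
    pxy = Counter(zip(xs, ys))  # joint counts of (feature, action) pairs
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        # I(X;Y) = sum p(x,y) * log2( p(x,y) / (p(x) p(y)) )
        mi += p_joint * log2(p_joint * n * n / (px[x] * py[y]))
    return mi

# Toy traces: the forecast symbol perfectly predicts the bitrate choice,
# so MI equals the feature entropy (1 bit for a balanced binary symbol).
forecast = ["low", "low", "high", "high", "low", "high"]
action   = ["360p", "360p", "1080p", "1080p", "360p", "1080p"]
print(mutual_information(forecast, action))  # → 1.0
```

A near-zero MI for the forecast feature would indicate the anticipatory input is being ignored, motivating the redesign of forecast inputs mentioned above.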


