Henna: hierarchical machine learning inference in programmable switches
Date
2022-12-09
Abstract
The recent proliferation of programmable network equipment has opened up new possibilities for embedding intelligence into the data plane. Deploying models directly in the data plane promises high-throughput, low-latency inference that cannot be attained with traditional closed loops involving control-plane operations. Recent efforts have paved the way for integrating trained machine learning models into resource-constrained programmable switches, yet current solutions have significant limitations that become performance barriers in complex inference tasks. In this paper, we present Henna, a first in-switch implementation of a hierarchical classification system. The concept underpinning our solution is to split a difficult classification task into easier cascaded decisions, each of which can then be addressed with a separate, resource-efficient tree-based classifier. We propose a design of Henna that aligns with the internal organization of the Protocol Independent Switch Architecture (PISA) and integrates state-of-the-art strategies for mapping decision trees to switch hardware. We then implement Henna on a real testbed with off-the-shelf Intel Tofino programmable switches using the P4 language. Experiments with a complex 21-category classification task based on measurement data demonstrate that Henna improves the F1 score of an advanced single-stage model by 21%, while keeping usage of switch resources at 8% on average.
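The cascaded-decision idea can be illustrated with a minimal Python sketch. This is not Henna's P4 implementation or its trained models: the two features, the thresholds, the class names, and the functions `coarse_tree`, `bulk_tree`, `interactive_tree`, and `classify` are all hypothetical, chosen only to show how a first small tree selects a coarse group and a second small tree, specific to that group, makes the final decision.

```python
from typing import Callable, Dict, Sequence

# Hypothetical feature vector: [mean packet size (bytes), mean inter-arrival time (ms)]
Features = Sequence[float]
Tree = Callable[[Features], str]


def coarse_tree(x: Features) -> str:
    """Stage 1: assign the flow to a coarse group with one threshold test."""
    return "bulk" if x[0] > 800 else "interactive"


def bulk_tree(x: Features) -> str:
    """Stage 2 tree used only for flows that stage 1 labelled 'bulk'."""
    return "video" if x[1] < 5.0 else "file_transfer"


def interactive_tree(x: Features) -> str:
    """Stage 2 tree used only for flows that stage 1 labelled 'interactive'."""
    return "voice" if x[1] < 25.0 else "web"


# One small second-stage tree per coarse group (all illustrative).
STAGE2: Dict[str, Tree] = {
    "bulk": bulk_tree,
    "interactive": interactive_tree,
}


def classify(x: Features) -> str:
    """Cascade: the stage-1 output selects which stage-2 tree decides the class."""
    group = coarse_tree(x)
    return STAGE2[group](x)


if __name__ == "__main__":
    for sample in ([1200, 2.0], [1200, 50.0], [100, 10.0], [100, 100.0]):
        print(sample, "->", classify(sample))
```

Each tree in the sketch only has to separate a few classes, which is the property the abstract relies on: several small trees can be cheaper to fit into constrained hardware than one large tree covering every class at once.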