Separating Wheat from Chaff: Winnowing Unintended Prefixes using Machine Learning
Date
2014-04-27
Abstract
In this paper, we propose the use of prefix visibility at the interdomain level as an early symptom of anomalous events in the Internet. We focus on detecting anomalies that, despite their significant impact on the routing system, remain concealed from state-of-the-art tools. We design a machine learning system to winnow the prefixes with unintended limited visibility, which are symptomatic of anomalous events, from the prefixes with intended limited visibility, which result from legitimate routing operations.
We train a winnowing algorithm with ground-truth data on 20,000 operational limited visibility prefixes (LVPs) already classified by the operators of the origin networks. The ground truth was collected using the BGP Visibility Scanner, a tool we developed to provide operators with a multi-angle view of the efficacy of their routing policies. We build a dataset with the pre-classified prefixes and the features describing their visibility status dynamics. We further use this dataset to derive a boosted decision tree which winnows unintended LVPs with an accuracy of 95%.