Performance Evaluation and Anomaly detection in Mobile BroadBand Across Europe
Author(s)
Moulay, Mohamed
Supervisor(s)/Director(s)
Mancuso, Vincenzo
Date
2022-07-20
Abstract
With the rapidly growing market for smartphones and users' expectation of immediate access to high-quality multimedia content, the delivery of video over wireless networks has become a major challenge, and it is difficult to provide end users with a flawless quality of service. The growth of the smartphone market goes hand in hand with the development of the Internet, in which current transport protocols are being re-evaluated to deal with traffic growth. QUIC and WebRTC are two new and evolving standards. WebRTC was explicitly developed to meet this demand and enable a high-quality experience for mobile users of real-time communication services. QUIC was designed to reduce Web latency, integrate security features, and allow a high-quality experience for mobile users. Evaluating the performance of these rising protocols in an uncontrolled, real-world environment is therefore essential to understand the behavior of the network and provide the end user with a better multimedia delivery service. Since most of the work in the research community is conducted in controlled environments, we leverage the MONROE platform to investigate the performance of QUIC and WebRTC in real cellular networks using static and mobile nodes. In this Thesis, we conduct measurements of WebRTC and QUIC and make the resulting data sets publicly available to interested experimenters. Such data sets are welcomed by the research community, since they open the door to applying data science to network data. The development part of the experiments involves building Docker containers that act as QUIC and WebRTC clients. These containers are publicly available and can be used on their own or within the MONROE platform. These key contributions are presented in Chapters 4 and 5, in Part II of the Thesis.
We exploit the data collected with MONROE to apply data science to network data sets, which helps identify networking problems and shifts the focus of the Thesis from performance evaluation to a data science problem.
Indeed, the second part of the Thesis focuses on interpretable data science. Identifying network problems by leveraging Machine Learning (ML) has gained much visibility in the past few years, resulting in dramatically improved cellular network services. However, critical tasks such as troubleshooting cellular networks are still performed manually by experts who monitor the network around the clock.
In this context, this Thesis contributes by proposing the use of simple interpretable ML algorithms, moving away from the current trend of high-accuracy ML algorithms (e.g., deep learning) that do not allow interpretation (and hence understanding) of their outcome. We accept lower accuracy because we consider the scenarios misclassified by the ML algorithms to be the interesting (anomalous) ones, and we do not want to miss them by overfitting. To this aim, we design TTrees (from Troubleshooting Trees), a practical and interpretable ML software tool that implements an unsupervised methodology we have designed to automate the identification of the causes of performance anomalies in a cellular network, and we compare it to a supervised counterpart named STrees (from Supervised Trees). Both methodologies require small volumes of data and are quick to train. Our experiments using real data from operational commercial mobile networks (e.g., sampled with MONROE probes) show that STrees and TTrees can automatically identify and accurately classify network anomalies (e.g., cases in which low network performance is not justified by operational conditions) when trained with just a few hundred data samples, hence enabling precise troubleshooting actions. Most importantly, our experiments show that a fully automated unsupervised approach is viable and efficient. These contributions are presented in Part III of the Thesis, which includes Chapters 6 and 7.
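The idea of treating misclassified samples as anomaly candidates can be illustrated with a minimal sketch. It is not the TTrees or STrees implementation: it fits a one-level decision tree (a stump) in plain Python, it is closer in spirit to the supervised variant, and the data, the feature name `rsrp_dbm`, and the labels `low`/`high` are all made up for illustration.

```python
from collections import Counter


def majority(labels):
    """Return the most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(xs, ys):
    """Fit a depth-1 decision tree on a single numeric feature.

    Returns (threshold, left_label, right_label): samples with
    x <= threshold are predicted left_label, the others right_label.
    """
    best = None
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        l_lab, r_lab = majority(left), majority(right)
        errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
        if best is None or errors < best[0]:
            best = (errors, thr, l_lab, r_lab)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def predict(stump, x):
    """Apply the single human-readable rule encoded by the stump."""
    thr, left_label, right_label = stump
    return left_label if x <= thr else right_label


def flag_anomalies(stump, xs, ys):
    """Indices of samples whose observed label contradicts the rule."""
    return [i for i, (x, y) in enumerate(zip(xs, ys)) if predict(stump, x) != y]


# Hypothetical samples: signal strength (dBm) and observed throughput class.
rsrp_dbm = [-115, -112, -108, -105, -95, -90, -85, -80, -78, -75]
throughput = ["low", "low", "low", "low", "high",
              "high", "low", "high", "high", "high"]

stump = fit_stump(rsrp_dbm, throughput)
anomalies = flag_anomalies(stump, rsrp_dbm, throughput)
print("rule: throughput is %s if rsrp_dbm <= %.1f else %s"
      % (stump[1], stump[0], stump[2]))
print("anomaly candidates (sample indices):", anomalies)
```

In this toy data the only flagged sample has low throughput despite good signal strength, which is the kind of case the abstract describes as low performance not justified by operational conditions. A deeper or more flexible model could fit that sample and hide it, which is the motivation given above for preferring simple trees.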
In conclusion, this Thesis goes through a data-driven networking roller coaster, from evaluating the performance of upcoming network protocols in real mobile networks to building methodologies that help identify and classify the root causes of networking problems, with emphasis on the fact that these methodologies are easy to implement and can be deployed in production environments.