The Impact of Flow Feature Extraction Techniques in Machine Learning-based Traffic Classification

Dr. Pekár Adrián

Traffic classification has always played a strategic role in understanding and managing computer networks. Network engineering has gradually integrated a wide variety of techniques into traffic classification, including machine learning. In general, the efficacy of a machine learning classifier depends on its input data, creating a growing demand for high-quality labeled datasets whose volume, representativeness, and heterogeneity are sufficient to achieve the desired performance.

Network traffic flow characteristics such as duration, size, and rate have been studied frequently, and their usefulness in traffic classification has been shown to be significant. However, the evolution of computer networks has also yielded other flow feature extraction techniques, whose usefulness in network traffic flow classification has received little investigation to date.
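
To make the classical flow characteristics concrete, the following minimal sketch shows one way duration, size, and rate could be derived from a list of packets. It is an illustration under stated assumptions, not a prescribed method: the packet tuple layout, the function name `flow_features`, and the sample packets are hypothetical, flows are treated as unidirectional 5-tuples, and no idle or active timeouts are applied.

```python
from collections import defaultdict


def flow_features(packets):
    """Group packets into unidirectional flows and compute basic features.

    Assumed (hypothetical) packet layout:
    (timestamp_s, src_ip, dst_ip, src_port, dst_port, protocol, length_bytes)
    """
    flows = defaultdict(list)
    for ts, src, dst, sport, dport, proto, length in packets:
        # The 5-tuple identifies the flow; no timeout-based splitting here.
        flows[(src, dst, sport, dport, proto)].append((ts, length))

    features = {}
    for key, pkts in flows.items():
        times = [ts for ts, _ in pkts]
        duration = max(times) - min(times)        # seconds
        size = sum(length for _, length in pkts)  # bytes
        # A single-packet flow has zero duration; report a rate of 0.0.
        rate = size / duration if duration > 0 else 0.0
        features[key] = {
            "packets": len(pkts),
            "duration_s": duration,
            "size_bytes": size,
            "rate_Bps": rate,
        }
    return features


# Made-up sample packets, for illustration only.
packets = [
    (0.0, "10.0.0.1", "10.0.0.2", 40000, 443, "TCP", 100),
    (0.5, "10.0.0.1", "10.0.0.2", 40000, 443, "TCP", 300),
    (2.0, "10.0.0.1", "10.0.0.2", 40000, 443, "TCP", 600),
    (1.0, "10.0.0.3", "10.0.0.2", 50000, 53, "UDP", 80),
]

for key, feats in flow_features(packets).items():
    print(key, feats)
```

Each resulting feature dictionary corresponds to one row of the input a classifier would receive; other extraction techniques would replace or extend these columns.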

The task of the student is to examine the efficacy of various flow feature extraction techniques for machine learning-based network traffic classification. In the course of the work, the student will learn about network traffic measurement techniques, how to analyze and use the measured data, and the most popular methods for machine learning-based classification. Building on this knowledge, the student will provide a comparative analysis of classifiers trained on features produced by different flow feature extraction techniques.