Capabilities and limitations of attack classification in encrypted traffic by machine learning methods

M. Bondarev; O. Guzev

Abstract

This paper investigates the capabilities and limitations of machine learning methods for classifying computer attacks in encrypted network traffic. The topic is relevant because the widespread adoption of encryption undermines traditional traffic analysis methods. The article proposes a comprehensive data model that combines statistical flow characteristics, packet sequences, and payload byte distributions. Five machine learning algorithms (Random Forest, Extra Trees, Decision Tree, AdaBoost, and KNN) were compared on three public datasets: CIC-IDS2017, TON_IoT, and USTC-TFC2016. Random Forest and Extra Trees produced the best results, with Extra Trees showing slightly higher stability. Hyperparameter optimization of the Extra Trees model confirmed its effectiveness when training and testing are carried out within a single dataset. A key limitation of the proposed approach is the model's low generalization capability: cross-dataset testing revealed a significant decrease in classification performance, indicating that the results depend on the specific network environment. The methods applied to improve generalization (feature filtering, a neural network implementation, and domain adaptation) did not yield a sufficiently robust or universally applicable solution. The study concludes that machine learning models can effectively detect and classify computer attacks within a single network environment, but the portability of trained models between network environments remains limited and requires further research.
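The evaluation protocol summarized in the abstract (comparing several classifiers in-domain, then testing the same fitted models on data from another environment) can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic (scikit-learn's `make_classification`), the "second environment" is simulated by an assumed per-feature rescaling and offset, the function name `evaluate` is hypothetical, and all models use default hyperparameters.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    ExtraTreesClassifier,
    RandomForestClassifier,
)
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def evaluate(seed=0):
    """Return {model name: (in-domain macro-F1, cross-domain macro-F1)}."""
    rng = np.random.default_rng(seed)

    # Synthetic stand-in for flow features; NOT one of the paper's datasets.
    X, y = make_classification(
        n_samples=1200, n_features=20, n_informative=8,
        n_classes=3, random_state=seed,
    )

    # Split into two "environments" A and B with the same class structure.
    X_a, X_b, y_a, y_b = train_test_split(
        X, y, test_size=0.5, random_state=seed, stratify=y
    )

    # Assumption: environment B differs by a per-feature scale and offset.
    scale = rng.uniform(0.5, 2.0, size=X.shape[1])
    offset = rng.normal(0.0, 1.0, size=X.shape[1])
    X_b = X_b * scale + offset

    # In-domain train/test split inside environment A.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X_a, y_a, test_size=0.3, random_state=seed, stratify=y_a
    )

    models = {
        "Random Forest": RandomForestClassifier(random_state=seed),
        "Extra Trees": ExtraTreesClassifier(random_state=seed),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "AdaBoost": AdaBoostClassifier(random_state=seed),
        "KNN": KNeighborsClassifier(n_neighbors=5),
    }

    results = {}
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        in_f1 = f1_score(y_te, model.predict(X_te), average="macro")
        cross_f1 = f1_score(y_b, model.predict(X_b), average="macro")
        results[name] = (in_f1, cross_f1)
    return results


if __name__ == "__main__":
    for name, (in_f1, cross_f1) in evaluate().items():
        print(f"{name:14s} in-domain F1={in_f1:.3f}  cross-domain F1={cross_f1:.3f}")
```

Reporting the two scores side by side for each model is what makes a generalization gap visible. With real datasets, the cross-domain score would come from a model trained on one dataset and evaluated on another after the features have been mapped to a common schema.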

ISSN: 2307-8162

International Journal of Open Information Technologies