Unsupervised anomaly detection in network traffic using Deep Autoencoding Gaussian Mixture Model

Leonid Safonov

Abstract


Unsupervised anomaly detection in high-dimensional data is an important research subject in both theoretical machine learning and applied areas. One important application is anomaly detection in network traffic data, which can help prevent network security violations.

Unsupervised anomaly detection is based on density estimation, which is problematic in high-dimensional data. To deal with this issue, dimensionality reduction is performed first, and the density is then estimated in the resulting lower-dimensional space.
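The two-stage idea (reduce the dimension, then estimate the density) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the method used in the paper: it assumes NumPy, uses PCA for the reduction step and a single Gaussian as the density model, and the names `fit_two_stage` and `anomaly_score` are hypothetical.

```python
import numpy as np

def fit_two_stage(X, k):
    """Fit PCA to k dimensions, then a single Gaussian in the reduced space.

    Assumption: PCA and one Gaussian stand in for the generic
    'dimensionality reduction' and 'density estimation' steps.
    """
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    W = vt[:k].T                       # (n_features, k) projection matrix
    Z = (X - mean) @ W                 # reduced representation
    mu = Z.mean(axis=0)
    # Small ridge term keeps the covariance invertible.
    cov = np.atleast_2d(np.cov(Z, rowvar=False)) + 1e-6 * np.eye(k)
    return mean, W, mu, cov

def anomaly_score(X, model):
    """Negative Gaussian log-density in the reduced space (higher = rarer)."""
    mean, W, mu, cov = model
    d = (X - mean) @ W - mu
    maha = np.einsum("ij,jk,ik->i", d, np.linalg.inv(cov), d)
    _, logdet = np.linalg.slogdet(cov)
    k = mu.shape[0]
    return 0.5 * (maha + logdet + k * np.log(2.0 * np.pi))
```

A higher score means the sample falls in a lower-density region of the reduced space. A score computed only from the projection ignores whatever the projection discards, which is one reason to also look at the reconstruction error.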

Recently, deep learning methods have been widely used in high-dimensional anomaly detection. One such method is the Deep Autoencoding Gaussian Mixture Model (DAGMM). DAGMM combines a deep autoencoder, which performs dimensionality reduction and reconstruction error estimation, with a Gaussian mixture model, which predicts whether a data sample is anomalous. We apply DAGMM to unsupervised anomaly detection in network traffic data. Testing an anomaly detection system on network data is hampered by the lack of a generally accepted benchmark dataset that is recent, contains different types of attacks, and has labels. We chose the UNSW-NB15 dataset, which satisfies these requirements and has been suggested as an up-to-date benchmark.
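To make the roles of the two DAGMM components concrete, the following Python sketch computes the reconstruction features and the mixture-based sample energy used to flag anomalies. It follows the formulation in Zong et al. (listed in the references), not the corrected variant proposed here. It assumes NumPy, takes the autoencoder outputs and the soft mixture memberships `gamma` as given inputs rather than training any network, and all function names are hypothetical.

```python
import numpy as np

def reconstruction_features(x, x_rec, eps=1e-12):
    """Relative Euclidean distance and cosine similarity between each
    input row and its autoencoder reconstruction."""
    nx = np.linalg.norm(x, axis=1)
    nr = np.linalg.norm(x_rec, axis=1)
    rel_dist = np.linalg.norm(x - x_rec, axis=1) / (nx + eps)
    cos_sim = np.sum(x * x_rec, axis=1) / (nx * nr + eps)
    return np.stack([rel_dist, cos_sim], axis=1)

def gmm_parameters(z, gamma, reg=1e-6):
    """Mixture weights, means and covariances from soft memberships.

    z: (N, D) low-dimensional representations.
    gamma: (N, K) soft memberships, each row summing to 1
           (assumed to come from an estimation network).
    """
    n_k = gamma.sum(axis=0)                        # (K,)
    phi = n_k / z.shape[0]
    mu = (gamma.T @ z) / n_k[:, None]              # (K, D)
    diff = z[:, None, :] - mu[None, :, :]          # (N, K, D)
    cov = np.einsum("nk,nki,nkj->kij", gamma, diff, diff) / n_k[:, None, None]
    cov += reg * np.eye(z.shape[1])                # keep covariances invertible
    return phi, mu, cov

def sample_energy(z, phi, mu, cov):
    """Negative log-likelihood of each row of z under the mixture.
    Samples with high energy are treated as anomalous."""
    diff = z[:, None, :] - mu[None, :, :]
    maha = np.einsum("nki,kij,nkj->nk", diff, np.linalg.inv(cov), diff)
    _, logdet = np.linalg.slogdet(2.0 * np.pi * cov)   # (K,)
    log_comp = np.log(phi) - 0.5 * maha - 0.5 * logdet  # (N, K)
    # Log-sum-exp over components for numerical stability.
    m = log_comp.max(axis=1, keepdims=True)
    return -(m[:, 0] + np.log(np.exp(log_comp - m).sum(axis=1)))
```

In a full model, `z` would concatenate the autoencoder's latent code with the reconstruction features, and a threshold on the energy would separate normal from anomalous samples. The threshold choice is left open in this sketch.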

A correction to the algorithm that improves anomaly detection accuracy is proposed.


References


S. Gulghane, V. Shingate, S. Bondgulwar, G. Awari, and P. Sagar, “A Survey on Intrusion Detection System Using Machine Learning Algorithms,” in: Innovative Data Communication Technologies and Application. ICIDCA 2019. Lecture Notes on Data Engineering and Communications Technologies, Springer, Cham, 2020, vol. 46, pp. 670-675.

V. Chandola, A. Banerjee, and V. Kumar, “Anomaly detection: A survey,” ACM Comput. Surv., vol. 41, 2009, doi: 10.1145/1541880.1541882.

H. Liu and B. Lang, “Machine learning and deep learning methods for intrusion detection systems: A survey,” Appl. Sci., vol. 9, p. 4396, 2019.

A. Aldweesh, A. Derhab, and A. Z. Emam, “Deep learning approaches for anomaly-based intrusion detection systems: A survey, taxonomy, and open issues,” Knowledge-Based Systems, vol. 189, 105124, 2020.

E. Hodo, X. J. Bellekens, A. W. Hamilton, C. Tachtatzis, and R. C. Atkinson, “Shallow and deep networks intrusion detection system: A taxonomy and survey,” ArXiv abs/1701.02145, 2017.

E. J. Candès, X. Li, Y. Ma, and J. Wright, “Robust principal component analysis?,” JACM, vol. 58 (3), article 11, pp. 1 – 37.

B. Zong, Q. Song, M. R. Min, W. Cheng, C. Lumezanu, D. Cho, and H. Chen, “Deep autoencoding Gaussian mixture model for unsupervised anomaly detection,” in 6th International Conference on Learning Representations, 2018.

“KDD Cup 1999 Data,” Available: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html

A. Divekar, M. Parekh, V. Savla, R. Mishra, and M. Shirole, “Benchmarking datasets for anomaly-based network intrusion detection: KDD CUP 99 alternatives.” IEEE 3rd International Conference on Computing, Communication and Security (ICCCS), 2018.

M. Ring, S. Wunderlich, D. Scheuring, D. Landes, and A. Hotho, “A survey of network-based intrusion detection data sets,” ArXiv abs/1903.02460, 2019.

I. Sharafaldin, A. H. Lashkari, and A. A. Ghorbani, “Toward generating a new intrusion detection dataset and intrusion traffic characterization,” International Conference on Information Systems Security and Privacy (ICISSP), 2018, pp. 108-116.

M. Ring, S. Wunderlich, D. Scheuring, D. Landes, and A. Hotho, “Flow-based benchmark datasets for intrusion detection”, International Conference on Cybernetic Intelligent Systems (CIS), ACPI, 2017, pp. 361-369.

G. Maciá-Fernández, J. Camacho, R. Magán-Carrión, P. García-Teodoro, and R. Therón, “UGR’16: A new dataset for the evaluation of cyclostationarity-based network IDSs,” Computers & Security, vol. 73, pp. 411-424, 2018.

N. Moustafa and J. Slay, "UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set)," 2015 Military Communications and Information Systems Conference (MilCIS), Canberra, ACT, 2015, pp. 1-6.


ISSN: 2307-8162