Data Sampling Techniques for Anomaly Detection in Network Traffic

G.A. Zubrienko, O.R. Laponina


This paper considers an architecture and a simple implementation of an auto-scaling intrusion detection system (IDS). The proposed concept combines the efficiency of signature-based detection with the flexibility and performance of adaptive learning. It pursues two goals: reducing the computing resources the IDS requires and reducing the amount of data it stores. Both goals are achieved by optimizing the data sampling algorithm that supplies the source data used to train the signature-based classifier. Anomalies are first detected with classic anomaly detection methods; the detected samples are then used to update the training dataset and, in turn, the signature-based anomaly classifier. This approach allows not only real-time attack detection but also rapid adaptation of the signature-based classifier to new attack types.
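The loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; every name in it is hypothetical. It assumes a single numeric feature per record (for example, bytes per flow), uses a z-score test as a stand-in for the "classic anomaly detection methods", uses a one-feature threshold rule as a stand-in for the signature-based classifier, and uses reservoir sampling as one possible way to bound the stored normal traffic.

```python
import random
import statistics


class SampledTrainingSet:
    """Hypothetical training store: keeps every flagged anomaly,
    but only a bounded uniform sample (reservoir) of normal records."""

    def __init__(self, normal_capacity, seed=0):
        self.normal_capacity = normal_capacity
        self.normal = []
        self.anomalies = []
        self._seen_normal = 0
        self._rng = random.Random(seed)

    def add(self, record, is_anomaly):
        if is_anomaly:
            self.anomalies.append(record)
            return
        # Reservoir sampling (Algorithm R) over the normal records.
        self._seen_normal += 1
        if len(self.normal) < self.normal_capacity:
            self.normal.append(record)
        else:
            j = self._rng.randrange(self._seen_normal)
            if j < self.normal_capacity:
                self.normal[j] = record


def zscore_is_anomaly(value, baseline, k=3.0):
    """Stand-in anomaly detector: more than k standard deviations from the baseline mean."""
    mean = statistics.fmean(baseline)
    std = statistics.pstdev(baseline)
    return std > 0 and abs(value - mean) > k * std


def fit_signature(train):
    """Stand-in 'signature': midpoint between the mean normal and mean anomalous value."""
    if not train.normal or not train.anomalies:
        return None
    return (statistics.fmean(train.normal) + statistics.fmean(train.anomalies)) / 2


def matches_signature(value, threshold):
    return threshold is not None and value >= threshold


def run(stream, baseline, capacity=50):
    train = SampledTrainingSet(capacity)
    threshold = None
    alerts = []
    for value in stream:
        # Cheap path: the signature classifier handles already-known attack patterns.
        if matches_signature(value, threshold):
            alerts.append((value, "signature"))
            continue
        # Slow path: anomaly detection, then update the sampled training set.
        flagged = zscore_is_anomaly(value, baseline)
        train.add(value, flagged)
        if flagged:
            alerts.append((value, "anomaly"))
            threshold = fit_signature(train)  # retrain the signature classifier
    return alerts, train, threshold


# Illustrative data only: values near 100 are "normal", values near 1000 are "attacks".
baseline = [100, 110, 90, 105, 95]
stream = [100, 102, 98, 1000, 101, 990, 99]
alerts, train, threshold = run(stream, baseline)
```

In this sketch the first large value is caught by the anomaly detector and triggers retraining, after which a similar value is caught directly by the signature rule. Bounding the normal sample is what keeps storage and retraining cost from growing with traffic volume.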






ISSN: 2307-8162