Analysis of the possibilities of using machine learning technologies to detect attacks on web applications

E.A. Yudova, Olga R. Laponina


This article analyzes whether attacks on web applications can be detected with machine learning algorithms, focusing on supervised learning. A sample from HTTP DATASET CSIC 2010 serves as the data set. The data set was generated automatically and contains 36,000 normal requests and more than 25,000 anomalous ones; every HTTP request is labeled as normal or anomalous. The anomalous data include attacks such as SQL injection, buffer overflow, information gathering, file expansion, CRLF injection, cross-site scripting (XSS), server-side inclusion, and parameter spoofing. Training and test samples are selected to evaluate how effectively various machine learning algorithms classify traffic. All text attribute values are converted to numerical ones. Quality metrics are computed for five machine learning algorithms, and the algorithm that best classifies the traffic under consideration as normal or anomalous is selected.
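The abstract does not say which encoding scheme or which quality metrics were used, so the following is only a minimal sketch under stated assumptions: text attributes are converted with simple label encoding, and a classifier's output is scored with accuracy, precision, recall, and F1. The function names and the toy data are hypothetical and are not taken from the article or from the CSIC 2010 data set.

```python
def label_encode(values):
    """Map each distinct text value to an integer code, in order of first appearance."""
    mapping = {}
    codes = []
    for value in values:
        if value not in mapping:
            mapping[value] = len(mapping)
        codes.append(mapping[value])
    return codes, mapping


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical text attribute (HTTP method) converted to numeric codes.
methods = ["GET", "POST", "GET", "PUT", "POST"]
codes, mapping = label_encode(methods)

# Hypothetical labels: 1 = anomalous request, 0 = normal request.
y_true = [0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1]
metrics = classification_metrics(y_true, y_pred)

print(codes, mapping)
print(metrics)
```

Label encoding is the simplest choice and suits tree-based models; for algorithms sensitive to numeric ordering, one-hot encoding may be a better fit. Computing the same metrics for each candidate algorithm on a held-out test sample gives a comparable basis for choosing among them.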

Full Text:

PDF (Russian)





ISSN: 2307-8162