Applying a probabilistic algorithm to spam filtering

Olga V. Okhlupina, Dmitry S. Murashko


Among common approaches to combating spam, a probabilistic machine learning algorithm based on Bayes' theorem holds a special place. The so-called "naive" Bayes classifier assigns a document to the class with the maximum a posteriori probability. Despite advances in machine learning methods, the Bayesian algorithm has not lost its relevance and remains widely used for many tasks, including spam detection. Its main advantages are simplicity, fast training, fairly high accuracy, and reliability. This paper addresses the problem of identifying spam messages with a probabilistic machine learning algorithm. It presents the mathematical justification of the Bayesian algorithm and its implementation on a concrete example, with program code in Python.
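The paper's own code is not reproduced on this page, so the following is only a minimal sketch of how a naive Bayes spam filter could look in Python. It assumes a multinomial word-count model with Laplace (add-one) smoothing and picks the class with the largest log-posterior score, that is, the class c maximizing log P(c) + sum over words w of log P(w | c). The function names (tokenize, train, classify) and the four toy training messages are hypothetical and chosen purely for illustration; they are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    # Simplest possible tokenizer (an assumption): lowercase, split on whitespace.
    return text.lower().split()


def train(messages):
    """Count documents per class and word occurrences per class.

    messages: iterable of (text, label) pairs.
    """
    doc_counts = Counter()   # number of training messages per class
    word_counts = {}         # class label -> Counter of word frequencies
    vocab = set()            # every word seen during training
    for text, label in messages:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for word in tokenize(text):
            counts[word] += 1
            vocab.add(word)
    return doc_counts, word_counts, vocab


def classify(text, model):
    """Return the class with the maximum a posteriori (log) score."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        # Log prior: share of training messages that belong to this class.
        score = math.log(doc_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # words never seen in training are ignored
            # Laplace-smoothed likelihood P(word | class).
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training set, invented for illustration only.
training = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting schedule tomorrow", "ham"),
    ("project meeting notes", "ham"),
]
model = train(training)
label_a = classify("win cheap money", model)
label_b = classify("meeting notes tomorrow", model)
```

Summing logarithms instead of multiplying raw probabilities avoids numerical underflow on long messages, and the add-one smoothing keeps a single unseen word from driving a class probability to zero. A real filter would need a much larger labeled corpus and a more careful tokenizer.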








ISSN: 2307-8162