Intelligent multimodal neural network activity monitoring system

R. Minneakhmetov

Abstract


This paper proposes an approach to building an intelligent activity monitoring system based on large language models. It focuses on combining modern neural networks with computer vision methods for integrated analysis of video surveillance data, sensor signals, and event logs. Ollama, a framework for running large language models locally, was chosen as the implementation platform, which allows the models to be run autonomously on local hardware. A prototype of the system has been developed; the paper describes its architecture, its pipeline for processing heterogeneous data, and the results of an experimental evaluation. The results show that combining several neural network models makes it possible to automate the analysis of multimodal data and improves the accuracy of anomaly detection in the scenarios considered.
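
As an illustration of what running a vision-capable model through a local Ollama server can look like, the following minimal Python sketch builds a request for Ollama's `/api/generate` endpoint with one base64-encoded video frame and reads back the text reply. It is not the author's implementation: the helper names `build_request` and `describe_frame`, the prompt wording, and the model tag in the usage note are assumptions made for this example, and the URL is Ollama's default local address.

```python
import base64
import json
import urllib.request

# Default address of a locally running Ollama server (assumed unchanged).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str, frame_bytes: bytes) -> dict:
    """Build a non-streaming /api/generate payload carrying one image.

    Hypothetical helper: Ollama expects images as base64-encoded strings
    in the "images" list of the request body.
    """
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(frame_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON object instead of a stream
    }


def describe_frame(payload: dict, url: str = OLLAMA_URL,
                   timeout: float = 120.0) -> str:
    """POST the payload to the Ollama server and return the model's text reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),  # presence of data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["response"]
```

With a server running and a vision model pulled (for example `llava:13b`), a call such as `describe_frame(build_request("llava:13b", "Describe any unusual activity in this frame.", frame_bytes))` would return the model's description. How such descriptions are combined with sensor signals and event logs is described in the full text, not in this sketch.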


Full Text:

PDF (Russian)






ISSN: 2307-8162