D.A. Hnatiuk
Èlektron. model. 2025, 47(6):11-33
https://doi.org/10.15407/emodel.47.06.011
ABSTRACT
The use of machine learning methods for detecting anomalies in server-based software systems operating in real time is analyzed, in particular LSTM networks for analyzing event logs and XGBoost for classifying structured event features. Modern methods of monitoring and analyzing event logs with machine learning are systematized: the advantages and disadvantages of individual methods are identified, and the effectiveness of their combined use for detecting anomalies in server software systems is substantiated. Particular attention is paid to optimizing these methods for detecting non-standard events in logs by means of attention mechanisms, data caching, and automated feature extraction, which together enable real-time analysis of the event stream. The results of the analysis confirm the high potential of hybrid models for improving the stability, reliability, and performance of server software systems and outline promising directions for further research in anomaly detection.
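The combined approach described above can be illustrated with a minimal, dependency-free sketch. It is not the published method: the LSTM branch is replaced by a simple unseen-transition score over event-template sequences, and the XGBoost branch by a count-deviation score over structured window features; all event names, weights, and thresholds here are illustrative assumptions.

```python
from collections import Counter

# Hypothetical event-template IDs, as produced by a log-parsing step.
NORMAL_WINDOW = ["open", "read", "read", "close"]

def count_features(window, vocab):
    """Structured features: per-template event counts in a sliding window."""
    counts = Counter(window)
    return [counts.get(template, 0) for template in vocab]

def sequence_score(window, known_bigrams):
    """Stand-in for the sequence (LSTM) branch: fraction of event
    transitions never seen in normal operation."""
    pairs = list(zip(window, window[1:]))
    if not pairs:
        return 0.0
    unseen = sum(1 for pair in pairs if pair not in known_bigrams)
    return unseen / len(pairs)

def hybrid_score(window, vocab, known_bigrams, normal_counts,
                 w_seq=0.6, w_feat=0.4):
    """Weighted combination of the sequence score and a count-deviation
    score, standing in for an LSTM + XGBoost ensemble."""
    features = count_features(window, vocab)
    deviation = sum(abs(a - b) for a, b in zip(features, normal_counts))
    deviation /= max(sum(normal_counts), 1)
    return w_seq * sequence_score(window, known_bigrams) + w_feat * min(deviation, 1.0)

vocab = ["open", "read", "write", "close", "error"]
known_bigrams = set(zip(NORMAL_WINDOW, NORMAL_WINDOW[1:]))
normal_counts = count_features(NORMAL_WINDOW, vocab)

normal = hybrid_score(NORMAL_WINDOW, vocab, known_bigrams, normal_counts)
anomalous = hybrid_score(["open", "error", "error", "close"],
                         vocab, known_bigrams, normal_counts)
assert anomalous > normal  # the error-laden window scores strictly higher
```

In a production pipeline the two branch scores would come from a trained LSTM over template sequences and an XGBoost classifier over the window features, but the combination logic is the same.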
KEYWORDS
anomaly monitoring methods, event logs, combined approach for anomaly detection, caching, high-load environments.
REFERENCES
- Zha, D., Bhat, Z., Lai, K.-H., Yang, F., Jiang, Z., Zhong, S., & Hu, X. (2023). Data-centric artificial intelligence: A survey. https://doi.org/10.48550/arXiv.2303.10158
- Provatas, N., Konstantinou, I., & Koziris, N. (2025). A survey on parameter server architecture: Approaches for optimizing distributed centralized learning. https://doi.org/10.1109/ACCESS.2025.3535085
- Zhang, H., Zhou, Y., Xu, H., Shi, J., Lin, X., & Gao, Y. (2025). Anomaly detection in virtual machine logs against irrelevant attribute interference. https://doi.org/10.1371/journal.pone.0315897
- Sinha, R., Sur, R., Sharma, R., & Shrivastava, A. (2022). Anomaly detection using system logs: A deep learning approach. https://doi.org/10.4018/IJISP.285584
- Hnatiuk, D.A. (2024). Osoblyvosti navchannia modelei dlia efektyvnoho analizu danykh i vyiavlennia anomalii v servernykh prohramnykh systemakh [Peculiarities of training models for effective data analysis and anomaly detection in server software systems]. In Information modeling technologies, systems and applications (p. 85). https://fotius.cdu.edu.ua/wp-content/uploads/2024/05/Book_IMTCK_2024.pdf
- Zhang, X., & Zhang, Q. (2020). Short-term traffic flow prediction based on LSTM-XGBoost combination model. CMES, 125(1), 95-109. https://doi.org/10.32604/cmes.2020.011013
- Shi, Z., Hu, Y., Mo, G., & Wu, J. (2023). Attention-based CNN-LSTM and XGBoost hybrid model for stock prediction. https://doi.org/10.48550/arXiv.2204.02623
- Vervaet, A. (2023). MoniLog: An automated log-based anomaly detection system for cloud computing infrastructures. https://arxiv.org/pdf/2304.11940
- van Ede, T., Aghakhani, H., Spahn, N., Bortolameotti, R., Cova, M., & Continella, A. (2022). DEEPCASE: Semi-supervised contextual analysis of security events. In IEEE Symposium on Security and Privacy (SP) (pp. 522-539). https://doi.org/10.1109/SP46214.2022.9833671
- Torres, L., Barrios, H., & Denneulin, Y. (2024). Evaluation of computational and energy performance in matrix multiplication algorithms on CPU and GPU using MKL, cuBLAS and SYCL. https://doi.org/10.48550/arXiv.2405.17322
- Guo, H., Yuan, S., & Wu, X. (2021). LogBERT: Log anomaly detection via BERT. https://doi.org/10.48550/arXiv.2103.04475
- Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., & Polosukhin, I. (2017). Attention is all you need. https://doi.org/10.48550/arXiv.1706.03762
- Sanh, V., Debut, L., Chaumond, J., & Wolf, T. (2019). DistilBERT, a distilled version of BERT: Smaller, faster, cheaper and lighter. https://doi.org/10.48550/arXiv.1910.01108
- Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., & Soricut, R. (2020). ALBERT: A lite BERT for self-supervised learning of language representations. https://doi.org/10.48550/arXiv.1909.11942
- Alizadeh, N., & Castor, F. (2024). Green AI: A preliminary empirical study on energy consumption in DL models across different runtime infrastructures. https://doi.org/10.48550/arXiv.2402.13640
- Pierson, R., & Moin, A. (2025). Automated bug report prioritization in large open-source projects. https://doi.org/10.48550/arXiv.2504.15912
- Kingma, D.P., & Welling, M. (2013). Auto-encoding variational Bayes. https://doi.org/10.48550/arXiv.1312.6114
- Zhao, S., Song, J., & Ermon, S. (2017). Towards deeper understanding of variational autoencoding models. https://doi.org/10.48550/arXiv.1702.08658
- Široký, F. (2019). Anomaly detection using deep sparse autoencoders for CERN particle detector data. https://is.muni.cz/th/ljgxi/BcPraceSiroky.pdf
- Dohi, K. (2020). Variational autoencoders for jet simulation. https://doi.org/10.48550/arXiv.2009.04842
- Tang, T., Yao, J., Wang, Y., Sha, Q., Feng, H., & Xu, Z. (2025). Application of deep generative models for anomaly detection in complex financial transactions. https://doi.org/10.48550/arXiv.2504.15491
- Cho, K., van Merrienboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. https://doi.org/10.48550/arXiv.1406.1078
- Gers, F.A., & Schmidhuber, J. (2000). Recurrent nets that time and count. In Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN 2000). https://doi.org/10.1109/IJCNN.2000.861302
- Wang, Q., Ai, X., Zhang, Y., Chen, J., & Yu, G. (2022). HyTGraph: Gpu-accelerated graph processing with hybrid transfer management. https://doi.org/10.48550/arXiv.2208.14935
- Jayanth, R., Gupta, N., & Prasanna, V. (2024). Benchmarking edge AI platforms for high-performance ML inference. https://doi.org/10.48550/arXiv.2409.14803
- Hundman, K., Constantinou, V., Laporte, C., Colwell, I., & Soderstrom, T. (2018). Detecting spacecraft anomalies using LSTMs and nonparametric dynamic thresholding. https://doi.org/10.48550/arXiv.1802.04431
- Bao, Q., Wang, J., Gong, H., Zhang, Y., Guo, X., & Feng, H. (2025). A deep learning approach to anomaly detection in high-frequency trading data. https://doi.org/10.48550/arXiv.2504.00287
- Mäntylä, M., Varela, M., & Hashemi, S. (2022). Pinpointing anomaly events in logs from stability testing - n-grams vs. deep-learning. https://doi.org/10.48550/arXiv.2202.09214
- Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735-1780. https://doi.org/10.1162/neco.1997.9.8.1735
- Lindemann, B., Maschler, B., Sahlab, N., & Weyrich, M. (2021). A survey on anomaly detection for technical systems using LSTM networks. https://doi.org/10.48550/arXiv.2105.13810
- Van Houdt, G., Mosquera, C., & Nápoles, G. (2020). A review on the long short-term memory model. Artificial Intelligence Review, 53(1). https://link.springer.com/article/10.1007/s10462-020-09838-1
- Zhao, J., Huang, F., Lv, J., Duan, Y., Qin, Z., Li, G., & Tian, G. (2020). Do RNN and LSTM have Long Memory? https://doi.org/10.48550/arXiv.2006.03860
- Prater, R., Hanne, T., & Dornberger, R. (2024). Generalized performance of LSTM in time-series forecasting. https://doi.org/10.1080/08839514.2024.2377510
- Dai, J., Liao, M., & Guo, X. (2023). Research on the application of improved LSTM model in time series problems. In 2023 International Conference on Electronics, Automation, and Computer Science (ICEACE) (pp. 1544-1548). https://doi.org/10.1109/ICEACE60673.2023.10442927
- Ghislieri, M., Cerone, G.L., Knaflitz, M., & Agostini, V. (2021). Long short-term memory (LSTM) recurrent neural network for muscle activity detection. Journal of NeuroEngineering and Rehabilitation. https://jneuroengrehab.biomedcentral.com/articles/10.1186/s12984-021-00945-w
- Sennhauser, L., & Berwick, R.C. (2018). Evaluating the ability of LSTMs to learn context-free grammars. In Proceedings of the 32nd Conference on Neural Information Processing Systems (NeurIPS) (pp. 115-124). https://doi.org/10.48550/arXiv.1811.02611
- Karpathy, A., Johnson, J., & Fei-Fei, L. (2015). Visualizing and understanding recurrent networks. https://doi.org/10.48550/arXiv.1506.02078
- Baytas, I.M., Xiao, C., Zhang, X., Wang, F., Jain, A.K., & Zhou, J. (2017). Patient subtyping via time-aware LSTM networks. In Proceedings of the 2017 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics (pp. 65-74). https://dl.acm.org/doi/10.1145/3097983.3097997
- Meng, W., Liu, Y., Zhu, Y., Zhang, S., Pei, D., Liu, Y., Chen, Y., Zhang, R., Tao, S., Pei, S., & Zhou, R. (2019). LogAnomaly: Unsupervised detection of sequential and quantitative anomalies in unstructured logs. In Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI-19). https://doi.org/10.24963/ijcai.2019/658
- Von Kügelgen, J., Sharma, Y., Gresele, L., Brendel, W., Schölkopf, B., Besserve, M., & Locatello, F. (2022). Self-supervised learning with data augmentations provably isolates content from style. https://doi.org/10.48550/arXiv.2106.04619
- Farzad, A., & Gulliver, T.A. (2019). Log message anomaly detection and classification using Auto-B/LSTM and Auto-GRU. https://arxiv.org/abs/1911.08744
- Lundberg, S., & Lee, S.-I. (2017). A unified approach to interpreting model predictions. https://doi.org/10.48550/arXiv.1705.07874
- Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. https://arxiv.org/abs/1603.02754
- Anghel, A., Papandreou, N., Parnell, T., De Palma, A., & Pozidis, H. (2018). Benchmarking and optimization of gradient boosting decision tree algorithms. https://doi.org/10.48550/arXiv.1809.04559
- XGBoost: Everything you need to know. (n.d.). https://neptune.ai/blog/xgboost-everything-you-need-to-know
- Zhang, J., Wang, R., Jia, A., & Feng, N. (2024). Optimization and application of XGBoost logging prediction model for porosity and permeability based on K-means method. https://doi.org/10.3390/app14103956
- Boldini, D., Grisoni, F., Kuhn, D., Friedrich, L., & Sieber, S.A. (2023). Practical guidelines for the use of gradient boosting for molecular property prediction. https://doi.org/10.1186/s13321-023-00743-7
- Frifra, A., Maanan, M., Maanan, H., & Rhinane, H. (2024). Harnessing LSTM and XGBoost algorithms for storm prediction. https://www.nature.com/articles/s41598-024-62182-0
- Fani Sani, M., Vazifehdoostirani, M., Park, G., Pegoraro, M., van Zelst, S.J., & van der Aalst, W.M.P. (2023). Performance-preserving event log sampling for predictive monitoring. Journal of Intelligent Information Systems, 53-82. https://doi.org/10.1007/s10844-022-00775-9
- Wang, X., & Lu, X. (2020). A host-based anomaly detection framework using XGBoost and LSTM for IoT devices. https://doi.org/10.1155/2020/8838571
- Lobachev, I.M. (2021). Modeli ta metody pidvyshchennia efektyvnosti rozpodilenykh transdiusernykh merezh na osnovi mashynnoho navchannia ta peryferiinykh obchyslen [Models and methods for improving the efficiency of distributed transducer networks based on machine learning and edge computing]. https://op.edu.ua/sites/default/files/publicFiles/dissphd/dysertaciya_lobachev_122.pdf
- XGBoost parameters. (n.d.). https://xgboost.readthedocs.io/en/stable/parameter.html
- Putatunda, S., & Rama, K. (2020). A modified Bayesian optimization based hyper-parameter tuning approach for extreme gradient boosting. https://doi.org/10.48550/arXiv.2004.05041
- XGBoost. (n.d.). https://www.nvidia.com/en-us/glossary/xgboost/
- Kukkala, V.K., Thiruloga, S.V., & Pasricha, S. (2021). LATTE: LSTM self-attention based anomaly detection in embedded automotive platforms. https://doi.org/10.48550/arXiv.2107.05561
- AutoML for XGBoost. (n.d.). https://microsoft.github.io/FLAML/docs/Examples/AutoML-for-XGBoost/
- Chai, C., Lu, J., Jiang, X., Shi, X., & Zeng, Z. (2021). An automated machine learning (AutoML) method for driving distraction detection based on lane-keeping performance. https://doi.org/10.48550/arXiv.2103.08311
- Elastic Stack. (n.d.). https://www.elastic.co/elastic-stack