Analysis and data processing systems

Print ISSN: 2782-2001          Online ISSN: 2782-215X

Study of the issues of methods for determining the type of content in incoming traffic

Issue No 4 (92) October - December 2023
Authors:

Reva Ivan L.,
Medvedev Mihail A.,
Vorontsova Inna V.
DOI: http://dx.doi.org/10.17212/2782-2001-2023-4-69-84
Abstract

Content filtering is an important tactic for maintaining network security and functionality in cybersecurity and trusted-environment settings. It works by restricting access to websites, emails, files, or other content that may contain harmful elements or pose a significant risk of infection. Content filtering protects not only an individual user's data but also the entire network of an organization or institution, helping to minimize the risk of security breaches.



Methods for determining the type of content in incoming traffic are a relevant and important research area in information security and network analytics. A significant amount of data is transmitted over today's Internet, and classifying this traffic is a key task for ensuring security and managing networks effectively. Such methods are a set of algorithms and approaches that automatically determine what type of data is being transmitted over the network. This study collects network traffic data, selects a data set for training the model, considers classifier algorithms, and focuses on metrics for assessing classification efficiency.
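
The classification metrics named in the keywords (precision, recall, F-measure) follow from the four outcome counts: true positive, true negative, false positive, and false negative. The Python sketch below is an illustration only, not the authors' implementation. It assumes a binary labeling in which 1 marks traffic that should be filtered and 0 marks traffic that should pass; the names `ConfusionCounts`, `count_outcomes`, `precision`, `recall`, and `f_measure` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for a binary classifier (hypothetical container)."""
    tp: int  # positive samples predicted positive
    tn: int  # negative samples predicted negative
    fp: int  # negative samples predicted positive
    fn: int  # positive samples predicted negative


def count_outcomes(y_true: Sequence[int], y_pred: Sequence[int]) -> ConfusionCounts:
    """Tally TP/TN/FP/FN. Assumption: label 1 is the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    return ConfusionCounts(tp=tp, tn=tn, fp=fp, fn=fn)


def precision(c: ConfusionCounts) -> float:
    """Share of predicted positives that are truly positive: TP / (TP + FP)."""
    denom = c.tp + c.fp
    return c.tp / denom if denom else 0.0


def recall(c: ConfusionCounts) -> float:
    """Share of true positives that were found: TP / (TP + FN)."""
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def f_measure(c: ConfusionCounts, beta: float = 1.0) -> float:
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    denom = beta**2 * p + r
    return (1 + beta**2) * p * r / denom if denom else 0.0


if __name__ == "__main__":
    # Made-up labels for illustration; not data from the study.
    truth = [1, 1, 1, 0, 0, 0, 1, 0]
    predicted = [1, 1, 0, 0, 1, 0, 1, 0]
    counts = count_outcomes(truth, predicted)
    print(counts, precision(counts), recall(counts), f_measure(counts))
```

Returning 0.0 when a denominator is zero is a design choice of this sketch; other conventions, such as raising an error or reporting the metric as undefined, are equally reasonable.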



The results of the study can be used to build effective systems for detecting malicious or unwanted content, filtering data, or optimizing the operation of network resources. The work is of practical importance and can be applied in various fields, including information security, network analytics, monitoring of network resources, and optimization of network processes.


Keywords: artificial intelligence, computer networks, semantic content filtering, network traffic, dataset, cybersecurity, True-traffic, False-traffic, true positive, true negative, false positive, false negative, precision, recall, F-measure

For citation:

Reva I.L., Medvedev M.A., Vorontsova I.V. Issledovanie problematiki metodik opredeleniya tipa kontenta vo vkhodyashchem trafike [Study of the issues of methods for determining the type of content in incoming traffic]. Sistemy analiza i obrabotki dannykh = Analysis and Data Processing Systems, 2023, no. 4 (92), pp. 69–84. DOI: 10.17212/2782-2001-2023-4-69-84.

 

 
