Digital technology security


Investigation of the fault tolerance of the NSM infrastructure

Issue No 3-4 (96) July - December 2019
Authors:

Komissarov Valerij V.,
Tingajkin Denis O.
DOI: http://dx.doi.org/10.17212/2307-6879-2019-3-4-106-121
Abstract

This work considers the open-source Network Service Mesh (NSM) project, developed by Cisco with the assistance of Red Hat, Huawei, Intel, and VMware. The project extends the networking capabilities of Kubernetes: it allows network interfaces between multiple PODs to be created and configured dynamically. In the classic connection setup case, the client and destination PODs are on the same node, but PODs can also be on different nodes.



In this paper, we tested scenarios for establishing connections between PODs located on different nodes, as well as scenarios in which one or more of the key infrastructure components fail.



In this work, fault tolerance means that the system remains operational when one or more of the main NSM infrastructure components fail. The NSM infrastructure includes the service manager (NSM) and the forwarding agent (Forwarder), as well as auxiliary containers for the service manager, such as the client for the Kubernetes registry (nsmd-k8s) and the client for the device plugin (nsmdp).



The resiliency study showed that the project handles 7 of the 9 tested scenarios. Two problems currently have no solution. First, when the Unix socket files used for communication between components are damaged, the system cannot detect the damage and becomes inoperable. Second, the system cannot notify the client of a failure when all key components except the client and the endpoint have failed. Monitoring is proposed as a way to address both problems.
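As an illustration of the kind of monitoring check that could expose a damaged Unix socket file, the following is a minimal Python sketch. It is not part of the NSM project or of the paper's method. The function name `probe_unix_socket` and the four status strings are assumptions introduced here for illustration only. The probe checks that the path exists, that it is actually a socket, and that something is accepting connections on it.

```python
import os
import socket
import stat


def probe_unix_socket(path: str, timeout: float = 1.0) -> str:
    """Classify the state of a Unix domain socket file (hypothetical helper).

    Returns one of: "missing", "not_a_socket", "unreachable", "ok".
    """
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        # The socket file was deleted or never created.
        return "missing"

    if not stat.S_ISSOCK(mode):
        # Something exists at the path, but it is not a socket
        # (for example, it was replaced by a regular file).
        return "not_a_socket"

    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        # A stale socket file with no listener refuses the connection.
        client.connect(path)
    except OSError:
        return "unreachable"
    finally:
        client.close()
    return "ok"
```

A monitoring agent could call such a probe periodically on each socket path it is configured to watch and raise an alert, or restart the affected component, on any result other than "ok". The paths themselves depend on the deployment and are not specified here.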


Keywords: Kubernetes, DPAPI, network interfaces, POD, L2/L3 service manager, fault tolerance, monitoring, gRPC, connection establishment process


For citation:

Komissarov V.V., Tingajkin D.O. Issledovanie otkazoustoichivosti infrastruktury NSM [Investigation of the fault tolerance of the NSM infrastructure]. Sbornik nauchnykh trudov Novosibirskogo gosudarstvennogo tekhnicheskogo universiteta = Transaction of scientific papers of the Novosibirsk state technical university, 2019, no. 3–4 (96), pp. 106–121. DOI: 10.17212/2307-6879-2019-3-4-106-121.
