The article considers the problem of regulator synthesis using neural networks, taking two-channel object stabilization as an example. The object is a three-mass system containing six integrators; it is controlled through two inputs, and stabilization must be achieved on two output channels. In the problem at hand, the object is initially in an unstable position. A neural network trained by reinforcement learning is used as the regulator; the Deterministic Policy Gradient method yielded the best results. The problem is considered in two versions. In the first version the object state vector is assumed to be available; in this case, in contrast to the classical approach to regulator synthesis, the input of the regulator neural network is the state vector of the object (the integrator values). In the second version we assume that the object state vector is unavailable, and an observer implemented by a neural network is used to estimate it. An observer synthesis technique is proposed, together with the structure of the observer's neural network, which consists of a recurrent first layer followed by a feedforward network. To train the observer's neural network, data were collected in a series of experiments in which the object model and the observer were connected in parallel and a random control law was applied to both. Plots comparing the object's state with the observer's estimates are given. The regulator neural network is trained to stabilize the object from its initial unstable state. The article concludes with findings and directions for further research.
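To illustrate the observer structure described above (a recurrent first layer followed by a feedforward network), a minimal sketch in PyTorch is given below. The framework, the layer sizes, and the four-signal input (two measured outputs plus two control signals) are assumptions made for illustration and are not taken from the article; only the recurrent-plus-feedforward structure and the six-dimensional state estimate follow the abstract.

import torch
import torch.nn as nn

class ObserverNet(nn.Module):
    """Sketch of the observer from the abstract: a recurrent first
    layer followed by a feedforward network. All sizes are
    illustrative assumptions, not values from the article."""

    def __init__(self, n_inputs=4, n_state=6, hidden=32):
        super().__init__()
        # n_inputs: 2 measured outputs + 2 control signals (assumed)
        self.rnn = nn.GRU(input_size=n_inputs, hidden_size=hidden,
                          batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(hidden, hidden),
            nn.Tanh(),
            nn.Linear(hidden, n_state),  # estimate of the 6 integrator values
        )

    def forward(self, seq):
        # seq: (batch, time, n_inputs) -- a window of measurements and controls
        rnn_out, _ = self.rnn(seq)
        return self.ff(rnn_out[:, -1, :])  # state estimate at the last step

if __name__ == "__main__":
    net = ObserverNet()
    window = torch.randn(8, 50, 4)   # batch of 8 random input windows
    print(net(window).shape)         # torch.Size([8, 6])

In the training scheme described in the abstract, the targets for such a network would be the states of the object model recorded while it runs in parallel with the observer under a random control law; the regression from recorded input windows to recorded states can then be fitted with an ordinary supervised loss.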
Voevoda A.A., Romannikov D.O. Sintez regulyatorov dlya mnogokanal'nykh sistem s ispol'zovaniem neironnykh setei [Synthesis of regulators for multichannel systems using neural networks]. Nauchnyi vestnik Novosibirskogo gosudarstvennogo tekhnicheskogo universiteta – Science bulletin of the Novosibirsk state technical university, 2019, no. 4 (77), pp. 7–16. DOI: 10.17212/1814-1196-2019-4-7-16.