Analysis and data processing systems

Print ISSN: 2782-2001          Online ISSN: 2782-215X

A method of searching and marking artifacts in images applying detection and segmentation algorithms

Issue No 4 (84) October - December 2021
Authors: Kitenko Andrey M.
DOI: http://dx.doi.org/10.17212/2782-2001-2021-4-7-18
Abstract

This paper explores the use of neural networks to isolate target artifacts in different types of documents. Neural networks of many kinds are widely used in document processing, from text analysis to locating regions that may contain the desired information. To date, however, no document processing system can work fully autonomously and compensate for the human errors that arise from stress, fatigue, and many other causes. This work focuses on finding and marking target artifacts in drawings when only a small amount of initial data is available. The proposed method consists of two main stages: detection of the region containing an artifact, followed by semantic segmentation of the detected region. It relies on supervised training of two convolutional neural networks on annotated data. The first network detects the region containing an artifact; in this example it is based on YOLOv4. Semantic segmentation uses the U-Net architecture with a pre-trained EfficientNet-B0 as the backbone. Combining these networks gave good results, even when marking certain handwritten texts, without resorting to model designs specific to text recognition. The method can be used to find and mark artifacts in large datasets; the artifacts may differ in shape, color, and type, may appear anywhere in the image, and may or may not overlap other objects.
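
The two-stage idea described in the abstract (detect a region, then segment only inside it) can be illustrated with a minimal Python sketch. This is not the author's implementation: `detect_stub` and `segment_stub` are hypothetical stand-ins that use simple thresholding in place of the YOLOv4-based detector and the U-Net segmenter, and the function name `mark_artifacts` and the `pad` parameter are assumptions made for illustration only.

```python
import numpy as np


def detect_stub(image):
    """Hypothetical detector: one box (x0, y0, x1, y1) around all dark pixels.

    A trained detector would go here; thresholding is only a placeholder.
    """
    ys, xs = np.nonzero(image < 128)
    if ys.size == 0:
        return []
    return [(int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1)]


def segment_stub(crop):
    """Hypothetical segmenter: binary mask with the same shape as the crop.

    A trained segmentation network would go here.
    """
    return (crop < 128).astype(np.uint8)


def mark_artifacts(image, detect, segment, pad=2):
    """Run detection, segment each detected region, and merge the masks."""
    height, width = image.shape[:2]
    full_mask = np.zeros((height, width), dtype=np.uint8)
    for x0, y0, x1, y1 in detect(image):
        # Expand the box slightly and clip it to the image borders.
        x0, y0 = max(x0 - pad, 0), max(y0 - pad, 0)
        x1, y1 = min(x1 + pad, width), min(y1 + pad, height)
        crop_mask = segment(image[y0:y1, x0:x1])
        # Write the crop mask back into image coordinates (union of masks).
        region = full_mask[y0:y1, x0:x1]
        np.maximum(region, crop_mask, out=region)
    return full_mask


if __name__ == "__main__":
    # Synthetic grayscale page: white background with one dark "artifact".
    page = np.full((20, 30), 255, dtype=np.uint8)
    page[5:9, 10:16] = 0
    mask = mark_artifacts(page, detect_stub, segment_stub)
    print(mask.shape, int(mask.sum()))
```

Restricting segmentation to the detected crop keeps the second network's input small and focused, which is one plausible reason such a pipeline can work with little training data; the sketch only demonstrates the data flow, not the quality of either model.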


Keywords: artificial intelligence, semantic segmentation, computer vision, pattern recognition, neural networks, machine learning, deep learning, object detection
Kitenko Andrey M.
Saint Petersburg, St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS), 4th lin. V.I., 199178, Russian Federation
kitenko.andrey@gmail.com
ORCID: 0000-0002-4178-8485

References

1. Bochkovskiy A., Wang C.-Y., Liao H.-Y.M. YOLOv4: optimal speed and accuracy of object detection. arXiv:2004.10934 [cs, eess], 2020.
2. Lin T.-Y., Maire M., Belongie S., Bourdev L., Girshick R., Hays J., Perona P., Ramanan D., Zitnick C.L., Dollár P. Microsoft COCO: common objects in context. arXiv:1405.0312 [cs], 2015.
3. Tan M., Pang R., Le Q.V. EfficientDet: scalable and efficient object detection. arXiv:1911.09070 [cs, eess], 2020.
4. Lin T.-Y., Goyal P., Girshick R., He K., Dollár P. Focal loss for dense object detection. arXiv:1708.02002 [cs], 2018.
5. He K., Gkioxari G., Dollár P., Girshick R. Mask R-CNN. arXiv:1703.06870 [cs], 2018.
6. Makarychev K., Reddy A., Shan L. Improved guarantees for k-means++ and k-means++ parallel. arXiv:2010.14487 [cs], 2020.
7. tf.keras.layers.Concatenate | TensorFlow Core v2.7.0. Available at: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Concatenate?hl=ru (accessed 29.11.2021).
8. Image segmentation with Monte Carlo Dropout UNET and Keras. 42: A blog on A.I, 2019, 30 October. Available at: https://nchlis.github.io/2019_10_30/page.html (accessed 29.11.2021).
9. Ronneberger O., Fischer P., Brox T. U-Net: convolutional networks for biomedical image segmentation. arXiv:1505.04597 [cs], 2015.
10. Tan M., Le Q.V. EfficientNet: rethinking model scaling for convolutional neural networks. arXiv:1905.11946 [cs, stat], 2020.
11. tf.keras.layers.Conv2D | TensorFlow Core v2.7.0. Available at: https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2D (accessed 29.11.2021).
12. tf.keras.layers.BatchNormalization | TensorFlow Core v2.7.0. Available at: https://www.tensorflow.org/api_docs/python/tf/keras/layers/BatchNormalization?hl=ru (accessed 29.11.2021).
13. tf.keras.activations.relu | TensorFlow Core v2.7.0. Available at: https://www.tensorflow.org/api_docs/python/tf/keras/activations/relu?hl=ru (accessed 29.11.2021).
14. tf.keras.layers.UpSampling2D | TensorFlow Core v2.7.0. Available at: https://www.tensorflow.org/api_docs/python/tf/keras/layers/UpSampling2D?hl=ru (accessed 29.11.2021).
15. He K., Zhang X., Ren Sh., Sun J. Deep residual learning for image recognition. arXiv:1512.03385 [cs], 2015.
16. Huang G., Liu Z., Maaten L. van der, Weinberger K.Q. Densely connected convolutional networks. arXiv:1608.06993 [cs], 2018.
17. Tan M., Chen B., Pang R., Vasudevan V., Sandler M., Howard A., Le Q.V. MnasNet: platform-aware neural architecture search for mobile. arXiv:1807.11626 [cs], 2019.
18. Sandler M., Howard A., Zhu M., Zhmoginov A., Chen L. MobileNetV2: inverted residuals and linear bottlenecks. arXiv:1801.04381 [cs], 2019.
19. Papers with Code – CEDAR Signature Dataset. Available at: https://paperswithcode.com/dataset/cedar-signature (accessed 29.11.2021).
20. Abdallah A., Hamada M., Nurseitov D. Attention-based fully gated CNN-BGRU for Russian handwritten text. Journal of Imaging, 2020, vol. 6, no. 12, p. 141.
21. GitHub – openvinotoolkit/cvat: powerful and efficient computer vision annotation tool (CVAT). Available at: https://github.com/openvinotoolkit/cvat (accessed 29.11.2021).
22. Pokhrel S. Image data labelling and annotation – everything you need to know. Available at: https://towardsdatascience.com/image-data-labelling-and-annotation-everything-you-need-to-know-86ede6c684b1 (accessed 29.11.2021).
23. GitHub – AlexeyAB/darknet: YOLOv4 / Scaled-YOLOv4 / YOLO – Neural Networks for Object Detection (Windows and Linux version of Darknet). Available at: https://github.com/AlexeyAB/darknet (accessed 29.11.2021).
24. Yohanandan S. mAP (mean Average Precision) might confuse you! Available at: https://towardsdatascience.com/map-mean-average-precision-might-confuse-you-5956f1bfa9e2 (accessed 29.11.2021).
25. Module: tf.keras | TensorFlow Core v2.7.0. Available at: https://www.tensorflow.org/api_docs/python/tf/keras?hl=ru (accessed 29.11.2021).
26. tf.keras.optimizers.Adam | TensorFlow Core v2.7.0. Available at: https://www.tensorflow.org/api_docs/python/tf/keras/optimizers/Adam?hl=ru (accessed 29.11.2021).
27. Zulkifli H. Understanding learning rates and how it improves performance in deep learning. Available at: https://towardsdatascience.com/understanding-learning-rates-and-how-it-improves-performance-in-deep-learning-d0d4059c1c10 (accessed 29.11.2021).
28. Segmentation models Python API – segmentation models 0.1.2 documentation. Available at: https://segmentation-models.readthedocs.io/en/latest/api.html#losses (accessed 29.11.2021).
29. tf.keras.metrics.MeanIoU | TensorFlow Core v2.7.0. Available at: https://www.tensorflow.org/api_docs/python/tf/keras/metrics/MeanIoU?hl=ru (accessed 29.11.2021).

Acknowledgements. Funding

The research presented here was carried out within the framework of budget topic No. 0073-2019-0005 (2019-2021).

For citation:

Kitenko A.M. Metod poiska i razmetki artefaktov na izobrazheniyakh s ispol'zovaniyem algoritmov detektsii i segmentatsii [A method of searching and marking artifacts in images applying detection and segmentation algorithms]. Sistemy analiza i obrabotki dannykh = Analysis and Data Processing Systems, 2021, no. 4 (84), pp. 7–18. DOI: 10.17212/2782-2001-2021-4-7-18.