A blind navigation guide model for obstacle avoidance using distance vision estimation based YOLO-V8n

Authors

  • Ebere Uzoka Chidi Department of Electronics Engineering, University of Nigeria, Nsukka, Enugu State Nigeria
  • Edward Anoliefo Department of Electronics Engineering, University of Nigeria, Nsukka, Enugu State Nigeria
  • Collins Udanor Department of Computer Science, University of Nigeria, Nsukka, Enugu State, Nigeria
  • Asogwa Tochukwu Chijindu Department of Computer Science, Enugu State University of Science and Technology
  • Lois Onyejere Nwobodo Department of Computer Engineering, Enugu State University of Science and Technology

Keywords:

YOLO-V8n, COCO dataset, Blind guide, DVE, WFE

Abstract

An obstacle is an object positioned along a path of travel with the potential to cause a collision and hence an accident. Over the years, many papers have applied advanced computer vision techniques, particularly transfer learning algorithms, to this problem; despite their success, in specific vision applications such as blind-guide navigation systems these models find it difficult to distinguish between objects and obstacles recognized in the same video frame, and the problem continues to attract research attention. The aim of this paper was to develop a blind navigation guide model for obstacle avoidance using distance vision estimation based on YOLO-V8n. To achieve this, an improved data model was developed using the MS COCO dataset together with primary data collected from several indoor environments. The YOLO-V8n architecture was then improved by adding a Weighted Feature Enhancement (WFE) model to the backbone for better feature extraction, and a Bi-directional Feature Pyramid Network (Bi-FPN) was applied to the neck to improve multi-scale feature representation. In addition, a Distance Vision Estimation (DVE) algorithm was developed and applied to the Bi-FPN before connecting it to the head of the YOLO-V8n, enabling simultaneous object detection and distance measurement in real-time video. Furthermore, the issue of bounding-box overlap was addressed by applying a Wise Intersection over Union (WIoU) loss function. Collectively, these form the new transfer learning algorithm, YOLO-V8n+WFE+Bi-FPN+DVE+WIoU, used in this work for high-level obstacle detection and distance estimation. The model was trained under different experimental YOLO-V8 architectures and loss functions, evaluated with precision, recall, mean average precision, and average precision, and then validated through comparative analysis.
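The abstract does not reproduce the Bi-FPN fusion rule itself; the weighted fusion commonly used in Bi-FPN is the "fast normalized fusion" introduced with EfficientDet, in which non-negative learnable weights are normalized before combining feature maps. A minimal NumPy sketch, assuming same-resolution inputs (the function name and toy inputs are illustrative, not the paper's implementation):

```python
import numpy as np

def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-shaped feature maps with non-negative learnable weights.

    Fast normalized fusion (as in Bi-FPN/EfficientDet):
        out = sum_i w_i * f_i / (sum_j w_j + eps),  with w_i clipped to >= 0
    """
    w = np.maximum(np.asarray(weights, dtype=np.float64), 0.0)
    w = w / (w.sum() + eps)  # normalize so the weights sum to ~1
    return sum(wi * fi for wi, fi in zip(w, features))

# Toy example: fuse two 4x4 single-channel maps with weights (2, 1),
# so the first map contributes roughly two thirds of the output.
a = np.ones((4, 4))
b = np.zeros((4, 4))
fused = fast_normalized_fusion([a, b], [2.0, 1.0])
```

In a full detector these weights are trained jointly with the network; the `eps` term keeps the normalization stable when all weights shrink toward zero.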
Upon selection of the best model, it was further validated through comparison with other state-of-the-art algorithms before deployment for obstacle avoidance in an indoor environment, having satisfied the condition of reliability. Real-world testing was performed at four different indoor sites, and the results showed that the model not only classified objects correctly but also measured their distances accurately, making it suitable for deployment as a blind-guide vision navigation system.
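The details of the DVE algorithm are not given in this abstract. A common monocular baseline against which such a distance estimator can be compared is the pinhole-camera size model, which infers range from the apparent height of a detected object; the sketch below uses this baseline (the function name, parameters, and calibration values are illustrative assumptions, not the paper's method):

```python
def estimate_distance(focal_px, real_height_m, bbox_height_px):
    """Monocular distance from apparent size (pinhole camera model):

        Z = f * H / h

    where f is the focal length in pixels, H the known real-world
    object height in metres, and h the bounding-box height in pixels.
    """
    if bbox_height_px <= 0:
        raise ValueError("bounding box height must be positive")
    return focal_px * real_height_m / bbox_height_px

# Example: a 1.7 m person imaged 340 px tall by a camera whose
# focal length is 700 px gives a range of about 3.5 m.
d = estimate_distance(700.0, 1.7, 340.0)  # ≈ 3.5 m
```

This baseline requires a known (or assumed) physical object size per class and a calibrated focal length; learned approaches such as the paper's DVE avoid hard-coding these per-class priors.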


Published

2025-02-01

How to Cite

A blind navigation guide model for obstacle avoidance using distance vision estimation based YOLO-V8n. (2025). Journal of the Nigerian Society of Physical Sciences, 7(1), 2292. https://doi.org/10.46481/jnsps.2025.2292

Issue

Section

Computer Science
