Feature-optimized hybrid CNN–ViT architecture for sustainable vision-based condition assessment in agriculture

Authors

  • Muhammad Musa Liman
    Department of Computer Science, African University of Science and Technology, Abuja, Nigeria
  • Rajesh Prasad
    Department of Computer Science, African University of Science and Technology, Abuja, Nigeria
  • Hauwa Ahmad Amshi
    Department of Computer Science, African University of Science and Technology, Abuja, Nigeria
    Department of Computer Science, Federal University, Gashua, Nigeria

Keywords:

Plant disease detection, Hybrid CNN–ViT, Multi-crop classification, Vision transformer, Feature engineering

Abstract

Detecting structural and physiological changes in plants at an early stage is a difficult computer-vision problem, owing to large intra-class variation and environmental noise. This paper combines feature enhancement using vegetation indices (Excess Green, ExG, and Excess Red, ExR) with Principal Component Analysis (PCA)-based feature compression and an asymmetric CNN–ViT fusion architecture for multi-crop plant disease classification. Preprocessing extracts the ExG and ExR indices, applies statistical normalization, and compresses the resulting features with PCA, enhancing discriminative power while reducing redundant spectral information. The CNN branch generates hierarchical texture encodings, while the ViT branch produces self-attention encodings better suited to capturing global associations; a cross-domain fusion layer combines these complementary feature spaces, improving representational capacity. The proposed system achieves high classification accuracy (98%) and robustness across multiple crop datasets. Edge efficiency and explainability remain to be addressed before deployment in real-world agricultural scenarios and are outlined as future work.
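
The preprocessing stage described in the abstract (ExG/ExR extraction, statistical normalization, PCA-based compression) can be sketched as follows. This is a minimal illustration assuming RGB inputs scaled to [0, 1]; the function names, feature dimensions, and component count are illustrative assumptions, not the authors' actual implementation.

```python
# Sketch of the feature-enhancement + compression pipeline: vegetation
# indices (ExG, ExR), z-score normalization, then PCA compression.
import numpy as np
from sklearn.decomposition import PCA

def vegetation_indices(img):
    """img: (H, W, 3) RGB in [0, 1] -> (H, W, 2) stacked ExG/ExR maps."""
    s = img.sum(axis=2, keepdims=True) + 1e-8   # avoid division by zero
    r, g, b = np.moveaxis(img / s, 2, 0)        # chromatic coordinates
    exg = 2.0 * g - r - b                       # Excess Green index
    exr = 1.4 * r - g                           # Excess Red index
    return np.stack([exg, exr], axis=2)

def preprocess(images, n_components=8):
    """Flatten index maps, z-score normalize, compress with PCA."""
    feats = np.stack([vegetation_indices(im).ravel() for im in images])
    feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)
    return PCA(n_components=n_components).fit_transform(feats)

rng = np.random.default_rng(0)
batch = rng.random((16, 32, 32, 3))             # 16 synthetic RGB "leaves"
compressed = preprocess(batch, n_components=8)
print(compressed.shape)                         # -> (16, 8)
```

The compressed vectors would then feed the downstream classifier; in the paper's full architecture, the CNN and ViT branches instead consume image inputs and their outputs are merged by the cross-domain fusion layer.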

Published

2026-05-14

How to Cite

Feature-optimized hybrid CNN–ViT architecture for sustainable vision-based condition assessment in agriculture. (2026). Journal of the Nigerian Society of Physical Sciences, 8(2), 3301. https://doi.org/10.46481/jnsps.2026.3301

Section

Computer Science
