Feature-optimized hybrid CNN–ViT architecture for sustainable vision-based condition assessment in agriculture

Authors

  • Muhammad Musa Liman
    Department of Computer Science, African University of Science and Technology, Abuja, Nigeria
  • Rajesh Prasad
    Department of Computer Science, African University of Science and Technology, Abuja, Nigeria
  • Hauwa Ahmad Amshi
    Department of Computer Science, African University of Science and Technology, Abuja, Nigeria
    Department of Computer Science, Federal University, Gashua, Nigeria

Keywords:

Plant disease detection, Hybrid CNN–ViT, Multi-crop classification, Vision transformer, Feature engineering

Abstract

Detecting structural and physiological changes in plants at an early stage is a difficult computer-vision problem, owing to large intra-class variation and environmental noise. This paper combines feature enhancement using vegetation indices (Excess Green, ExG, and Excess Red, ExR) with Principal Component Analysis (PCA)-based feature compression and an asymmetric CNN–ViT fusion architecture for multi-crop plant disease classification. Preprocessing extracts the ExG and ExR indices, applies statistical normalization, and compresses the resulting features with PCA, enhancing discriminative power while reducing redundant spectral information. The CNN branch generates hierarchical texture encodings, while the ViT branch produces self-attention encodings better suited to capturing global associations; a cross-domain fusion layer combines these complementary feature spaces, improving representational capacity. The proposed system achieves high classification accuracy (98%) and robustness across multiple crop datasets. Edge efficiency and explainability remain to be addressed before deployment in real-world agricultural scenarios and are outlined as future work.
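
The preprocessing stage described in the abstract (ExG/ExR extraction, statistical normalization, PCA-based compression) can be sketched as follows. This is a minimal illustration assuming RGB inputs scaled to [0, 1]; the function names, feature dimensions, and component count are illustrative assumptions, not the authors' actual implementation.

```python
# Sketch of the feature-enhancement + compression pipeline: vegetation
# indices (ExG, ExR), z-score normalization, then PCA compression.
import numpy as np
from sklearn.decomposition import PCA

def vegetation_indices(img):
    """img: (H, W, 3) RGB in [0, 1] -> (H, W, 2) stacked ExG/ExR maps."""
    s = img.sum(axis=2, keepdims=True) + 1e-8   # avoid division by zero
    r, g, b = np.moveaxis(img / s, 2, 0)        # chromatic coordinates
    exg = 2.0 * g - r - b                       # Excess Green index
    exr = 1.4 * r - g                           # Excess Red index
    return np.stack([exg, exr], axis=2)

def preprocess(images, n_components=8):
    """Flatten index maps, z-score normalize, compress with PCA."""
    feats = np.stack([vegetation_indices(im).ravel() for im in images])
    feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)
    return PCA(n_components=n_components).fit_transform(feats)

rng = np.random.default_rng(0)
batch = rng.random((16, 32, 32, 3))             # 16 synthetic RGB "leaves"
compressed = preprocess(batch, n_components=8)
print(compressed.shape)                         # -> (16, 8)
```

The compressed vectors would then feed the downstream classifier; in the paper's full architecture, the CNN and ViT branches instead consume image inputs and their outputs are merged by the cross-domain fusion layer.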

Published

2026-05-14

How to Cite

Feature-optimized hybrid CNN–ViT architecture for sustainable vision-based condition assessment in agriculture. (2026). Journal of the Nigerian Society of Physical Sciences, 8(2), 3301. https://doi.org/10.46481/jnsps.2026.3301

Section

Computer Science
