Efficient-ViT B0Net: A high-performance light weight transformer for rice leaf disease recognition and classification
Keywords:
Plant disease, Deep learning, Efficient net B0, Vision transformer
Abstract
Plant disease detection is a demanding task because many plant species exist worldwide and the infections that affect them vary widely. This work introduces a hybrid architecture for accurate and efficient plant disease recognition and classification. The proposed model combines the strengths of a convolutional neural network (CNN) and a Vision Transformer (ViT): the CNN quickly extracts local, fine-grained texture features, while the ViT extracts global, deep features from the leaf images. The model was evaluated on a rice leaf dataset for paddy disease recognition and classification. The dataset contains four classes of rice leaf images, with 4,000 samples per class: healthy leaves and three disease classes (Brown Spot, Bacterial Leaf Blight, and Leaf Smut). The model achieved a testing accuracy of 99.13%, with precision, recall, and F1 score also recorded as 99.13% each. It outperformed state-of-the-art (SOTA) models such as ViT-small, DenseNet121, ResNet50, EfficientNet B0, and SqueezeNet by 2–9% on the same dataset, with all methods compared in the same experimental environment. These results demonstrate the effectiveness of the EfficientNet-ViT-based pipeline in capturing both local and global features for accurate rice disease classification.
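
The abstract reports accuracy, precision, recall, and F1 score for a four-class problem. As a minimal sketch of how such figures can be derived from a confusion matrix, the Python snippet below computes accuracy and macro-averaged precision, recall, and F1. The function name `macro_metrics`, the use of macro averaging, and the confusion matrix values are assumptions made for illustration only; they are not the authors' evaluation code or their results.

```python
# Illustrative sketch only: the matrix below is hypothetical, not data from the paper.
CLASSES = ["Healthy", "Brown Spot", "Bacterial Leaf Blight", "Leaf Smut"]


def macro_metrics(confusion):
    """Return accuracy and macro-averaged precision, recall, and F1.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = confusion[k][k]
        predicted_k = sum(confusion[i][k] for i in range(n))  # column sum
        actual_k = sum(confusion[k])                          # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical balanced test split: 100 samples per class.
example = [
    [99, 1, 0, 0],
    [0, 98, 2, 0],
    [0, 1, 99, 0],
    [0, 0, 0, 100],
]

metrics = macro_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

When every class has the same number of test samples, macro-averaged recall equals accuracy by construction, so closely matching values across these metrics are expected on a balanced dataset.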

Copyright (c) 2025 Santosh Kumar Upadhyay, Rajesh Prasad

This work is licensed under a Creative Commons Attribution 4.0 International License.