Addressing class imbalance in Lassa fever epidemic data, using machine learning: a case study with SMOTE and random forest
Keywords:
Lassa fever, Machine learning, SMOTE, Random forest, Class imbalance
Abstract
Class imbalance in epidemiological datasets, particularly for rare outcomes such as Lassa fever fatalities, complicates predictive modeling. This study addresses the issue by using the Synthetic Minority Over-sampling Technique (SMOTE) to rebalance the dataset and Random Forest for classification, while identifying significant predictors such as age, symptom severity, and residence. Without SMOTE, Random Forest, XGBoost, and LightGBM all achieved high accuracy (> 99%) but poor minority-class recall (≤0.75), confirming that accuracy alone masks the bias toward majority classes in imbalanced data. With SMOTE, minority-class recall for Random Forest improved from 0.60 to 1.00, and all three models achieved 100% accuracy, precision, recall, and F1-score across the major classes. A hybrid ensemble model further improved outcomes, reaching an F1-score of 0.80 for the rarest class. These results underscore the superiority of SMOTE in improving classification of underrepresented outcomes compared with relying on Random Forest alone, and demonstrate its value in developing equitable predictive tools for outbreak management.
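The two ideas in the abstract, that accuracy can stay high while minority recall is poor and that SMOTE rebalances by interpolating new minority samples, can be illustrated with a minimal sketch. This is not the authors' pipeline: the class counts, the two features (age and a symptom severity score), the predictions, and the helper names `smote_oversample` and `recall` are all invented for illustration, and the oversampler is a simplified variant that picks base samples at random.

```python
import random
from collections import Counter


def smote_oversample(minority, n_new, k=3, seed=0):
    """Simplified SMOTE: build n_new synthetic points, each lying on the
    segment between a minority sample and one of its k nearest minority
    neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base, excluding base itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


def recall(y_true, y_pred, label):
    """Fraction of true `label` cases that were predicted as `label`."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    total = sum(1 for t in y_true if t == label)
    return hits / total


# Hypothetical toy labels: 995 survivors (0) and 5 fatalities (1).
y_true = [0] * 995 + [1] * 5
# Hypothetical classifier output that misses 2 of the 5 fatalities.
y_pred = [0] * 995 + [1, 1, 1, 0, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
minority_recall = recall(y_true, y_pred, 1)
print(f"accuracy={accuracy:.3f}, minority recall={minority_recall:.2f}")

# Hypothetical minority samples as (age, symptom severity score).
minority = [(60.0, 3.0), (65.0, 4.0), (70.0, 5.0), (58.0, 4.0), (72.0, 3.0)]
synthetic = smote_oversample(minority, n_new=990)

# After oversampling, both classes have the same number of samples.
balanced_counts = Counter([0] * 995 + [1] * (len(minority) + len(synthetic)))
print(balanced_counts)
```

In this toy case the classifier is 99.8% accurate yet recovers only 3 of 5 fatalities, which is the pattern the abstract reports for the unbalanced models. In practice, oversampling should be applied to the training split only, and a library implementation such as imbalanced-learn's `SMOTE` would be the usual choice; the abstract does not say which implementation the authors used.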
Copyright (c) 2025 Osowomuabe Njama-Abang, Denis U. Ashishie, Paul T. Bukie

This work is licensed under a Creative Commons Attribution 4.0 International License.

