A feature selection and scoring scheme for dimensionality reduction in a machine learning task
Keywords:
Algorithm, Dataset, Dimensionality reduction, Feature selection
Abstract
Selecting important features is vital in machine learning tasks that involve high-dimensional datasets with many features. It reduces the dimensionality of a dataset and improves model performance. Most feature selection techniques are restricted in the kind of dataset they can be applied to. This study proposes a feature selection technique based on the statistical lift measure to select important features from a dataset. The proposed technique is a generic approach that can be used on any binary classification dataset. It was tested on a lung cancer dataset and a happiness classification dataset. Its effectiveness in selecting important feature subsets was evaluated and compared with three existing techniques, namely Chi-Square, Pearson Correlation and Information Gain. Both the proposed and the existing techniques were evaluated on five machine learning models using four standard evaluation metrics: accuracy, precision, recall and F1-score. The technique successfully determined the most important feature subset and outperformed the existing techniques. On the lung cancer dataset, logistic regression, decision tree, AdaBoost, gradient boost and random forest produced predictive accuracies of 0.919, 0.935, 0.919, 0.935 and 0.935 respectively. On the happiness classification dataset, random forest, k-nearest neighbor, decision tree, gradient boost and CatBoost produced predictive accuracies of 0.758, 0.689, 0.724, 0.655 and 0.689 respectively.
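The abstract does not give the scoring formula, so the following is only a minimal sketch of how a lift-based feature score could look for binary features and a binary target. It assumes the standard association-rule definition, lift = P(target = 1 | feature = 1) / P(target = 1), and ranks features by how far their lift deviates from 1 (the value expected under independence). The function names `lift_scores` and `select_top_k`, the ranking rule and the toy rows are illustrative assumptions, not the authors' implementation or data.

```python
from typing import Dict, List, Sequence


def lift_scores(rows: Sequence[Dict[str, int]], target: str) -> Dict[str, float]:
    """Score each binary feature by lift(feature=1 -> target=1).

    Assumed definition: lift = P(target=1 | feature=1) / P(target=1).
    """
    n = len(rows)
    if n == 0:
        raise ValueError("rows must not be empty")
    p_target = sum(r[target] for r in rows) / n
    if p_target == 0:
        raise ValueError("target has no positive examples")

    scores: Dict[str, float] = {}
    for name in rows[0]:
        if name == target:
            continue
        # Rows in which the feature is present (value 1).
        present = [r for r in rows if r[name] == 1]
        if not present:
            # Feature never occurs; treat it as carrying no positive evidence.
            scores[name] = 0.0
            continue
        p_cond = sum(r[target] for r in present) / len(present)
        scores[name] = p_cond / p_target
    return scores


def select_top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Keep the k features whose lift is farthest from 1 (an assumed rule)."""
    ranked = sorted(scores, key=lambda name: abs(scores[name] - 1.0), reverse=True)
    return ranked[:k]


# Hypothetical toy rows, for illustration only (not the study's data).
toy_rows = [
    {"smoking": 1, "cough": 1, "cancer": 1},
    {"smoking": 1, "cough": 0, "cancer": 1},
    {"smoking": 0, "cough": 1, "cancer": 0},
    {"smoking": 0, "cough": 0, "cancer": 0},
]
toy_scores = lift_scores(toy_rows, target="cancer")
print(toy_scores)
print(select_top_k(toy_scores, k=1))
```

A lift above 1 means the feature co-occurs with the positive class more often than chance would predict, and a lift below 1 means less often. Non-binary features would need to be discretized first, which this sketch does not cover.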
Copyright (c) 2024 Philemon Uten Emmoh, Christopher Ifeanyi Eke, Timothy Moses

This work is licensed under a Creative Commons Attribution 4.0 International License.

