A feature selection and scoring scheme for dimensionality reduction in a machine learning task
Keywords:
Algorithm, Dataset, Dimensionality reduction, Feature selection
Abstract
Selecting important features is vital in machine learning tasks that involve high-dimensional datasets with a large number of features. It reduces the dimensionality of a dataset and improves model performance. Most feature selection techniques are restricted in the kind of dataset they can be applied to. This study proposes a feature selection technique based on the statistical lift measure to select important features from a dataset. The proposed technique is a generic approach that can be used on any binary classification dataset. It was tested on a lung cancer dataset and a happiness classification dataset, and its effectiveness in selecting important feature subsets was evaluated and compared with three existing techniques, namely Chi-Square, Pearson Correlation and Information Gain. Both the proposed and the existing techniques were evaluated on five machine learning models using four standard evaluation metrics: accuracy, precision, recall and F1-score. On the lung cancer dataset, the proposed technique yielded predictive accuracies of 0.919, 0.935, 0.919, 0.935 and 0.935 for logistic regression, decision tree, AdaBoost, gradient boosting and random forest, respectively. On the happiness classification dataset, it yielded predictive accuracies of 0.758, 0.689, 0.724, 0.655 and 0.689 for random forest, k-nearest neighbor, decision tree, gradient boosting and CatBoost, respectively. In both cases, the proposed technique successfully determined the most important feature subset and outperformed the existing techniques.
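The abstract does not give the exact scoring formula, so the following is only a minimal sketch of how a lift-based feature score might be computed for a binary classification dataset. It assumes binary (0/1) features and the common definition lift = P(target = 1 | feature = 1) / P(target = 1). The function names, the toy rows, and the ranking rule (distance of the lift from 1.0, the value expected under independence) are illustrative assumptions, not the authors' published method.

```python
from typing import Dict, List, Sequence


def lift_scores(rows: Sequence[Dict[str, int]], target: str) -> Dict[str, float]:
    """Return lift = P(target=1 | feature=1) / P(target=1) for each binary feature.

    Assumption: every feature and the target are encoded as 0/1.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    base_rate = sum(r[target] for r in rows) / len(rows)
    if base_rate == 0:
        raise ValueError("target has no positive examples")

    scores: Dict[str, float] = {}
    for name in rows[0]:
        if name == target:
            continue
        active = [r for r in rows if r[name] == 1]
        if not active:
            # Feature is never 1, so the conditional rate is undefined; score it 0.
            scores[name] = 0.0
            continue
        cond_rate = sum(r[target] for r in active) / len(active)
        scores[name] = cond_rate / base_rate
    return scores


def select_top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Keep the k features whose lift is farthest from 1.0 (hypothetical rule)."""
    return sorted(scores, key=lambda f: abs(scores[f] - 1.0), reverse=True)[:k]


# Toy data with hypothetical feature names "f1" and "f2"; not from the paper.
toy_rows = [
    {"f1": 1, "f2": 1, "label": 1},
    {"f1": 1, "f2": 0, "label": 1},
    {"f1": 1, "f2": 1, "label": 1},
    {"f1": 0, "f2": 1, "label": 0},
    {"f1": 0, "f2": 0, "label": 0},
    {"f1": 0, "f2": 1, "label": 1},
]
toy_scores = lift_scores(toy_rows, target="label")
print(toy_scores)
print(select_top_k(toy_scores, k=1))
```

A lift above 1 indicates that the positive class is more frequent when the feature is present than in the dataset overall, and a lift below 1 indicates the opposite. The selected subset would then be passed to the classifiers and compared on accuracy, precision, recall and F1-score.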
Copyright (c) 2024 Philemon Uten Emmoh, Christopher Ifeanyi Eke, Timothy Moses

This work is licensed under a Creative Commons Attribution 4.0 International License.

