Robust hybrid algorithms for regularization and variable selection in QSAR studies
Keywords:
High dimension, QSAR, Multicollinearity, Outliers, Sparse least trimmed squares, Random forest
Abstract
This study introduces a robust hybrid sparse learning approach for regularization and variable selection. The approach comprises two steps. In the first step, we split the original dataset into training and test sets and standardize the training data using its mean and standard deviation. We then apply either the LASSO or the sparse LTS algorithm to the training set and retain the variables with non-zero coefficients as the features of a new, reduced dataset. In the second step, the new dataset is divided into training and test sets, and the training set is further divided into k folds and evaluated using Random Forest, Ridge, LASSO, and Support Vector Regression. We introduce novel hybrid methods and compare their performance against existing techniques. To validate the proposed methods, we conduct a comprehensive simulation study and apply them to a real-life QSAR analysis. The results show that the proposed estimators outperform the existing techniques, with SLTS+LASSO standing out in particular. In summary, the two-step robust hybrid sparse learning approach offers an effective means of regularization and variable selection that is applicable to a wide spectrum of real-world problems.
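The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (generated with `make_regression`), all hyperparameters are arbitrary, and step 1 uses only the LASSO variant because scikit-learn, which is assumed here, does not provide a sparse LTS estimator. The variable names (`selected`, `cv_rmse`) are illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LassoCV, Ridge
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in for a descriptor matrix (assumption, not the paper's data).
X, y = make_regression(n_samples=120, n_features=60, n_informative=8,
                       noise=5.0, random_state=0)

# Step 1: split, standardize with the training mean and standard deviation,
# fit LASSO on the training set, and keep variables with non-zero coefficients.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
scaler = StandardScaler().fit(X_tr)
selector = LassoCV(cv=5, random_state=0).fit(scaler.transform(X_tr), y_tr)
selected = np.flatnonzero(selector.coef_)

# Step 2: build the reduced dataset, split it again, and evaluate several
# learners on the new training set with k-fold cross-validation (k = 5 here).
X_new = X[:, selected]
X_tr2, X_te2, y_tr2, y_te2 = train_test_split(X_new, y, test_size=0.3,
                                              random_state=1)
models = {
    "RandomForest": RandomForestRegressor(n_estimators=100, random_state=0),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=1.0),
    "SVR": SVR(kernel="rbf", C=10.0),
}
cv = KFold(n_splits=5, shuffle=True, random_state=0)
cv_rmse = {}
for name, model in models.items():
    # Scaling sits inside the pipeline so each fold is standardized
    # using only its own training portion.
    pipe = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipe, X_tr2, y_tr2, cv=cv,
                             scoring="neg_root_mean_squared_error")
    cv_rmse[name] = -scores.mean()

print(f"{selected.size} of {X.shape[1]} variables retained")
for name, rmse in cv_rmse.items():
    print(f"{name}: mean CV RMSE = {rmse:.2f}")
```

A robust variant would replace `LassoCV` in step 1 with a sparse LTS fit, which trims the observations with the largest residuals before applying the L1 penalty; the rest of the sketch would stay the same.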
Copyright (c) 2023 Adewale F. Lukman, Christian N. Nwaeme

This work is licensed under a Creative Commons Attribution 4.0 International License.

