Geographically weighted regression random forest for modeling soil particles

Authors

  • Atiek Iriany
    Department of Statistics, Faculty of Mathematics and Natural Science, Brawijaya University, Malang, 65144, Indonesia
  • Wigbertus Ngabu
    Mathematics Education Study Program, University of Riau, Pekanbaru, 28293, Indonesia
    https://orcid.org/0000-0002-7153-7310
  • Henny Pramoedyo
    Department of Statistics, Faculty of Mathematics and Natural Science, Brawijaya University, Malang, 65144, Indonesia
  • Amarifai
    Department of Statistics, Faculty of Mathematics and Natural Science, Brawijaya University, Malang, 65144, Indonesia

Keywords:

Clay particles, GWR, GWRRF, Hybrid model, Random forest

Abstract

Clay particles play a vital role in determining soil quality, particularly in agriculture and conservation. However, their spatial distribution is complex and non-linear, which makes it difficult to capture with conventional modeling methods. This study develops a hybrid model, Geographically Weighted Regression Random Forest (GWRRF), which combines the ability of Geographically Weighted Regression (GWR) to capture spatial heterogeneity with the strength of Random Forest (RF) in handling non-linear relationships. The data were derived from soil texture and local morphological analysis at 50 observation points in the Kalikonto watershed. The results show that the GWRRF model predicts clay particle content more accurately than the GWR model: GWRRF achieved an R² of 0.735 and a lower RMSE, whereas GWR achieved an R² of 0.575. Integrating both methods in GWRRF offers a novel, adaptive, and context-aware approach to understanding the distribution of clay particles, contributing to more precise and sustainable data-driven land management.
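The abstract does not spell out how GWRRF is formulated, so the following is only a minimal sketch of two ingredients it mentions: the locally weighted fitting that lets GWR coefficients vary over space, and the R² and RMSE measures used for the comparison. The Gaussian kernel, the bandwidth value, the single predictor, and the toy coordinates and values are all assumptions made for illustration; none of them come from the study, and the Random Forest stage is deliberately left out.

```python
import math


def gaussian_weights(coords, target, bandwidth):
    """Kernel weights: observations close to `target` count more (assumed Gaussian kernel)."""
    return [
        math.exp(-0.5 * (math.dist(c, target) / bandwidth) ** 2)
        for c in coords
    ]


def local_fit(coords, x, y, target, bandwidth):
    """Weighted least squares for y = b0 + b1 * x, calibrated at `target`."""
    w = gaussian_weights(coords, target, bandwidth)
    sw = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def rmse(y, y_hat):
    """Root mean squared error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


# Hypothetical toy data: ten sites on a west-east transect. The true slope
# is 1 for the five western sites and 3 for the five eastern sites.
coords = [(float(i), 0.0) for i in range(10)]
x = [1.0, 2.0, 3.0, 1.5, 2.5, 3.5, 1.0, 2.0, 3.0, 2.5]
y = [(1.0 if i < 5 else 3.0) * xi for i, xi in enumerate(x)]

BANDWIDTH = 1.5  # assumed value, not tuned

# One local model per site, each used to predict its own site.
fits = [local_fit(coords, x, y, c, BANDWIDTH) for c in coords]
y_hat = [b0 + b1 * xi for (b0, b1), xi in zip(fits, x)]

west_slope = fits[0][1]
east_slope = fits[-1][1]
local_r2 = r_squared(y, y_hat)
local_rmse = rmse(y, y_hat)
```

In this toy setting the slope estimated at the western end stays close to 1 and the slope at the eastern end stays close to 3, which is the kind of spatial heterogeneity a single global regression would average away. A hybrid in the spirit of GWRRF would replace or complement the local linear fit with a tree ensemble; how the authors do this is not stated here.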

Published

2026-03-14

How to Cite

Geographically weighted regression random forest for modeling soil particles. (2026). Journal of the Nigerian Society of Physical Sciences, 8(2), 2939. https://doi.org/10.46481/jnsps.2026.2939

Section

Mathematics & Statistics
