Model Fitness and Predictive Accuracy in Linear Mixed-Effects Models with Latent Clusters

Authors

  • Waheed B. Yahya, Department of Statistics, University of Ilorin, P.M.B. 1515, Ilorin, Kwara State
  • Yusuf Bello, Department of Mathematical Sciences, Federal University, Dutsin-Ma
  • Abdulrazaq AbdulRaheem, Department of Statistics and Mathematical Sciences, Kwara State University, Malete, P.M.B. 1530, Ilorin, Kwara State, Nigeria

Keywords:

Clustered data, Primal and dual clusters, Linear mixed-effects models, Model fitness, Predictive accuracy

Abstract

In clustered data, observations within a cluster resemble one another because they share features that distinguish them from observations in other clusters. Within a given population, more than one clustering may arise because correlation can occur along more than one dimension. Existing multilevel analysis techniques based on primal linear mixed-effects models (PLMMs) are limited to natural clusters, which are often unrealistic to capture in real-life situations. This paper therefore proposes dual linear mixed models (DLMMs) for modeling unobserved latent clusters, when these are present in a data set, to yield appreciable gains in model fitness and predictive accuracy. The DLMMs are developed and analyzed using latent clusters derived from the natural clusters by multivariate cluster analysis. A published data set from political analysis is used to demonstrate the efficiency of the proposed models. The proposed DLMMs yielded the lowest values of the model assessment criteria (Akaike information criterion, Bayesian information criterion, and root mean squared error) and hence outperformed the classical PLMMs in terms of model fitness and predictive accuracy.
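The three assessment criteria named in the abstract can be illustrated with a minimal sketch. It uses the generic textbook definitions, AIC = 2k - 2 log L and BIC = k log(n) - 2 log L, and is not the authors' code. The function names and all numbers are hypothetical, chosen only for illustration, and the convention for counting parameters in a mixed model (marginal versus conditional) is left to the reader as an assumption.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion: 2k - 2 log L (lower is better)."""
    return 2 * n_params - 2 * log_lik


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion: k log(n) - 2 log L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_lik


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    squared_errors = [(y - p) ** 2 for y, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical fitted-model summaries (illustrative numbers, not results from the paper).
# Each entry: maximized log-likelihood and number of estimated parameters.
models = {
    "model_A": {"log_lik": -120.0, "n_params": 5},
    "model_B": {"log_lik": -110.0, "n_params": 6},
}
n_obs = 80  # hypothetical sample size shared by both models

for name, fit in models.items():
    fit["aic"] = aic(fit["log_lik"], fit["n_params"])
    fit["bic"] = bic(fit["log_lik"], fit["n_params"], n_obs)

# The preferred model under each criterion is the one with the smallest value.
best_by_aic = min(models, key=lambda m: models[m]["aic"])
best_by_bic = min(models, key=lambda m: models[m]["bic"])
```

Comparing AIC or BIC across models is meaningful only when both are fitted to the same observations and their likelihoods are computed on the same basis, so a comparison like the one above assumes a common data set and estimation method.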

Published

2023-06-11

How to Cite

Model Fitness and Predictive Accuracy in Linear Mixed-Effects Models with Latent Clusters. (2023). Journal of the Nigerian Society of Physical Sciences, 5(3), 1437. https://doi.org/10.46481/jnsps.2023.1437

Section

Original Research
