The Efficiency of the K-L Estimator for the Seemingly Unrelated Regression Model: Simulation and Application

Alaba & Kibria / J. Nig. Soc. Phys. Sci. 5 (2023) 1514

This paper considers the Ridge Feasible Generalized Least Squares Estimator (RFGLSE) and the Ridge Seemingly Unrelated Regression (R_SUR) estimator, and proposes the Kibria-Lukman (KL_SUR) estimator for the parameters of the Seemingly Unrelated Regression (SUR) model when the regressors of the models are collinear. A simulation study was conducted to compare the performance of the three different types of estimators for the SUR model. Different correlation levels (0.0, 0.1, 0.2, ..., 0.9) among the independent variables, several sample sizes, each replicated 10000 times, and contemporaneous error correlations (0.0, 0.1, 0.2, ..., 0.9) among the equations were assumed for the simulation study. The efficiency of the three estimators (RFGLSE, R_SUR, and KL_SUR) for SUR, when the predictors are correlated, was investigated using the Trace Mean Square Error (TMSE). The results showed that the KL_SUR estimator outperformed the other estimators except for a few cases when the sample size is small.


Introduction
One of the most ingenious and groundbreaking research directions in econometrics is combining different single equations into a system of equations to improve the efficiency of the parameter estimation [1], [2]. The Seemingly Unrelated Regression (SUR) model with M equations and T observations is given as

y_i = X_i \beta_i + \varepsilon_i,  i = 1, 2, ..., M,

where y_i is a T x 1 vector of observations on the i-th response variable, X_i is a fixed T x k_i matrix of explanatory variables, \beta_i is a k_i x 1 vector of unknown regression parameters, and \varepsilon_i is a T x 1 vector of disturbances such that cov(\varepsilon) = E[\varepsilon \varepsilon'] = \Sigma \otimes I_T and E(\varepsilon) = 0.
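The stacked SUR system and its GLS estimator can be illustrated with a minimal numpy sketch. This is our own illustration, not the authors' code: the two-equation design, the coefficient values, and the error covariance below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
T, M = 50, 2  # observations per equation, number of equations

# Hypothetical designs and coefficients for a 2-equation SUR system.
X1 = np.column_stack([np.ones(T), rng.normal(size=T)])
X2 = np.column_stack([np.ones(T), rng.normal(size=T)])
beta1, beta2 = np.array([1.0, 2.0]), np.array([0.5, -1.0])

# Contemporaneously correlated errors: cov(eps) = Sigma kron I_T.
Sigma = np.array([[1.0, 0.6], [0.6, 1.0]])
E = rng.multivariate_normal(np.zeros(M), Sigma, size=T)  # (T, M)
y1 = X1 @ beta1 + E[:, 0]
y2 = X2 @ beta2 + E[:, 1]

# Stack the system: block-diagonal X, concatenated y.
k1, k2 = X1.shape[1], X2.shape[1]
X = np.zeros((M * T, k1 + k2))
X[:T, :k1], X[T:, k1:] = X1, X2
y = np.concatenate([y1, y2])

# GLS: beta_hat = (X'(Sigma^{-1} kron I)X)^{-1} X'(Sigma^{-1} kron I)y.
Omega_inv = np.kron(np.linalg.inv(Sigma), np.eye(T))
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ y)
print(beta_gls)  # close to [1.0, 2.0, 0.5, -1.0]
```

In practice Sigma is unknown and is replaced by an estimate S built from equation-by-equation OLS residuals, which yields the feasible GLS estimator discussed below.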
The Ordinary Least Squares (OLS) estimator is widely used to estimate the unknown regression parameters one equation at a time, since each equation is a classical regression. The SUR estimator captures the different regression equations simultaneously. However, the efficiency gain in SUR is premised on a high level of contemporaneous correlation between or among the error terms of the individual regression equations. Notable works on the efficiency gained in SUR which take cognizance of the contemporaneous correlation of error terms in the joint equations include [3], [4], [5], [6], [7], [8], [9], [10], [11], among others. The Generalized Least Squares Estimator (GLSE) uses an estimate of the variance-covariance matrix of the disturbances to estimate the parameters of the SUR model.
It is a statistical fallacy to assume that the relationship between or among explanatory variables has an insignificant effect on the error structure of the model. The severity of the correlation among the predictors can affect the efficiency and sensitivity of the estimators [12], [13]. Under multicollinearity, the variance of the estimator is inflated, inference becomes unreliable, and the confidence intervals become wider, which may increase the probability of a type-II error in hypothesis testing of the unknown parameters [14]. Numerous studies on single-equation models with inherent multicollinearity are available in the literature; notable works include [15], [16], [17], [18], [19], [20], [21], [22], among others. Studies of multicollinearity in systems of regression equations are still scarce in the literature; notable exceptions are [23], [24].
Recently, shrinkage estimators such as ridge regression estimators have gained traction among researchers, so that quite a number of exciting estimators have emerged [17], [25], [26]. [19] proposed the K-L estimator to tackle the correlated-regressors problem for the classical linear regression model; it outperformed both the Generalized Least Squares Estimator (GLSE) and the Ridge Feasible Generalized Least Squares Estimator (RFGLSE). The objective of this paper is to develop an estimator suitable for joint modelling, the Kibria-Lukman Seemingly Unrelated Regression (KL_SUR) estimator, for use when the predictors are correlated, and to compare the newly developed estimator with the existing Ridge Seemingly Unrelated Regression estimator (R_SUR) and the Ridge Feasible Generalized Least Squares Estimator (RFGLSE).
The organization of the paper is as follows: The estimators and their Trace Mean Square Error (TMSE) expressions are given in Section 2. A simulation study is presented in Section 3. To illustrate the findings of the paper, real-life data are analysed in Section 4. The paper ends with some concluding remarks in Section 5.

Statistical Methodology
The GLSE is given as

\hat{\beta}_{GLSE} = (X'(\Sigma^{-1} \otimes I)X)^{-1} X'(\Sigma^{-1} \otimes I)y.   (1)

The ridge parameter estimator is an important tool when explanatory variables are correlated in classical linear regression analysis. The ridge estimation technique was pioneered by [15] and extended to the SUR model by [25], [26], [27]. The ridge estimator for the GLSE is given as

\hat{\beta}_{R} = (X'(\Sigma^{-1} \otimes I)X + G)^{-1} X'(\Sigma^{-1} \otimes I)y,   (2)

where G is a k x k matrix of non-negative elements characterizing the estimator. To circumvent the problem that \Sigma is unknown, the Ridge Feasible Generalized Least Squares Estimator (RFGLSE) is given as

\hat{\beta}_{RFGLSE} = (X'(S^{-1} \otimes I)X + G)^{-1} X'(S^{-1} \otimes I)y,   (3)

where \hat{\beta}_{FGLSE} = (X'(S^{-1} \otimes I)X)^{-1} X'(S^{-1} \otimes I)y, \hat{\Psi}^{-1} = S^{-1} \otimes I, and S is a consistent estimate of \Sigma.

From (1), let \Lambda be the diagonal matrix of the eigenvalues and \psi the matrix whose columns are the eigenvectors of X^{*'}X^{*} of the system of equations. The canonical version of (1) is defined as

y^{*} = Z^{*}\alpha^{*} + \varepsilon,   (4)

where Z^{*} = X^{*}\psi, \alpha^{*} = \psi'\beta and Z^{*'}Z^{*} = \psi'X^{*'}X^{*}\psi = \Lambda. The GLSE of the SUR model for \alpha^{*} is

\hat{\alpha}^{*} = \Lambda^{-1}Z^{*'}y^{*}.   (5)

The ridge regression estimator is

\hat{\alpha}_{R}(k) = (\Lambda + kI)^{-1}Z^{*'}y^{*}.   (6)

Using (5) and (6), and following [19], we define the KL_SUR estimator as

\hat{\alpha}_{KL}(k) = (\Lambda + kI)^{-1}(\Lambda - kI)\hat{\alpha}^{*}.   (7)

The K-L estimator is unbiased when k = 0, since it then reduces to the GLSE.
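The canonical-form estimators above can be sketched in a few lines of numpy. This is a minimal illustration under our own assumed setup (a random design and response); the function names are ours, and at k = 0 all three estimators should coincide, matching the unbiasedness remark above.

```python
import numpy as np

def canonical_estimators(Z, y, k):
    """GLS, ridge, and K-L estimators in the canonical form.

    Z : (n, p) transformed design with Z'Z = Lambda (diagonal).
    y : (n,) response in the same transformed coordinates.
    k : non-negative shrinkage parameter.
    """
    lam = np.diag(Z.T @ Z)              # eigenvalues Lambda (diagonal)
    Zty = Z.T @ y
    alpha_gls = Zty / lam               # Lambda^{-1} Z'y
    alpha_ridge = Zty / (lam + k)       # (Lambda + kI)^{-1} Z'y
    # K-L: (Lambda + kI)^{-1} (Lambda - kI) alpha_gls
    alpha_kl = (lam - k) / (lam + k) * alpha_gls
    return alpha_gls, alpha_ridge, alpha_kl

# Hypothetical data, rotated to canonical coordinates.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))
_, V = np.linalg.eigh(X.T @ X)          # eigenvectors of X'X
Z = X @ V                               # Z'Z is diagonal
y = rng.normal(size=30)
g, r, kl = canonical_estimators(Z, y, k=0.0)
# with k = 0 the three estimators coincide
```

For k > 0 the K-L estimator shrinks each canonical coefficient by the factor (lambda_j - k)/(lambda_j + k), more aggressively than the ridge factor lambda_j/(lambda_j + k).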

Numerical Analysis
A simulation study is considered in this section to compare the performance of the estimators. It consists of two parts: (i) the simulation study and (ii) a discussion of results.

Simulation Study
The Monte Carlo experiment was performed by generating data according to the following algorithm:

3. Specify the variance-covariance matrix \Sigma_e of the contemporaneous errors.
4. Simulate the vector of random errors from MVN_3(0, \Sigma_e).
5. For a given X structure, transform the original model to the canonical form.
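One replication of such an experiment might be sketched as follows. This is a hedged illustration, not the authors' exact design: for brevity it shows the single-equation analogue, and the regressor-generation scheme with correlation parameter rho is the common McDonald-Galarneau device, which is an assumption on our part.

```python
import numpy as np

rng = np.random.default_rng(42)

def correlated_X(n, p, rho):
    """Regressors with pairwise correlation about rho**2
    (McDonald-Galarneau device; an assumed generation scheme)."""
    Z = rng.normal(size=(n, p + 1))
    return np.sqrt(1 - rho**2) * Z[:, :p] + rho * Z[:, [p]]

def tmse(estimates, beta):
    """Trace MSE: mean squared distance of the estimates from beta."""
    d = estimates - beta
    return float(np.mean(np.sum(d * d, axis=1)))

n, p, rho, k = 50, 3, 0.9, 0.5      # illustrative settings
beta = np.ones(p)
reps = []
for _ in range(200):                # a small number of replications
    X = correlated_X(n, p, rho)
    y = X @ beta + rng.normal(size=n)
    b_ols = np.linalg.solve(X.T @ X, X.T @ y)               # baseline
    b_ridge = np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)
    reps.append((b_ols, b_ridge))

ols, ridge = (np.array(a) for a in zip(*reps))
print(tmse(ols, beta), tmse(ridge, beta))
```

The paper's study repeats this pattern with 10000 replications, SUR estimation across equations, and grids of sample sizes and correlation levels.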

Discussion of Results
From Tables 2 to 13 and Figures 1 to 4, we can see that the proposed KL_SUR estimator uniformly dominates the RFGLS and R_SUR estimators except when the sample size is small (n = 20). Also, an increase in \rho_{x_i x_i} increases the estimated TMSE values of the estimators. The TMSE values were markedly larger at sample size n = 20 with \rho_{\varepsilon_M} = 0.7, 0.8 and 0.9 than the corresponding TMSE values with n = 30, 50, 100 for \rho_{\varepsilon_M} = {0.0, 0.1, 0.2, ..., 0.9}.

Table 5. Estimated TMSEs for the Different Methods when \rho_{\varepsilon_M} = 0.1, 0.2 and 0.3 at n = 100.

Figures 1 to 4 show that as the value of \rho increases, the TMSE also increases, while as the sample size increases, the TMSE decreases.
The preferred TMSE values were at n = 100 with \rho_{\varepsilon_M} = 0.1 for the KL_SUR estimator. KL_SUR produced the smallest TMSE values as \rho_{x_i x_i} ranges from 0.1 to 0.9, and the corresponding TMSE values for the KL_SUR estimator increase as \rho_{x_i x_i} increases. The ideal and smallest TMSE value for the KL_SUR estimator occurred at \rho_{\varepsilon_M} = 0.1 and \rho_{x_i x_i} = 0.1. The R_SUR estimator also produced small TMSE values at \rho_{\varepsilon_M} = 0.1 and \rho_{x_i x_i} = 0.1, but not smaller than those produced by the KL_SUR estimator. As T increases and \rho_{\varepsilon_M} and \rho_{x_i x_i} decrease (or are relatively low), the TMSE values of KL_SUR decrease. This implies a gain in the efficiency of the KL_SUR estimator.

Application
To further illustrate the results of the theoretical part of this paper, we consider the dataset and structural model of [12]. The study considered the foreign direct investment of the "fragile five" countries, where i denotes the country (i = TUR, ZAF, BRA, IND, IDN).
Fdi_i is the Foreign Direct Investment of the "fragile five" countries, Gro is the GDP per Capita growth, Df is the GDP deflator, Cab is the Current Account Balance, Fc is the General Government Final Consumption Expenditure, Ips is the Imports of Goods and Services, Prr is the Personal Remittances Received, Tr is the Total Reserves and Egs is the Exports of Goods and Services. The assumptions in each equation and in the joint model were taken into consideration. Normality, homoscedasticity, multicollinearity and serial correlation of the error terms were examined. The results are available in Tables 14 to 18.
The null hypothesis for the Durbin-Watson test is that the errors are random and independent; a significant p-value rejects this null hypothesis of no autocorrelation. Table 15 suggests a rejection of the null hypothesis for Turkey at the 5% significance level (p-value = 0.0032 < 0.05), indicating autocorrelation in that equation; the remaining equations satisfied the assumption of non-autocorrelation. The null hypothesis for the Breusch-Pagan test is that the errors are homoscedastic. Since the p-value for each of the "fragile five" countries is greater than 0.05, we did not reject the null hypothesis in each equation, so the assumption of homoscedasticity is satisfied in each equation.
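As a rough illustration of the Durbin-Watson statistic used above (a minimal numpy sketch of the standard formula, not the authors' software; the simulated error series are hypothetical):

```python
import numpy as np

def durbin_watson(resid):
    """Durbin-Watson statistic: sum of squared first differences of the
    residuals over their sum of squares. Values near 2 indicate no
    first-order autocorrelation; values near 0 (or 4) indicate positive
    (or negative) autocorrelation."""
    resid = np.asarray(resid, dtype=float)
    return float(np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2))

rng = np.random.default_rng(7)
white = rng.normal(size=500)          # independent errors -> DW near 2
ar = np.empty(500)                    # strongly autocorrelated errors
ar[0] = white[0]
for t in range(1, 500):
    ar[t] = 0.9 * ar[t - 1] + white[t]
print(durbin_watson(white), durbin_watson(ar))
```

For an AR(1) error with coefficient 0.9, the statistic is expected to fall well below 2, which is the pattern a rejection like Turkey's reflects.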
The Variance Inflation Factor (VIF) measures how many times larger the variance of a regression coefficient is for multicollinear data than for orthogonal data. A VIF greater than 10 indicates multicollinearity. Concisely, the GDP deflator, current account balance, total reserves, and imports and exports of goods and services proved problematic (that is, a multicollinearity problem exists), as their VIF values were strictly greater than 10, while the others also call for some concern. This implies that the problem of multicollinearity exists in the equations. Therefore, the KL_SUR estimator is ideal for solving the problem.
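The VIF definition above can be computed directly: regress each column on the others and set VIF_j = 1/(1 - R_j^2). The sketch below is our own illustration on synthetic data, not the paper's dataset.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X:
    VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
    column j on the remaining columns (with an intercept)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, y, rcond=None)
        resid = y - others @ coef
        tss = (y - y.mean()) @ (y - y.mean())
        out[j] = 1.0 / (1.0 - (1.0 - resid @ resid / tss))
        out[j] = 1.0 / (resid @ resid / tss)  # equivalently 1/(1 - R^2)
    return out

# Two nearly duplicate columns produce very large VIFs;
# the independent third column stays near 1.
rng = np.random.default_rng(3)
a = rng.normal(size=200)
X = np.column_stack([a, a + 0.01 * rng.normal(size=200),
                     rng.normal(size=200)])
print(vif(X))  # first two entries far above 10, third near 1
```

By the paper's rule of thumb, any column with a VIF above 10 would be flagged as part of a multicollinear set, as the deflator, current account, reserves, and trade variables were here.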
The regression equation specification error test is meant to test the exogeneity of the explanatory variables. The null hypothesis for the test is that there is no correlation between the error term and the explanatory variables, or that the functional form of the regression model is linear, that is, E[\varepsilon_i | X_i] = 0. Since the p-value for each of the "fragile five" countries is greater than 0.05 in Table 14, there is no evidence of correlation between the error term and the explanatory variables for any of the countries (Turkey, Brazil, India, Indonesia, and South Africa).
The shrinkage parameter estimator is designed for model reduction in each of the M equations, in order to retain only the significant explanatory variables in line with the number of observations. The shrinkage estimator makes it possible to standardize each reduced equation. From Table 18, the shrinkage parameters of the KL_SUR estimator via the TMSE produced the smallest TMSE value of 35.00, compared to RFGLS and R_SUR, which gave higher TMSE values of 40.52786 and 40.5278 respectively. This gives the KL_SUR estimator a higher efficiency gain than the two other estimators. In summary, the TMSE of 35.00 for KL_SUR indicates that the estimator outperforms RFGLS and R_SUR.

Conclusion
The seemingly unrelated regression model is an exciting and celebrated model for situations in which the error structures of the joint models are correlated. We considered the Ridge Feasible Generalized Least Squares Estimator, the Ridge Seemingly Unrelated Regression estimator and the Kibria-Lukman (KL_SUR) estimator for estimating the parameters of the seemingly unrelated regression model when the regressors of the model are collinear. Findings from this comparative study revealed that the KL_SUR estimator is a better alternative to the ridge and ridge feasible generalized least squares estimators, which are often used to tackle the problem of multicollinearity in single and joint equations respectively. Both simulation and real-life data were used for assessment. Note that we have considered only one of many possible estimators of the ridge parameter k. The conclusions of the paper may change for different choices of k, and this possibility is under current investigation.