Assessing model selection techniques for distributions use in hydrological extremes in the presence of trimming and subsampling

Authors

  • Sunday Samuel Bako, Department of Mathematical Sciences, Kaduna State University, Kaduna, Nigeria
  • Norhaslinda Ali, Department of Mathematics and Statistics, Universiti Putra Malaysia, Selangor, Malaysia
  • Jayanthi Arasan, Department of Mathematics and Statistics, Universiti Putra Malaysia, Selangor, Malaysia

Keywords:

Flood frequency analysis, Model selection, Trimming, Subsampling

Abstract

Designing hydraulic structures, and evaluating and managing the associated risk, requires a suitable probability distribution. Model selection criteria are often criticised because, in some cases, they cannot differentiate among the candidate distributions used in the analysis of hydrological extremes. This study uses model selection techniques to verify the potential utility of trimming and subsampling in distinguishing between candidate distributions when the available samples are small, a task that might not be feasible using the traditional goodness-of-fit method alone. The performance of the proposed method is evaluated through its application to real and simulated yearly peak rainfall datasets, and the approach is then compared with several standard model selection techniques. Results show that, for untrimmed samples, the model selection techniques aided by subsampling are effective in identifying the true parent distribution when it is a two-parameter distribution; conversely, they are ineffective when the parent is a three-parameter distribution. However, once trimming is introduced, all model selection methods recognise the true parent distribution when it is a three-parameter distribution. Overall, drawing on the results of the numerical simulation and the examination of observed data, trimming and subsampling combined with model selection methods yield promising outcomes and can be a viable tool for differentiating among candidate distributions in the frequency analysis of hydrological extremes.
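The general idea of combining subsampling, trimming and a model selection criterion can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: the function names (`aic`, `trim`, `subsample_selection`) are hypothetical, AIC stands in for the model selection techniques, trimming is taken to mean discarding a fixed number of the smallest and largest observations, and the candidates are fitted by SciPy's maximum likelihood routine without any correction for the trimming.

```python
import numpy as np
from scipy import stats


def aic(dist, data):
    """Akaike information criterion for a SciPy distribution fitted by MLE."""
    params = dist.fit(data)                      # shape(s), loc, scale
    loglik = np.sum(dist.logpdf(data, *params))
    return 2 * len(params) - 2 * loglik


def trim(data, n_low=0, n_high=0):
    """Drop the n_low smallest and n_high largest values (assumed trimming rule)."""
    ordered = np.sort(np.asarray(data))
    return ordered[n_low:len(ordered) - n_high]


def subsample_selection(data, candidates, subsample_size,
                        n_subsamples=200, n_low=0, n_high=0, seed=0):
    """Count how often each candidate attains the lowest AIC over subsamples.

    Subsamples are drawn without replacement and are smaller than the
    full sample; each one is trimmed before the candidates are fitted.
    """
    rng = np.random.default_rng(seed)
    wins = {name: 0 for name in candidates}
    for _ in range(n_subsamples):
        sub = rng.choice(data, size=subsample_size, replace=False)
        sub = trim(sub, n_low, n_high)
        scores = {name: aic(dist, sub) for name, dist in candidates.items()}
        wins[min(scores, key=scores.get)] += 1
    return wins
```

A call such as `subsample_selection(peaks, {"gumbel": stats.gumbel_r, "gev": stats.genextreme}, subsample_size=30)` would return the number of subsamples on which each candidate was preferred, where `peaks` is a hypothetical array of annual maxima. Here the Gumbel distribution plays the role of a two-parameter candidate and the generalised extreme value distribution that of a three-parameter candidate; both choices are illustrative.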

Published

2024-09-02

How to Cite

Assessing model selection techniques for distributions use in hydrological extremes in the presence of trimming and subsampling. (2024). Journal of the Nigerian Society of Physical Sciences, 6(4), 2077. https://doi.org/10.46481/jnsps.2024.2077

Section

Mathematics & Statistics
