Sains Malaysiana 51(8)(2022): 2655-2668

http://doi.org/10.17576/jsm-2022-5108-24

 

Hydroclimatic Data Prediction using a New Ensemble Group Method of Data Handling Coupled with Artificial Bee Colony Algorithm

(Ramalan Data Hidroklimatik menggunakan Kaedah Pengendalian Data Kumpulan Ensembel Baharu Digandingkan dengan Algoritma Koloni Lebah Buatan)

 

BASRI BADYALINA1,*, NURKHAIRANY AMYRA MOKHTAR1, NUR AMALINA MAT JAN2, MUHAMMAD FADHIL MARSANI3, MOHAMAD FAIZAL RAMLI4, MUHAMMAD MAJID4 & FATIN FARAZH YA'ACOB4

 

1Faculty of Computer and Mathematical Sciences, Universiti Teknologi MARA, Cawangan Johor, Kampus Segamat, 85000 Segamat, Johor Darul Takzim, Malaysia

2Department of Physical and Mathematical Science, Faculty of Science, Universiti Tunku Abdul Rahman, Kampar Campus, Jalan Universiti, Bandar Barat, 31900 Kampar, Perak Darul Ridzuan, Malaysia

3School of Mathematical Sciences, Universiti Sains Malaysia, 11800 Minden, Penang, Malaysia

4Universiti Teknologi MARA, Cawangan Johor, Kampus Segamat, 85000 Segamat, Johor Darul Takzim, Malaysia

 

Received: 22 August 2021/Accepted: 22 February 2022

 

Abstract

Linear regression is widely used in flood quantile studies that involve meteorological and physiographical variables. However, linear regression does not capture the complex nonlinear relationship between the predictor and target variables. Hydrological applications that combine the group method of data handling (GMDH) model, the artificial bee colony (ABC) algorithm, and an ensemble technique are rare, particularly for predicting flood quantiles at ungauged sites. The GMDH model is known to be effective at capturing nonlinear relationships. Therefore, in this paper, we enhance the GMDH model by using the ABC algorithm to optimize the parameters of the GMDH partial descriptions under several transfer functions, namely the polynomial, radial basis, sigmoid, and hyperbolic tangent functions. Ensemble averaging then combines the outputs obtained with these transfer functions, giving the new ensemble GMDH model coupled with the ABC algorithm (EGMDH-ABC). The results show that this method significantly improves the prediction performance of the GMDH model. The EGMDH-ABC model captures the nonlinearity in the data and so produces better estimates. It also provides more robust, accurate, and efficient results.

 

Keywords: ABC algorithm; GEV distribution; GMDH model; Peninsular Malaysia; ungauged site

 

Abstract (Malay version)

Linear regression is widely used in flood quantile studies involving meteorological and physiographical variables. However, linear regression does not capture the complex nonlinear relationship between the predictor and target variables. Hydrological applications that use the group method of data handling (GMDH) model, the artificial bee colony (ABC) algorithm, and ensemble techniques are hard to find, particularly for predicting flood quantiles at ungauged sites. The GMDH model is known to be effective at capturing nonlinear relationships. Therefore, in this study, we improve the GMDH model by applying the ABC algorithm to optimize the parameters of the GMDH partial descriptions with several transfer functions, namely the polynomial, radial basis, sigmoid, and hyperbolic tangent functions. Ensemble averaging is then used to combine the outputs from these transfer functions and build a new model, EGMDH-ABC. The results show that this method significantly improves the prediction performance of the GMDH model. The EGMDH-ABC model successfully captures the nonlinearity in the data to produce better estimates. In addition, it yields more robust, accurate, and efficient results.

 

Keywords: ABC algorithm; ungauged basin; GMDH model; Peninsular Malaysia; GEV distribution
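
The abstract names three ingredients: GMDH partial descriptions passed through different transfer functions, ABC optimization of their parameters, and ensemble averaging of the resulting outputs. The Python sketch below illustrates how these pieces could fit together. It is not the authors' implementation. The quadratic two-input polynomial is the standard Ivakhnenko partial description, but the exact transfer function forms, the simplified ABC loop, the parameter names (`n_sources`, `limit`, `bound`), and the synthetic data are all assumptions made for illustration. A full GMDH network would also stack and select partial descriptions across layers, whereas this sketch fits a single one per transfer function.

```python
import math
import random

# Transfer functions named in the abstract. Their exact forms in the paper are
# not given here; these are common textbook choices (assumption).
TRANSFER = {
    "polynomial": lambda z: z,
    "radial_basis": lambda z: math.exp(-z * z),
    "sigmoid": lambda z: 1.0 / (1.0 + math.exp(-max(-60.0, min(60.0, z)))),
    "tanh": math.tanh,
}

def partial_description(coef, x1, x2, transfer):
    """Quadratic polynomial of two inputs, passed through a transfer function."""
    a0, a1, a2, a3, a4, a5 = coef
    z = a0 + a1 * x1 + a2 * x2 + a3 * x1 * x2 + a4 * x1 ** 2 + a5 * x2 ** 2
    return transfer(z)

def mse(coef, data, transfer):
    """Mean squared error over (x1, x2, y) triples."""
    return sum((partial_description(coef, x1, x2, transfer) - y) ** 2
               for x1, x2, y in data) / len(data)

def abc_minimize(cost, dim, n_sources=20, limit=30, n_iter=200, bound=2.0, seed=0):
    """Simplified artificial bee colony search for a non-negative cost function."""
    rng = random.Random(seed)
    new_source = lambda: [rng.uniform(-bound, bound) for _ in range(dim)]
    sources = [new_source() for _ in range(n_sources)]
    costs = [cost(s) for s in sources]
    trials = [0] * n_sources
    i0 = min(range(n_sources), key=costs.__getitem__)
    best, best_cost = sources[i0][:], costs[i0]

    def try_neighbour(i):
        # Candidate: v_ij = x_ij + phi * (x_ij - x_kj), phi ~ U(-1, 1), k != i
        k = rng.choice([m for m in range(n_sources) if m != i])
        j = rng.randrange(dim)
        cand = sources[i][:]
        cand[j] += rng.uniform(-1.0, 1.0) * (cand[j] - sources[k][j])
        cand[j] = max(-bound, min(bound, cand[j]))
        c = cost(cand)
        if c < costs[i]:  # greedy selection
            sources[i], costs[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    for _ in range(n_iter):
        for i in range(n_sources):  # employed bee phase
            try_neighbour(i)
        fitness = [1.0 / (1.0 + c) for c in costs]  # valid because cost >= 0
        for _ in range(n_sources):  # onlooker bee phase
            try_neighbour(rng.choices(range(n_sources), weights=fitness)[0])
        i = min(range(n_sources), key=costs.__getitem__)
        if costs[i] < best_cost:  # memorize the best source so far
            best, best_cost = sources[i][:], costs[i]
        worn = max(range(n_sources), key=trials.__getitem__)
        if trials[worn] > limit:  # scout bee phase: abandon a stagnant source
            sources[worn] = new_source()
            costs[worn] = cost(sources[worn])
            trials[worn] = 0
    return best, best_cost

def fit_ensemble(data, **abc_kwargs):
    """Fit one partial description per transfer function with ABC."""
    return {name: abc_minimize(lambda c, f=f: mse(c, data, f), dim=6, **abc_kwargs)[0]
            for name, f in TRANSFER.items()}

def predict_ensemble(models, x1, x2):
    """Ensemble averaging: arithmetic mean of the member outputs."""
    outputs = [partial_description(coef, x1, x2, TRANSFER[name])
               for name, coef in models.items()]
    return sum(outputs) / len(outputs)

if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic stand-in for (predictor, predictor, target) triples; not real data.
    data = []
    for _ in range(25):
        x1, x2 = rng.random(), rng.random()
        data.append((x1, x2, 0.2 + 0.5 * x1 * x2))
    models = fit_ensemble(data, n_iter=100)
    print("ensemble prediction at (0.5, 0.5):", predict_ensemble(models, 0.5, 0.5))
```

The equal-weight mean in `predict_ensemble` is the simplest form of ensemble averaging. Members whose transfer function suits the target range poorly still contribute, so in practice the inputs and targets would typically be scaled before fitting.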

 


 

*Corresponding author; email: basribdy@uitm.edu.my

 

   
