Sains Malaysiana 50(8)(2021): 2479-2497

http://doi.org/10.17576/jsm-2021-5008-28

 

Prediction of COVID-19 Patient using Supervised Machine Learning Algorithm

(Ramalan Pesakit COVID-19 menggunakan Algoritma Pembelajaran Diselia Mesin)

 

BUVANA, M.¹* & MUTHUMAYIL, K.²

¹Department of Computer Science and Engineering, PSNA College of Engineering and Technology, Tamil Nadu, India

²Department of Information Technology, PSNA College of Engineering and Technology, Tamil Nadu, India

 

Received: 7 April 2021/Accepted: 27 June 2021

 

ABSTRACT

COVID-19 is a highly symptomatic disease. Early and precise prediction based on physiological measurements of breathing, combined with precautions such as keeping a reasonable distance from others, wearing a mask, cleanliness, medication, a balanced diet, and staying at home when unwell, can minimize the risk of COVID-19. To evaluate the collected COVID-19 prediction datasets, five machine learning classifiers were used: Naïve Bayes, Support Vector Machine (SVM), Logistic Regression, K-Nearest Neighbour (KNN), and Decision Tree. COVID-19 datasets from the repository were combined and re-examined to remove incomplete entries, and a total of 2500 cases were used in this study. The features fever, body pain, runny nose, difficulty in breathing, sore throat, and nasal congestion are considered the most important differences between patients who have COVID-19 and those who do not. We demonstrate the predictive performance of the five machine learning classifiers. A publicly available dataset was used to train and assess the model. With an overall accuracy of 99.88 percent, the ensemble model performed commendably and outperformed the existing methods and studies it was compared against. As a result, the presented model is trustworthy and can be used to screen COVID-19 patients in a timely and efficient manner.

Keywords: Classifier; COVID-19; machine learning; prediction; supervised learning
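
The abstract reports an ensemble over five classifiers but does not say how their outputs are combined. As a minimal sketch, assuming hard majority voting, the combination and the accuracy measure could look as follows. The function names `majority_vote` and `accuracy` are hypothetical, and the labels are toy values invented for illustration, not data or results from the study.

```python
from collections import Counter


def majority_vote(per_model_predictions):
    """Combine label predictions from several classifiers by hard voting.

    per_model_predictions: one list of 0/1 labels per classifier,
    all of the same length. Returns one label per case: the label
    chosen by the most classifiers.
    """
    n_cases = len(per_model_predictions[0])
    if any(len(p) != n_cases for p in per_model_predictions):
        raise ValueError("every classifier must predict the same number of cases")
    combined = []
    for votes in zip(*per_model_predictions):
        # With five voters and two classes a tie cannot occur.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(y_true, y_pred):
    """Fraction of cases where the predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy labels, invented for illustration (1 = COVID-19 positive, 0 = negative).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = {
    "naive_bayes":         [1, 0, 1, 0, 0, 0, 1, 0],
    "svm":                 [1, 0, 1, 1, 0, 1, 1, 0],
    "logistic_regression": [1, 0, 0, 1, 0, 0, 1, 0],
    "knn":                 [1, 1, 1, 1, 0, 0, 1, 0],
    "decision_tree":       [1, 0, 1, 1, 0, 0, 0, 0],
}

ensemble = majority_vote(list(predictions.values()))
for name, y_pred in predictions.items():
    print(f"{name}: {accuracy(y_true, y_pred):.3f}")
print(f"ensemble: {accuracy(y_true, ensemble):.3f}")
```

In this toy setup each classifier misclassifies a different case, so each scores 7 of 8 while the vote recovers all 8 labels. This illustrates why an ensemble can outperform its members when their errors do not overlap. It says nothing about the accuracy figures reported in the study.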

 

ABSTRAK

Salah satu penyakit yang paling simptomatik ialah COVID-19. Ramalan pernafasan berdasarkan pengukuran fisiologi awal dan tepat akan meminimumkan risiko COVID-19 dengan jarak yang munasabah daripada sesiapa sahaja; memakai topeng, kebersihan, ubat-ubatan, diet seimbang dan jika tidak sihat, tinggal di rumah. Untuk menilai kumpulan data ramalan COVID-19 yang dikumpulkan, lima pengkelasan pembelajaran mesin digunakan: Naïve Bayes, Mesin Vektor Sokongan (SVM), Regresi Logistik, Jiran K-Terdekat (KNN) dan Pohon Keputusan. Set data COVID-19 daripada Repositori digabungkan dan disemak semula untuk menghapus entri yang tidak lengkap dan sejumlah 2500 kes digunakan dalam kajian ini. Ciri demam, sakit badan, hidung berair, kesukaran bernafas, sakit tekak dan hidung tersumbat, dianggap sebagai perbezaan yang paling penting antara pesakit yang menghidap COVID-19 dan mereka yang tidak. Kami menunjukkan fungsi ramalan lima pengelasan pembelajaran mesin. Satu set data yang tersedia untuk umum digunakan untuk melatih dan menilai model. Dengan ketepatan keseluruhan 99.88 peratus, model ensembel dilakukan dengan terpuji. Jika dibandingkan dengan kaedah dan kajian yang ada, model yang dicadangkan dilakukan dengan lebih baik. Hasilnya, model yang dipersembahkan boleh dipercayai dan dapat digunakan untuk menyaring pesakit COVID-19 tepat pada waktunya.

Kata kunci: COVID-19; pembelajaran mesin; pembelajaran yang diselia; pengelas; ramalan

 


 

*Corresponding author; email: buvana@psnacet.edu.in

 

   

             
