Sains Malaysiana 48(12)(2019): 2831–2839

http://dx.doi.org/10.17576/jsm-2019-4812-24

 

A Relative Tolerance Relation of Rough Set in Incomplete Information

(Perhubungan Toleransi Relatif Set Kasar dalam Maklumat tak Lengkap)

 

RD ROHMAT SAEDUDIN1*, SHAHREEN KASIM2, HAIRULNIZAM MAHDIN2, MOHD FARHAN MD FUDZEE2, EDI SUTOYO1, IWAN TRI RIYADI YANTO3, ROHAYANTI HASSAN4

 

1School of Industrial Engineering, Telkom University, 40257, Bandung, West Java, Indonesia

 

2Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, 86400 Batu Pahat, Johor Darul Takzim, Malaysia

 

3Department of Information Systems, Universitas Ahmad Dahlan, 55161, Yogyakarta, Indonesia

 

4Faculty of Computing, Universiti Teknologi Malaysia, 81310 Skudai, Johor Darul Takzim, Malaysia

 

Received: 21 February 2019/Accepted: 25 December 2019

 

ABSTRACT

A university is an educational institution whose objectives include increasing student retention and ensuring that students graduate on time. To achieve these objectives, student learning performance can be predicted using data mining techniques, for example by finding essential association rules on student learning based on the demographic data held by the university. However, a key requirement of any mining technique is complete data, i.e. a dataset without missing values, from which interesting rules can be generated for the detection system. Furthermore, it is problematic to capture complete information from student data because of the high computational time needed to scan the datasets. To overcome these problems, this paper introduces a relative tolerance relation of rough set (RTRS). The novelty of RTRS is that, unlike previous rough set approaches that use the tolerance relation, the non-symmetric similarity relation, or the limited tolerance relation, it builds on the limited tolerance relation by taking into consideration the relative precision between two objects; this is therefore the first work that uses relative precision. Moreover, this paper presents the mathematical properties of the RTRS approach and compares its performance with that of the existing approaches using a real-world student dataset for classifying university students' performance. The results show that the proposed approach outperforms the existing approaches in terms of computational time and accuracy.

Keywords: Classification; educational data mining; incomplete information systems; rough set theory
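As background, the three existing relations named in the abstract can be sketched as follows. This is a minimal illustration based on how these relations are commonly defined in the rough-set literature, not the RTRS relation itself, whose relative-precision definition is not stated in this abstract. The "*" marker for a missing value, the function names, and the sample objects are all assumptions made for the example.

```python
MISSING = "*"  # assumed marker for an unknown attribute value


def tolerance(x, y):
    """Tolerance: on every attribute the values agree or at least one is missing."""
    return all(a == b or a == MISSING or b == MISSING for a, b in zip(x, y))


def similarity(x, y):
    """Non-symmetric similarity: every known value of x is matched exactly by y."""
    return all(a == MISSING or a == b for a, b in zip(x, y))


def limited_tolerance(x, y):
    """Limited tolerance: both objects are entirely unknown, or they share at
    least one known attribute and agree on every attribute known in both."""
    if all(a == MISSING and b == MISSING for a, b in zip(x, y)):
        return True
    shared = [(a, b) for a, b in zip(x, y) if a != MISSING and b != MISSING]
    return bool(shared) and all(a == b for a, b in shared)


# Hypothetical objects described by three attributes.
x = ("high", "*", "yes")
y = ("high", "low", "*")
z = ("*", "low", "*")

print(tolerance(x, z), limited_tolerance(x, z))  # x and z share no known attribute
print(similarity(z, y), similarity(y, z))        # direction matters for similarity
```

In this sketch, x and z count as tolerant even though they have no known attribute in common, which is the kind of loose match the limited tolerance relation rules out.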

 

ABSTRAK

A university is an educational institution whose objectives include increasing student retention and ensuring that students graduate within the prescribed period. To achieve these objectives, students need to keep their learning performance consistent. Data mining techniques can be used to predict student learning performance. However, missing or incomplete data limit the effectiveness of data mining techniques, particularly in identifying the relationship between student learning attributes and student demographic attributes. The issue becomes more difficult when large volumes of student data are involved. This paper therefore proposes the relative tolerance relation of rough set (RTRS) technique to address this issue. The distinguishing feature of RTRS in this paper is its use of the relative precision between two objects. In addition, this paper presents the mathematical formulas used in RTRS. The performance of the proposed RTRS technique is then compared with that of the original techniques, using a university student dataset to classify the students' performance. The results show that the proposed RTRS technique outperforms the existing techniques in terms of computational time and accuracy.

Keywords: Classification; educational data mining; incomplete information systems; rough set theory


 

*Corresponding author; email: rdrohmat@telkomuniversity.ac.id

 

 

 

 

 
