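Perbandingan Metode Cost Sensitive pada Decision Tree dan Naïve Bayes untuk Klasifikasi Data Multiclass (Comparison of Cost-Sensitive Methods on Decision Tree and Naïve Bayes for Multiclass Data Classification)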

Authors

  • M Aldiki Febriantono Universitas Brawijaya Malang
  • Sholeh Hadi Pramono Universitas Brawijaya
  • Rahmadwati Rahmadwati Universitas Brawijaya

DOI:

https://doi.org/10.21776/jeeccis.v14i1.625

Keywords:

Cost sensitive, Decision Tree, Multiclass, Naïve Bayes.

Abstract

Knowledge discovery is the process of extracting information from data to support informed decisions. Because classifiers learn their decision rules from patterns in the data, imbalanced datasets pose a major classification problem. This study applies a cost-sensitive approach to the C4.5 decision tree and naive Bayes to reduce misclassification. The glass, lymphography, vehicle, thyroid, and wine datasets from the UCI Repository were used. Attribute selection with particle swarm optimization was applied as a preprocessing step, after which the cost-sensitive C4.5 decision tree and cost-sensitive naive Bayes methods were evaluated. With the cost-sensitive C4.5 decision tree, test accuracy on the glass, lymphography, vehicle, thyroid, and wine datasets was 72.34%, 68.22%, 75.68%, 93.82%, and 93.95%, respectively. With cost-sensitive naive Bayes, accuracy on the same datasets was 32.24%, 82.61%, 25.53%, 97.67%, and 94.94%, respectively, so naive Bayes was more accurate on lymphography, thyroid, and wine, while C4.5 was more accurate on glass and vehicle.
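To make the cost-sensitive idea in the abstract concrete, the sketch below shows the standard minimum-expected-cost decision rule: instead of predicting the most probable class, predict the class whose expected misclassification cost is lowest. This is a generic illustration, not the authors' implementation; the function name `min_expected_cost_class`, the class probabilities, and both cost matrices are hypothetical values chosen for the example.

```python
from typing import Sequence


def min_expected_cost_class(
    probs: Sequence[float],
    cost: Sequence[Sequence[float]],
) -> int:
    """Return the class index j that minimises sum_i probs[i] * cost[i][j].

    probs[i]   -- estimated probability that the true class is i
                  (e.g. from a decision tree leaf or a naive Bayes posterior)
    cost[i][j] -- cost of predicting class j when the true class is i
    """
    n = len(probs)
    expected = [
        sum(probs[i] * cost[i][j] for i in range(n))
        for j in range(n)
    ]
    return min(range(n), key=expected.__getitem__)


# Hypothetical posterior for one sample in a 3-class problem.
probs = [0.6, 0.3, 0.1]

# Uniform costs: every error costs 1, so the rule reduces to "most probable class".
uniform_cost = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]

# Skewed costs (assumed): missing the rare class 2 is ten times as costly.
skewed_cost = [
    [0, 1, 1],
    [1, 0, 1],
    [10, 10, 0],
]

print(min_expected_cost_class(probs, uniform_cost))  # most probable class
print(min_expected_cost_class(probs, skewed_cost))   # shifts toward the costly class
```

With uniform costs the rule picks the most probable class; with the skewed matrix the expected costs become 1.3, 1.6, and 0.9, so the prediction moves to the minority class even though its probability is only 0.1. The same rule can sit on top of any classifier that outputs class probabilities.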

Author Biographies

M Aldiki Febriantono, Universitas Brawijaya Malang

Master's program

Sholeh Hadi Pramono, Universitas Brawijaya

Electrical Engineering Department of Universitas Brawijaya

Rahmadwati Rahmadwati, Universitas Brawijaya

Electrical Engineering Department of Universitas Brawijaya

Published

2020-04-24

How to Cite

[1]
M. A. Febriantono, S. H. Pramono, and R. Rahmadwati, “Perbandingan Metode Cost Sensitive pada Decision Tree dan Naïve Bayes untuk Klasifikasi Data Multiclass”, jeeccis, vol. 14, no. 1, pp. 21–26, Apr. 2020.

Section

Articles