Classification of Successes in Continuing High School Education Using CART and Random Forest

  • Muhammad Amirullah Yusuf Albasia, Department of Statistics, IPB
  • Budi Susetyo, Department of Statistics, IPB
  • I. Made Sumertajaya, Department of Statistics, IPB

Abstract

In Indonesia, the dropout rate rises with education level, and the rate of continuation to senior high school is 77.50%. Banten is one of the provinces where the dropout rate increases with education level. It also has the second-lowest rate of continuation to senior high school among Indonesian provinces, at 68.92%. This study examines the variable importance and the classification performance produced by a classification tree (CART) and a random forest. Both methods identified the same important variables: per capita expenditure (X8) and the proportion of household members with less than a senior high school education (X10). The AUC values obtained by 10-fold cross-validation showed that the random forest performed better than the classification tree. Experiments comparing accuracy, sensitivity, and specificity at several cut-off values also showed that the random forest gave better prediction performance than the classification tree.
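The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `metrics_at_cutoff` and `auc`, the toy labels, and the toy scores are hypothetical, and the sketch assumes the positive class (1) means "continues to senior high school" and that a classifier outputs a score between 0 and 1 for that class.

```python
from typing import Dict, Sequence


def metrics_at_cutoff(y_true: Sequence[int], scores: Sequence[float],
                      cutoff: float) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity when score >= cutoff is predicted positive."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        pred = 1 if s >= cutoff else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return {
        "accuracy": (tp + tn) / len(y_true),
        # Sensitivity: share of actual positives predicted positive.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of actual negatives predicted negative.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT taken from the study.
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

for cutoff in (0.35, 0.5):
    print(cutoff, metrics_at_cutoff(y_true, scores, cutoff))
print("AUC:", auc(y_true, scores))
```

Lowering the cut-off typically raises sensitivity at the expense of specificity, which is why comparing two classifiers at several cut-offs, as well as by the threshold-free AUC, gives a fuller picture than a single accuracy figure.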

 

Published
2018-08-12