Last modified: 2019-10-25
Abstract
The employee performance data are imbalanced: the "good" class has more records than the "enough" class, which can reduce the accuracy of predictions about the performance of future employees. Sampling methods are applied to deal with this imbalance, and three approaches are compared: naive Bayes alone, oversampling + naive Bayes, and undersampling + naive Bayes. The methods are evaluated with a confusion matrix. Naive Bayes alone achieves the highest accuracy (75%) and the highest sensitivity (90%), while undersampling + naive Bayes and oversampling + naive Bayes share the highest specificity, at 58% each.
Keywords: naive Bayes, oversampling, undersampling, data mining
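
The resampling and evaluation steps named in the abstract can be illustrated with a minimal sketch. It assumes random oversampling and random undersampling (the abstract does not say which variants were used) and treats "good" as the positive class when computing sensitivity and specificity, which is also an assumption. The function names `random_oversample`, `random_undersample`, and `confusion_metrics` are hypothetical helpers, not taken from the paper, and the classifier itself is omitted.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every
    class has as many rows as the largest class (assumed variant)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class so every class has as many
    rows as the smallest class (assumed variant)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for r in rng.sample(pool, target):
            out_rows.append(r)
            out_labels.append(cls)
    return out_rows, out_labels


def confusion_metrics(y_true, y_pred, positive="good"):
    """Return (accuracy, sensitivity, specificity) from the binary
    confusion matrix, with `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    return accuracy, sensitivity, specificity
```

In such a setup, resampling would be applied to the training split only, and the three metrics would be computed on held-out data. A classifier that always predicts the majority class shows why accuracy alone can mislead: its sensitivity is 1.0 but its specificity is 0.0.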