PRE PROCESSING METODE K-NN MENGGUNAKAN OPTIMASI K-MEANS UNTUK KLASIFIKASI IMMUNOTHERAPY PADA PENYAKIT KANKER KULIT
PRE-PROCESSING OF THE K-NN METHOD USING K-MEANS OPTIMISATION FOR IMMUNOTHERAPY CLASSIFICATION IN SKIN CANCER
Keywords:
immunotherapy dataset, boxplot, K-Means, K-NN, outliers, cross validation
Abstract The presence of outliers in a dataset can lower classification accuracy. Outliers can be removed at the pre-processing stage of a classification algorithm. A method often used to detect outliers is the boxplot, a diagram based on a five-number summary: the minimum value, the first quartile (Q1), the median or second quartile (Q2), the third quartile (Q3), and the maximum value. In this study, pre-processing for the K-NN method is carried out using K-Means optimization for the classification of immunotherapy in skin cancer. The study uses a dataset of immunotherapy in the treatment of skin cancer comprising 90 instances with 8 attributes and 2 classes. The highest accuracy obtained was 97.76%, with an error of 2.24%, using K-NN classification under a 5-fold cross-validation scheme, together with a recall of 100% and a precision of 97.57%.
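As an illustration of the boxplot idea described in the abstract, the following minimal Python sketch computes a five-number summary and flags values outside the whiskers. It is not the authors' implementation. The 1.5 × IQR whisker rule (Tukey's convention), the "inclusive" quartile method, the function names, and the sample values are all assumptions made for this example and are not taken from the study.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # Assumption: quartiles use the "inclusive" interpolation method;
    # other conventions give slightly different Q1 and Q3.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


def boxplot_outliers(values, whisker=1.5):
    """Return values outside the whiskers (assumed Tukey 1.5 * IQR rule)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    lower = q1 - whisker * iqr
    upper = q3 + whisker * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical attribute column; illustrative values, not from the dataset.
sample = [15, 17, 18, 19, 20, 21, 22, 23, 24, 56]

summary = five_number_summary(sample)
outliers = boxplot_outliers(sample)
cleaned = [v for v in sample if v not in outliers]

print("five-number summary:", summary)
print("outliers:", outliers)
print("cleaned column:", cleaned)
```

On this hypothetical column only the value 56 falls outside the whiskers. In a pipeline like the one the abstract outlines, rows flagged this way would be dropped before the classifier is trained and evaluated.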