Implementation and performance comparison of Wilson, holdout, and Multiedit algorithm for edited k-nearest neighbor | Febriana Misdianti | University of Manchester School of Computer Science, UK |

BigData Analysis and Data Mining

September 07-08, 2017

Future Technologies for Knowledge Discoveries in Data

Febriana Misdianti

University of Manchester School of Computer Science, UK

Title: Implementation and performance comparison of Wilson, holdout, and Multiedit algorithm for edited k-nearest neighbor

Biography

Biography: Febriana Misdianti

Abstract

K-nearest neighbor (k-NN) classifier is widely used for classifying data in various segments. However, k-NN classifier has high computational cost because it uses linear search through all training data. In naïve implementation, k-NN will go through all training set i to compute its distance d from input data (O(nd)). Then, it loops again for all training set n to find k smallest results (O(nk)). So, the overall time complexity for k-NN is O(nd+nk). Thus, it is not suitable to classify multidimensional data with a huge number of training set. Meanwhile, k-NN needs a large number of samples in order to work well. There have been several ideas proposed to increase the time performance of k-NN in predicting a test data. One of the popular ideas is by reducing the number of TS in a model so that it would cut the testing time as the number of data that need to be explored becomes smaller. The aim of this experiment is to implement k-NN editing algorithms that cut the number of training data so that it become faster in predicting an input. This experiment implements three editing algorithms, namely, Wilson’s editing, holdout editing, and Multiedit algorithm, and also compare the performance of them.