Fixed distance neighbour classifiers in brain computer interface systems

A new classification method, closely related to the k nearest neighbour (kNN) classification method, is introduced for identifying cognitive tasks in brain computer interface (BCI) systems. This new method is named the fixed distance neighbour (FDN) classifier. The performance of the FDN method is tested with feature vectors derived from EEG datasets recorded for imagery motor movement mental tasks. For comparison, the performance of the kNN classification method is also tested with the same feature vectors. FDN performed slightly better than kNN for most of the datasets used in this study, indicating that FDN is a viable classification method that can be used in place of kNN in BCI systems.


INTRODUCTION
Brain computer interface (BCI) technology is an emerging technology that enables people with motor disabilities to communicate with the external world. It does not require any peripheral muscular activity. The user of this technology can send commands to communicate or control electronic devices by means of brain activity alone.
In order to control a BCI system, the user must produce different brain activity patterns, which can be identified by the system and translated into commands. In most current BCI systems this identification is carried out by a classification algorithm, which automatically recognizes the class from data as represented by a feature vector. Usually, in EEG based BCI systems, the EEG signal is first preconditioned to remove undesired frequencies. Even after preconditioning, the signal cannot be directly used as a feature vector for classification due to its high dimensionality. Therefore, low dimensional feature vectors are constructed using various feature vector construction techniques. Feature vector construction methods used in BCI systems include amplitude values of EEG signals (Kaper et al., 2004); band powers (BP) (Pfurtscheller et al., 1997); power spectral density (PSD) values (Millán & Mourino, 2003; Chiappa & Bengio, 2004); autoregressive (AR) and adaptive autoregressive (AAR) parameters (Pfurtscheller et al., 1998; Penny et al., 2000); time frequency features (Wang et al., 2004); inverse model-based features (Qin et al., 2004; Kamousi et al., 2005; Congedo et al., 2006); bi-scale wavelets (Mason & Birch, 2000) and PSD with Welch's periodogram (Barreto et al., 2004).
After constructing feature vectors, suitable classification techniques are used to classify EEG signals. For EEG based BCI systems, commonly used classification algorithms are linear discriminant analysis (LDA), support vector machines (SVM), neural networks (NN), nonlinear Bayesian classifiers and nearest neighbour classifiers (Lotte et al., 2007). There are several versions of nearest neighbour classifiers for solving general classification problems (Domeniconi et al., 2002; Voulgaris & George, 2008).
The efficiencies of these different classifiers are found to be strongly dependent on the kind of BCI used. As an example, Gaussian SVM, LDA, hidden Markov model (HMM), finite impulse response neural network (FIR NN) and Bayes quadratic integrated over time have shown good performances (83 % to 89 %) for pure motor imagery based BCI (Lotte et al., 2007). On the other hand, in movement intention based BCI, k nearest neighbours (kNN), multilayer perceptron (MLP), SVM and LDA have shown better performance than other classifiers (Lotte et al., 2007). However, kNN algorithms have not shown very good performance for asynchronous BCI (Lotte et al., 2007).
For most of the BCI systems, the system has to be trained and parameters of feature vector construction and classification schemes have to be determined at the training stage.
In this paper, a fixed distance neighbour (FDN) classifier, which is closely related to kNN, is proposed for BCI systems. Its performance on motor imagery based BCI data is presented together with the performance of the kNN classifier on the same data.

Fixed distance neighbour classifier
In the kNN classifier, a decision is made by examining the labels of the k nearest neighbours and taking the majority vote. In order to find the k nearest neighbours for a given test feature point, the distances between the test feature point and all the training feature points are calculated and sorted. The metric used for determining distances depends on the problem of interest. The most commonly used metrics are the Euclidean distance metric, the cityblock distance metric and the correlation distance metric. In the FDN classifier introduced in this paper, a hypersphere of fixed radius (say $r_c$) centered at the test point is considered and the labels of the training points inside the hypersphere are examined. The decision is taken according to the majority vote.
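The three metrics named above can be written in a few lines. The following is a minimal sketch in plain Python; the function names are illustrative and do not come from the paper, and the correlation distance is taken here as one minus the Pearson correlation coefficient, which is a common convention but an assumption about the definition the authors used.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def cityblock(a, b):
    """Sum of absolute coordinate differences (L1 distance)."""
    return sum(abs(p - q) for p, q in zip(a, b))

def correlation_distance(a, b):
    """One minus the Pearson correlation of the two vectors (assumed definition).

    Raises ZeroDivisionError if either vector has zero variance.
    """
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    ca = [p - mean_a for p in a]
    cb = [q - mean_b for q in b]
    num = sum(p * q for p, q in zip(ca, cb))
    den = math.sqrt(sum(p * p for p in ca) * sum(q * q for q in cb))
    return 1.0 - num / den
```

Any of these functions can serve as the distance between a test vector and a training vector in either kNN or FDN; which one works best is problem dependent.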
Let $x_1, x_2, \ldots, x_M$ be the training set of $N$ dimensional feature vectors and $a$ be a testing feature vector of the same dimension.
Suppose that the distance between the test vector $a$ and the $i$th training vector $x_i$ is $d_{a,i}$, $i = 1, 2, \ldots, M$. Note that $d_{a,i}$ can be calculated with any metric as mentioned above. In FDN, the feature vectors that participate in classifying the test vector $a$ belong to the set $S_a$:
$$S_a = \{\, x_i : d_{a,i} \le r_c,\ i = 1, 2, \ldots, M \,\}$$
where
$$d_{a,i} = \sqrt{\sum_{j=1}^{N} \left(a_j - x_{i,j}\right)^2}$$
for the Euclidean metric. The class label of the test vector is predicted as the label of the class that has the highest number of feature vectors in $S_a$.
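The decision rule described above can be sketched as follows. This is an illustrative implementation, not the authors' code: the names `fdn_neighbours` and `fdn_classify` are hypothetical, the boundary of the hypersphere is assumed to be included, an empty neighbourhood is reported as `None`, and ties fall to whichever tied label is encountered first, which is an arbitrary choice made only for this sketch.

```python
import math
from collections import Counter

def fdn_neighbours(train_x, a, r_c, dist=math.dist):
    """Indices of training vectors within distance r_c of the test vector a."""
    return [i for i, x in enumerate(train_x) if dist(a, x) <= r_c]

def fdn_classify(train_x, train_y, a, r_c, dist=math.dist):
    """Majority vote over the labels of the training vectors inside the hypersphere."""
    inside = fdn_neighbours(train_x, a, r_c, dist)
    if not inside:
        return None  # the hypersphere encloses no training vectors
    votes = Counter(train_y[i] for i in inside)
    # most_common keeps first-seen order among equal counts (arbitrary tie break)
    return votes.most_common(1)[0][0]

# Toy two-class data with made-up coordinates and labels
train_x = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (1.0, 1.0), (1.1, 1.0)]
train_y = ["LFM", "LFM", "LFM", "RFM", "RFM"]
print(fdn_classify(train_x, train_y, (0.05, 0.05), r_c=0.5))
```

The `dist` argument defaults to the Euclidean distance but accepts any function of two vectors, in line with the remark that any metric can be used.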
The major difference between the kNN and FDN methods can be described as follows. In kNN, since the number of nearest neighbours is fixed, the radius of the hypersphere that encloses the nearest neighbours can be very different from one test point to another, as shown in Figures 1(a) and 1(b). On the other hand, in FDN, since the radius $r_c$ of the hypersphere is fixed, the number of neighbours participating in voting can be different from one test point to another, as shown in Figures 2(a) and 2(b).
In FDN, for an arbitrary radius centered at the test point, there is a possibility that the hypersphere encloses no training feature vectors or encloses all the training feature vectors. Therefore, the radius $r_c$ has to be determined at the training stage such that it is large enough to enclose at least the desired number of feature vectors of the training dataset and the classification efficiency is maximal. In addition, there is also a possibility that the number of training feature vectors inside the hypersphere is even, which can produce a tie for a given test vector. The tie break in FDN can be carried out by the same kind of mechanism as is used for kNN classifiers. Since the optimal k in the kNN method and the optimal $r_c$ in FDN depend on the problem of interest, these parameters are determined empirically with the training dataset.
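The contrast between the two methods can be made concrete with a small numerical sketch. The helper names `knn_radius` and `fdn_count` and the toy point sets are assumptions made for illustration; Euclidean distance is used throughout.

```python
import math

def knn_radius(train_x, a, k):
    """Radius of the smallest hypersphere around a that encloses its k nearest neighbours."""
    distances = sorted(math.dist(a, x) for x in train_x)
    return distances[k - 1]

def fdn_count(train_x, a, radius):
    """Number of training vectors inside a hypersphere of fixed radius around a."""
    return sum(1 for x in train_x if math.dist(a, x) <= radius)

a = (0.0, 0.0)
dense = [(0.1 * i, 0.0) for i in range(1, 8)]    # seven closely spaced points
sparse = [(1.0 * i, 0.0) for i in range(1, 8)]   # seven widely spaced points

# kNN with k = 7: the enclosing radius adapts to the local density
print(knn_radius(dense, a, 7), knn_radius(sparse, a, 7))

# FDN with a fixed radius of 1.5: the number of voters adapts instead
print(fdn_count(dense, a, 1.5), fdn_count(sparse, a, 1.5))
```

With the dense set, all seven points vote in both methods; with the sparse set, kNN still collects seven votes by reaching out to a much larger radius, whereas FDN keeps the radius and votes with a single neighbour.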
There are two standard techniques that can be used for determining the parameters of the classification method. In both techniques, the performance of the system is evaluated while varying the parameters of the classification method, and the optimal values of the parameters are obtained. As described below, these two techniques differ in the way the performance is evaluated.
In the first technique, the performance of the classification method is evaluated by choosing a single feature vector from the training dataset as a test vector and training the system with the remaining training feature vectors. The classification method is then tested with this single test vector. This is repeated, choosing a different feature vector from the training dataset each time, until every vector in the training dataset has served as the test vector. At the end, the total performance is calculated. The optimal parameters are determined by varying the parameter values and repeating the calculation. The parameters that produce the best performance are taken as the optimal parameters.
In the second technique, part of the training dataset is set aside as testing data for determining the optimal parameters, while the remaining part of the training dataset is used for training the system. The performance of the classification method is evaluated on this testing data while changing the values of the parameters. The parameters that produce the best performance are taken as the optimal parameters. These optimal parameters are then used for evaluating the overall performance of the classification method.
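The first technique is commonly known as leave-one-out cross-validation. A minimal sketch of using it to choose the FDN radius is given below; the function names, the candidate radii and the toy data are assumptions for illustration, and a test vector with an empty neighbourhood is simply counted as misclassified.

```python
import math
from collections import Counter

def fdn_classify(train_x, train_y, a, radius):
    """Majority label among training vectors within radius of a (None if there are none)."""
    labels = [y for x, y in zip(train_x, train_y) if math.dist(a, x) <= radius]
    if not labels:
        return None
    return Counter(labels).most_common(1)[0][0]

def leave_one_out_score(xs, ys, radius):
    """Fraction of training vectors classified correctly when each is held out in turn."""
    correct = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        if fdn_classify(rest_x, rest_y, xs[i], radius) == ys[i]:
            correct += 1
    return correct / len(xs)

def select_radius(xs, ys, candidates):
    """Candidate radius with the best leave-one-out score (first one wins on equal scores)."""
    return max(candidates, key=lambda r: leave_one_out_score(xs, ys, r))

xs = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (3.0, 3.0), (3.2, 3.0), (3.0, 3.2)]
ys = [0, 0, 0, 1, 1, 1]
print(select_radius(xs, ys, [0.05, 0.5, 10.0]))
```

In this toy example the smallest radius leaves every held-out vector without neighbours and the largest one lets the opposite class outvote it, so the intermediate radius is selected.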
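The second technique corresponds to a hold-out validation split. The sketch below applies it to choosing k for kNN; the names `knn_classify`, `holdout_score` and `select_k`, the split and the candidate values are illustrative assumptions, not details taken from the study.

```python
import math
from collections import Counter

def knn_classify(train_x, train_y, a, k):
    """Majority label among the k training vectors closest to a."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(a, train_x[i]))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def holdout_score(fit_x, fit_y, val_x, val_y, k):
    """Fraction of held-out vectors classified correctly using the fitting part only."""
    hits = sum(knn_classify(fit_x, fit_y, a, k) == y for a, y in zip(val_x, val_y))
    return hits / len(val_x)

def select_k(fit_x, fit_y, val_x, val_y, candidates):
    """Candidate k with the best hold-out score (first one wins on equal scores)."""
    return max(candidates, key=lambda k: holdout_score(fit_x, fit_y, val_x, val_y, k))

# The training data is split into a fitting part and a held-out part
fit_x = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (3.0, 3.0), (3.2, 3.0)]
fit_y = [0, 0, 0, 1, 1]
val_x = [(0.1, 0.1), (3.1, 3.1)]
val_y = [0, 1]
print(select_k(fit_x, fit_y, val_x, val_y, [5, 3]))
```

Compared with the first technique, this requires only one pass over the held-out vectors per candidate value, at the cost of fitting on fewer training vectors.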

Performance of the FDN
The performance of FDN was evaluated using EEG datasets recorded while subjects were performing imagery motor movement (IMM) mental tasks (Zachary et al., 1990; Pfurtscheller et al., 1998; Birbaumer, 1999; Wang et al., 2004; Pfurtscheller et al., 2006; Lehtonen et al., 2008) and during baseline (BL). The baseline signal represents the mental state of the subject when he/she is not thinking of any specific mental task and is not blinking. The IMM mental tasks are the most popular mental tasks used in BCI due to their good performance. First, the EEG data was bandpass filtered and feature vectors were constructed from band power. For comparison purposes, the performance of kNN was also determined for the same feature vectors and is shown in the tables.
IMM consists of two mental tasks, namely, imagery of left middle finger movement (LFM) and imagery of right middle finger movement (RFM). EEG recordings for IMM were carried out as follows.
The subjects were seated in armchairs in front of a blank white screen, which was placed approximately one and a half meters away from them. The main reason for placing this screen in front of the subjects was to reduce distractions. They were asked not to pay any attention to the screen but to concentrate on the mental task they were performing. During the recording of trials, they were instructed to avoid eye movements and to keep their arms and hands relaxed. If eye blinks or eye movements occurred during a recording, the data recorded in that trial was discarded and the trial was repeated to avoid artifacts in the EEG data.
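As a rough illustration of a band power feature, the sketch below sums the squared magnitudes of the discrete Fourier transform bins that fall inside a frequency band. This frequency-domain computation is a stand-in chosen for brevity and is not necessarily how the authors filtered the signals or computed band power; the sampling rate, band limits and function names are all assumptions.

```python
import cmath
import math

def band_power(signal, fs, f_lo, f_hi):
    """Sum of squared one-sided DFT magnitudes (divided by n^2) for bins in [f_lo, f_hi] Hz."""
    n = len(signal)
    power = 0.0
    for k in range(n // 2 + 1):
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            coeff = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                        for t, s in enumerate(signal))
            power += abs(coeff) ** 2 / n ** 2
    return power

def feature_vector(channels, fs, bands):
    """One band power value per (channel, band) pair, concatenated into a single vector."""
    return [band_power(ch, fs, lo, hi) for ch in channels for lo, hi in bands]

fs = 128  # assumed sampling rate in Hz
# One second of a synthetic 10 Hz sine wave standing in for an EEG channel
channel = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
print(feature_vector([channel], fs, [(8, 12), (18, 22)]))
```

For the synthetic signal, almost all of the power falls in the 8 to 12 Hz band, so the first feature dominates the second. Changing the channels, band limits or band widths yields different feature vector datasets from the same recordings.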

RESULTS
In this investigation, 24 feature vector datasets generated from two subjects were used. The parameters used in the recordings of the two EEG datasets are given in Table 1.
In order to compare the performance of FDN and kNN classifiers, twenty four different feature vector datasets were constructed from two subjects by changing the channels, bandpass filter frequencies and width of frequency bands as shown in Tables 2 and 3.
The procedure for selecting parameters to get the best performance for FDN is as follows.
(a) The sets of all the trials were grouped into two: one group for training and the other for testing.
(b) In order to select optimal parameters, only a part of the training group was used for training the system.
(c) With FDN as a classifier, the other part was used for testing the performance while changing the values of the parameters (channels, bandpass filter frequencies and width of frequency bands).
The parameters corresponding to the best performance were selected as the optimal parameters. The optimal parameters for kNN were found in a similar manner.
When creating the feature vector datasets D1 to D12, the parameters were chosen such that they produced the best performance for the FDN classification method during the training period.
On the other hand, the feature vector datasets D13 to D24 were constructed using the optimal parameters found for kNN during the training period.
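The performance for a given dataset was calculated using the formula given below.
$$\text{Performance} = \frac{N_s}{N} \times 100\,\%$$
where $N_s$ is the number of successfully identified mental tasks and $N$ is the total number of mental tasks. The performance of FDN and kNN is shown in Tables 4 and 5. The k values for kNN and the $d_c$ values for FDN (the fixed radius, written $r_c$ in the description of the method) given in the tables correspond to the best performance obtained for each method. It is evident from Tables 4 and 5 that the FDN method performed better than kNN for most of the feature vector datasets used in the study.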
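As a small illustration of the performance measure, the function below counts the successfully identified mental tasks and divides by the total number of tasks. Expressing the result as a percentage is an assumption of this sketch, as are the function name and the example labels.

```python
def performance(predicted, actual):
    """Percentage of mental tasks whose predicted label matches the true label."""
    n_s = sum(p == t for p, t in zip(predicted, actual))  # successfully identified tasks
    n = len(actual)                                       # total number of tasks
    return 100.0 * n_s / n

# Made-up predictions for four trials, three of which are correct
predicted = ["LFM", "RFM", "BL", "LFM"]
actual = ["LFM", "RFM", "LFM", "LFM"]
print(performance(predicted, actual))
```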

DISCUSSION AND CONCLUSION
The performance of a new classification method, the fixed distance neighbour classifier, which is closely related to the well known k nearest neighbour classifier, is presented. For most of the twenty-four feature vector datasets used in this study, FDN showed slightly better performance than kNN, indicating that FDN is a viable classification method that can be used in place of kNN. Since we have tested FDN only on feature vectors derived from EEG datasets that were specifically recorded for BCI, the true merit of FDN as a general classification method cannot yet be judged, and further studies have to be carried out with other types of data to determine its general validity.

Figure 1: Training feature vector points of a two class problem are represented as Δ and •. The test point is shown with a separate marker. (a) In kNN classification, when the concentration of points surrounding the test point is large, a sphere with a small radius can enclose the seven nearest neighbours. (b) When the concentration of the feature vectors is low, a sphere with a larger radius is needed to enclose the seven nearest neighbour points.

Figure 2: Training feature vector points are as in Figure 1. (a) In FDN classification, when the concentration of points surrounding the test point is large, the sphere with radius $d_c$ encloses a large number of neighbours. (b) When the concentration of the feature vectors is low, the sphere encloses fewer neighbour points.
$N_s$: number of successfully identified mental tasks; $N$: total number of mental tasks.

Table 1: Parameters and settings used in all the EEG recording sessions

The main difference between these two methods can be summarized as follows. kNN does not consider how far the training dataset points are located from the test point but considers only the number of neighbours. As a result, some neighbours that participate in voting can be far away from the test point due to a lack of training data points in the neighbourhood of the test point. On the other hand, FDN does consider how far the training dataset points are located from the test point, but does not take into account the number of neighbours that participate in voting when the classification decision is made. As a result, a different number of neighbours participates in voting for different test points. However, if the distribution of feature vectors in the feature space (i.e. the density of feature vectors) is uniform, both methods should produce the same performance.

Table 5: Performance of kNN and FDN for datasets D13 to D24

Table 4: Performance of FDN and kNN for datasets D1 to D12