Mapping ilmenite deposit in Pulmudai, Sri Lanka using a hyperspectral imaging-based surface mineral mapping method

E.M.M.B. Ekanayake1*, S.S.P. Vithana2, E.M.H.E.B. Ekanayake1, A.R.M.A.N. Rathnayake1, A.M.R. Abeysekara1, T.S.J. Oorloff1, H.M.V.R. Herath1, G.M.R.I. Godaliyadda1, M.P.B. Ekanayake1 and A. Senaratne3 1 Department of Electrical and Electronic Engineering, Faculty of Engineering, University of Peradeniya, Peradeniya. 2 Sri Lanka Technological Campus, Ingiriya Road, Padukka. 3 Department of Geology, Faculty of Science, University of Peradeniya, Peradeniya.


INTRODUCTION
A hyperspectral image contains information corresponding to a large number of continuous wavelengths. This facilitates the analysis and detection of intricate details of a given image. As the existence of a particular surface mineral results in a different composition of soil, compared to that of the places where the respective surface mineral is unavailable, imageprocessing techniques can be used to detect possible deposits of surface minerals. The use of satellite images is a non-intrusive and economical way for surface mineral detection as it avoids strenuous land surveys and soil testing. Since hyperspectral data contains richer information in the form of contiguous spectral bands compared to multispectral images, the detection process is more accurate and reliable.
A number of papers have been published in the recent past, proposing different methods of determining possible surface mineral deposits using hyperspectral imagery. Applicability of hyperspectral data to determine trends in hydrothermal alteration intensity in and around the Izok Lake Volcanogenic Massive Sulfide (VMS) deposit in Northern Canada has been discussed by Laakso et al. (2016). An efficient procedure for mineral mapping is discussed by Notesco et al. (2014), with a unique hyperspectral remote-sensing fingerprint in the September 2019 Journal of the National Science Foundation of Sri Lanka 47 (3) longwave infrared spectral region enabling identification of the most abundant minerals in the continental crust -quartz and feldspars. The presence of copper in Basavanakote, Karnataka was found using hyperspectral remote sensing techniques by Aravinth & Roopa (2017).
A new method has been introduced to extract features using mineral absorption by Zhao et al. (2017). Here, Reference Spectral Background Removal (RSBR) has been introduced into mineral absorption feature extraction from high vegetation density areas. Diagnostic absorption feature has the potential to be the key factor in mineral information extraction. Saralıoğlu et al. (2016) have attempted to explore minerals with hyperspectral image fusion. Due to the insufficient spatial resolution of current hyperspectral image sensors, a single pixel might include more than one mineral. This paper has revealed a method to overcome this problem by fusing such an image with a high-resolution image. Niranjan et al. (2016) have presented a study of minerals and vegetation in explored fields around the San Juan coal mines west of Farmington, New Mexico. Here, Minimum Noise Fraction (MNF) and Pure Pixel Index (PPI) methods are used for the extraction of endmember fraction, and spectral signature matching procedure is done with the United States Geological Survey (USGS) spectral library. Swamy et al. (2017) have carried out a dimension reduction based algorithm on complex wavelet filter bank and the mimetite mineral spectral signatures, and classified using the orthogonal subspace projection method. The desired mimetite spectral signature is segregated from the mixed pixels and estimated in the presence of two other minerals with the help of complex wavelet filter bank and spectral matching. Vithana et al. (2019) have exploited the relative proximity of spectral signatures among classes of remotely sensed hyperspectral images in order to generate an adaptive hierarchical structure for image classification. This enables a level-by-level optimisation for clustering at each stage of the hierarchy. Ekanayake et al. (2019) have proposed a novel method for hyperspectral unmixing, in which the fundamental notions of independent component analysis are utilised to improve the accuracy of the standard non-negative matrix factorisation algorithm. This method proves to be highly accurate with regard to hyperspectral image classification.
This paper presents a method of detecting possible ilmenite deposits in the Northeastern region of Sri Lanka, using concepts related to data engineering, mathematics, machine learning and signal processing, on a hyperspectral image of the respective region obtained by the Earth Observing-1 satellite's Hyperion sensor. The detection algorithm first deals with classifying the pixels in the region under its sub components. It was observed that the selected geographical region constitutes mainly of vegetation, water bodies, paddy fields, beaches with sand, and soil. Thus, three main classes, i.e. 'vegetation', 'water bodies' and 'sparse vegetation and soil' were defined. Here, the class 'sparse vegetation and soil' covers areas where there are paddy fields, soil and sand. Throughout the rest of this paper, we refer the 'sparse vegetation and soil' class as 'soil'. Thus, the primary intention is to classify the pixels in the selected image region under the relabelled classes, 'vegetation', 'water bodies' and 'soil'. Then, the pixels containing soil are extracted for further analysis in order to find out the regions containing ilmenite. A spectral signature for the composition of soil where ilmenite deposits exist was created using data available in spectral libraries. With the aid of the reference spectral signature created, a correlation analysis was carried out to select a training sample of pixels, which were then used to develop the detection algorithm. The results of the analysis were validated by a field visit, where soil samples were collected from several locations predicted by the algorithm. The lab results confirmed the presence of ilmenite in locations detected by the proposed method.

Dataset
The hyperspectral image that was used in this analysis was captured by the Hyperion sensor (USGS, EO1H1410532005260110PU_SGS_01) attached to the NASA's Earth Observing -1 (EO -1) satellite. The Hyperion sensor has the capability of resolving spectral bands from 0.4 µm to 2.5 µm -belonging to visible, near infrared and short wave infrared regions of the electromagnetic spectrum, with a spatial resolution of 30 m and a swath width of 7.5 km. The Hyperion sensor provides for two grating image spectrometers with detailed spectral mapping across all channels with high radiometric accuracy. These two spectrometers, VNIR (Visible and Near-Infrared) and SWIR (Short Wave Infrared) had been used to obtain the hyperspectral images consisting of 242 bands spanning across a range of wavelengths, from 355.59 -2577.08 nm (Griffin, 2005).
As seen in Figure 1(a), the hyperspectral image subjected to analysis in this paper is a strip along the Northeastern region of Sri Lanka, which covers a geographical area of approximately 7.5 × 100 km 2 . Its true colour version is shown in Figure 1 The respective hyperspectral image strip was obtained from the USGS database (USGS, EO1H1410532005260110PU_SGS_01). The Level 1Gst (L1Gst) product was used since all radiometric and systematic geometric corrections are performed in the L1Gst product. In addition, a 90-meter Digital Elevation Model (DEM) has been employed in this particular product. The image was available in the form of 242 monochrome Geo 'TIFF' images belonging to each of the 242 spectral bands. The size of the image was 7481 × 1851 pixels and each pixel covered an area of 30 × 30 m 2 . The data included in the images were radiance values and were processed by a scaling factor of 40 for the images obtained through the VNIR camera in the range of wavelengths from 400 -1400 nm (bands 1-70), and by a scaling factor of 80 for the images obtained by the SWIR camera in the range of wavelengths from 900 -1700 nm (bands 71-242). There were 44 uncalibrated bands among the 242 bands (USGS, EO1H1410532005260110PU_SGS_01).
In order to reduce the time of computation, algorithms were performed only on a selected region -the region displayed in Figure 1(c), which spanned from the rows 5627 to 5801 of the original image.

Preprocessing
Preprocessing is an essential procedure that should be done in remote sensing applications. It plays an important role, being a stage done prior to any form of data analysis and hence is important in hyperspectral image analysis as well. It essentially prepares and structures the dataset to a form, which could be input directly to the algorithm used, in a much efficient and effective manner. Simply, it converts the dataset to a more qualitative and productive form.

Conversion from radiance to reflectance
Hyperspectral image analysis uses reflectance values since it is the property, which varies with the chemical composition of materials. In addition, reduction in between-scene variability can be achieved through a normalisation for solar irradiance. Thus, spectral radiance values were converted to reflectance values by using the following equation (USGS, EO1H1410532005260110PU_SGS_01). The strength of the conversion from radiance to reflectance was acceptable when considering the mean spectral signatures of the three classes as in Figure 2(a) for the spectral bands with high signal to noise ratio (SNR).
was acceptable when considering the mean spectral signatures for the spectral bands with high SNR. = unit less planetary reflectance accurate results. Thus, the influence of the smiling effect could be considered negligible.
characteristics of the contents of the pixel, the dataset was standardized by normalizing with respect to the standard deviation (Oorloff et al., 2017) using e Smiling Effect was not taken into consideration. Solely addressing the biases w and standard deviation gave accurate results. Thus, the influence of the smili considered negligible.
... (2) normalizing with respect to the standard deviation (Oorloff et al., 2017) using e Smiling Effect was not taken into consideration. Solely addressing the biases w and standard deviation gave accurate results. Thus, the influence of the smili considered negligible.
Where, _ � ( , : ) = i th pixel with zero mean spectral data, _ ( , : ) = i th pixel with original spectral data, � = mean of the i th pixel's spectral data _ � ( , : ) = i th pixel with normalized spectral data, � = standard deviation of the i th pixel's spectral data Subtracting the mean of spectral data of each pixel from its original data was do any biases in pixels. Normalizing with respect to standard deviation was done minimizing the effect of power variation across the spectral bands of each pix these calculations, a standardized dataset was obtained to be input to the algorit Classification Algorithm -Sub Component Analysis Where, _ � ( , : ) = i th pixel with zero mean spectral data, _ ( , : ) = i th pixel with original spectral data, � = mean of the i th pixel's spectral data _ � ( , : ) = i th pixel with normalized spectral data, � = standard deviation of the i th pixel's spectral data Subtracting the mean of spectral data of each pixel from its original data was do any biases in pixels. Normalizing with respect to standard deviation was done minimizing the effect of power variation across the spectral bands of each pix these calculations, a standardized dataset was obtained to be input to the algorit

Classification Algorithm -Sub Component Analysis
The first stage of the algorithm follows a similar approach to what is discussed Subtracting the mean of spectral data of each pixel from its original data was do any biases in pixels. Normalizing with respect to standard deviation was done minimizing the effect of power variation across the spectral bands of each pix these calculations, a standardized dataset was obtained to be input to the algorit

Classification Algorithm -Sub Component Analysis
The first stage of the algorithm follows a similar approach to what is discussed (2018). A straightforward method, which considers the vector repres characteristics of a pixel, has been used. After removing the 44 uncalibrated calibrated bands were remaining so that a single pixel has a spectral signature rep values corresponding to 198 bands. Thus, a single pixel can be represented = standard deviation of the i th pixel's spectral data Subtracting the mean of spectral data of each pixel from its original data was done in order to remove any biases in pixels. Normalising with respect to standard deviation was done with the intention of minimising the effect of power variation across the spectral bands of each pixel. After performing these calculations, a standardised dataset was obtained to be input to the algorithms developed.

Classification algorithm -sub component analysis
The first stage of the algorithm follows a similar approach to what is discussed by Ekanayake et al. (2018) and Vithana et al. (2019). A straightforward method, which considers the vector representation of spectral characteristics of a pixel, has been used. After removing the 44 uncalibrated bands, 198 properly calibrated bands were remaining so that a single pixel has a spectral signature representing reflectance values corresponding to 198 bands. Thus, a single pixel can be represented in a 198-dimensional space, which is impossible to illustrate yet mathematically explicable, where each axis represents a spectral band in reflectance units.
The intention of the initial stage was to classify the pixels in the selected image region under its underlying classes. Moreover, the algorithm discussed by Ekanayake et al. (2018) identifies the pixels that contain more than one component (mixed pixels) along with their percentages. Since this algorithm requires normalised data, the standardisation mentioned in the earlier section is essential.

Standardisation of the dataset
The image chosen for analysis consists of reflectance values of spectral bands obtained for each pixel. Therefore, it is obvious that the dataset could be affected due to factors such as different lighting conditions, different atmospheric conditions, measurement errors, and noise. Therefore, the spectral data of identical pixels would have the possibility of being different, which would lead to inaccuracies in the analysis. In order to remove or minimise the effect of such factors that are not characteristics of the contents of the pixel, the dataset was standardised by zero meaning and normalising with respect to the standard deviation (Oorloff et al., 2017) using equations (2) to (5). The smiling effect was not taken into consideration. Solely addressing the biases with respect to mean and standard deviation gave Since a large amount of data introduces redundancy and complexity to the analysis, principal component analysis (PCA) (Tyo et al., 2003;Raiko et al., 2007;Zabaiza et al., 2014) was used to transform the 198-dimensional space into a 20-dimensional space where the variation of the dataset was maximum. To find the transformation matrix in PCA, a training set of 60 pixels was chosen, 20 from each class -soil, vegetation and water bodies. The training set was selected with the aid of Google Earth Pro and the Image Processing toolbox available in MATLAB ® . Equations (6) to (10) were used to implement PCA and transform the dataset on to a 20-dimensional space.

11
that contain more than one component (mixed pixels) along with their percentages. Since this algorithm requires normalized data, the standardization mentioned in the earlier section is essential.
Since a large amount of data introduces redundancy and complexity to the analysis, Principal Component Analysis (PCA) (Tyo et al., 2003;Raiko et al., 2007;Zabaiza et al., 2014) was used to transform the 198-dimensional space in to a 20-dimensional space where the variation of the dataset was maximum. To find the transformation matrix in PCA, a training set of 60 pixels was chosen, 20 from each class -soil, vegetation and water bodies. The training set was selected with the aid of Google Earth Pro and the Image Processing toolbox available in MATLAB ® . Equations (6) to (10) were used to implement PCA and transform the dataset on to a 20-dimensional space.
Where, = covariance matrix, = expected value operation, � = vector representing the spectral band information of a pixel, Since a large amount of data introduces redundancy and complexity to the analysis, Principal Component Analysis (PCA) (Tyo et al., 2003;Raiko et al., 2007;Zabaiza et al., 2014) was used to transform the 198-dimensional space in to a 20-dimensional space where the variation of the dataset was maximum. To find the transformation matrix in PCA, a training set of 60 pixels was chosen, 20 from each class -soil, vegetation and water bodies. The training set was selected with the aid of Google Earth Pro and the Image Processing toolbox available in MATLAB ® . Equations (6) to (10) were used to implement PCA and transform the dataset on to a 20-dimensional space.
Where, = covariance matrix, = expected value operation, � = vector representing the spectral band information of a pixel, ... (7) 11 Component Analysis (PCA) (Tyo et al., 2003;Raiko et al., 2007;Zabaiza et al., 2014) was used to transform the 198-dimensional space in to a 20-dimensional space where the variation of the dataset was maximum. To find the transformation matrix in PCA, a training set of 60 pixels was chosen, 20 from each class -soil, vegetation and water bodies. The training set was selected with the aid of Google Earth Pro and the Image Processing toolbox available in MATLAB ® . Equations (6) to (10) were used to implement PCA and transform the dataset on to a 20-dimensional space.

11
transform the 198-dimensional space in to a 20-dimensional space where the variation of the dataset was maximum. To find the transformation matrix in PCA, a training set of 60 pixels was chosen, 20 from each class -soil, vegetation and water bodies. The training set was selected with the aid of Google Earth Pro and the Image Processing toolbox available in MATLAB ® . Equations (6) to (10) were used to implement PCA and transform the dataset on to a 20-dimensional space.
Where, = covariance matrix, = expected value operation, � = vector representing the spectral band information of a pixel, = covariance matrix 11 was maximum. To find the transformation matrix in PCA, a training set of 60 pixels was chosen, 20 from each class -soil, vegetation and water bodies. The training set was selected with the aid of Google Earth Pro and the Image Processing toolbox available in MATLAB ® . Equations (6) to (10) were used to implement PCA and transform the dataset on to a 20-dimensional space.
Where, = covariance matrix, = expected value operation, � = vector representing the spectral band information of a pixel, = expected value operation 11 was maximum. To find the transformation matrix in PCA, a training set of 60 pixels was chosen, 20 from each class -soil, vegetation and water bodies. The training set was selected with the aid of Google Earth Pro and the Image Processing toolbox available in MATLAB ® . Equations (6) to (10) were used to implement PCA and transform the dataset on to a 20-dimensional space. The eigenvalues and the eigenvectors of the covariance matrix were considered in order to construct the transformation matrix. The eigenvectors corresponding to the largest 20 eigenvalues of the covariance matrix were stacked as rows to form the transformation matrix. Thus, the transformation matrix is given by equation (9). Using this transformation matrix, all pixels of the image were transformed to the new reduced dimensional space using the transformation given by equation (10).
Where, = vector representing the mean spectral information of all pixels = vector representing the mean spectral information of all pixels, = number of pixels in the training sample, = eigenvector of the covariance matrix, = eigenvalue of the covariance matrix.
The eigenvalues and the eigenvectors of the covariance matrix were considered in order to construct the transformation matrix. The eigenvectors corresponding to the largest 20 eigenvalues of the covariance matrix were stacked as rows to form the transformation matrix. Thus, the transformation matrix is given by equation (9). Using this transformation matrix, all pixels of the image were transformed to the new reduced dimensional space using the transformation given by equation (10). The eigenvalues and the eigenvectors of the covariance matrix were considered in order to construct the transformation matrix. The eigenvectors corresponding to the largest 20 eigenvalues of the covariance matrix were stacked as rows to form the transformation matrix. Thus, the transformation matrix is given by equation (9). Using this transformation matrix, all pixels of the image were transformed to the new reduced dimensional space using the transformation given by equation (10). The eigenvalues and the eigenvectors of the covariance matrix were considered in order to construct the transformation matrix. The eigenvectors corresponding to the largest 20 eigenvalues of the covariance matrix were stacked as rows to form the transformation matrix. Thus, the transformation matrix is given by equation (9). Using this transformation matrix, all pixels of the image were transformed to the new reduced dimensional space using the transformation given by equation (10).
Where, = eigenvalue of the covariance matrix The eigenvalues and the eigenvectors of the covariance matrix were considered in order to construct the transformation matrix. The eigenvectors corresponding to the largest 20 eigenvalues of the covariance matrix were stacked as rows to form the transformation matrix. Thus, the transformation matrix is given by equation (9). Using this transformation matrix, all pixels of the image were transformed to the new reduced dimensional space using the transformation given by equation (10).
= vector representing the mean spectral information of all pixels, = number of pixels in the training sample, = eigenvector of the covariance matrix, = eigenvalue of the covariance matrix.
The eigenvalues and the eigenvectors of the covariance matrix were considered in order to construct the transformation matrix. The eigenvectors corresponding to the largest 20 eigenvalues of the covariance matrix were stacked as rows to form the transformation matrix. Thus, the transformation matrix is given by equation (9). Using this transformation matrix, all pixels of the image were transformed to the new reduced dimensional space using the transformation given by equation (10). The eigenvalues and the eigenvectors of the covariance matrix were considered in order to construct the transformation matrix. The eigenvectors corresponding to the largest 20 eigenvalues of the covariance matrix were stacked as rows to form the transformation matrix. Thus, the transformation matrix is given by equation (9). Using this transformation matrix, all pixels of the image were transformed to the new reduced dimensional space using the transformation given by equation (10).  The eigenvalues and the eigenvectors of the covariance matrix were considered in order to construct the transformation matrix. The eigenvectors corresponding to the largest 20 eigenvalues of the covariance matrix were stacked as rows to form the transformation matrix. Thus, the transformation matrix is given by equation (9). Using this transformation matrix, all pixels of the image were transformed to the new reduced dimensional space using the transformation given by equation (10). In this 20-dimensional space, the direction of each vector characterises the spectral behaviour of the pixel. However, the magnitude of the vector could vary due to the effects of the environment, sensing equipment, etc. Therefore, the directions of the vectors were considered as the basis for the classification task. In order to normalise the magnitude, each vector was divided by its Euclidean norm using equation (11). Now all the pixels have a unit magnitude and as mentioned above, the spectral characteristics of each pixel are denoted by the direction of the vector representation in the 20-dimensional space.  (13). The unmixed pixels were se largest affinity percentage as in equation (14).
= unit vector representing the reference spectral signature of water Although the distance vector was calculated, the requirement is to obtain a measure of affinity. Thus, the reciprocal of the distance vector was considered and an affinity vector was calculated, which represented the percentage affinity of a given pixel to the reference spectral characteristics of the three classes using equation (13). The unmixed pixels were selected using a threshold for the largest affinity percentage as in equation (14).

September 2019
Journal of the National Science Foundation of Sri Lanka 47 (3) 14 the three classes using equation (13). The unmixed pixels were selected using a threshold for the largest affinity percentage as in equation (14).
Where, = percentage affinity vector, = maximum affinity percentage of the affinity vector, and the other terms are the same as before.
The classification criteria used was,  If ≥ 50 %, the considered pixel was classified under the class, which has the maximum affinity.
Where, = percentage affinity vector, = maximum affinity percentage of the affinity vector, and the other terms are the same as before.
The classification criteria used was,  If ≥ 50 %, the considered pixel was classified under the class, which has the maximum affinity.
... (14) Where, Where, = percentage affinity vector, = maximum affinity percentage of the affinity vector, and the other terms are the same as before.
The classification criteria used was,  If ≥ 50 %, the considered pixel was classified under the class, which has the maximum affinity.
= percentage affinity vector Where, = percentage affinity vector, = maximum affinity percentage of the affinity vector, and the other terms are the same as before.
The classification criteria used was,  If ≥ 50 %, the considered pixel was classified under the class, which has the maximum affinity.
= maximum affinity percentage of the affinity vector and the other terms are the same as before.
The classification criteria used were,  If 50 %, the considered pixel was classified under the class, which has the maximum affinity.  If 50 %, the pixel was labelled as a 'mixed pixel' and the percentage of each end member (soil, vegetation and water) was obtained by the percentage affinity vector.
Figure 2(b) shows the classified image according to the above criteria. It is clearly seen that these results show an accurate correlation with the true colour version of the image depicted in Figure 1(c). Using the above criteria, the pure unmixed soil pixels were determined. These pixels were noted in order to perform the next step of the algorithm.

Identification of probable ilmenite deposits
The coastline along the Northeastern region of Sri Lanka around Pulmudai is rich in mineral sand, especially ilmenite. With the intention of identifying probable ilmenite deposits using hyperspectral images obtained by the Hyperion sensor of the Earth Observing -1 satellite, an algorithm was developed using correlation coefficient analysis and Fisher's discriminant analysis (FDA). The accuracy of the results was further enhanced by utilising a probability-based approach dependent on the spatial distribution of the ilmenite deposit.

Correlation analysis
According to Premaratne & Rowson (2003), the mineral percentages of the soil in the coast around Pulmudai, which is our target region of analysis is as in Table 2.
Then, a reference signature, which is shown in Figure 3, was constructed to represent the soil, which probably would consist of ilmenite. The construction of the reference spectral signature was done by obtaining the lab spectral signatures of the constituents from USGS spectral library (Kokaly et al., 2017) and by weighing Journal of the National Science Foundation of Sri Lanka 47 (3) September 2019 each of the spectra by their respective percentages depicted in Table 2, after normalising to remove any biases in the signatures.
the Hyperion sensor. The SNR of the Hyperion sensor depicted in Figure 3(b) shows that the SNR fluctuates with the wavelength whilst having significantly low SNR values at certain ranges of wavelengths. As the data in the regions of low SNR values would not yield useful information and would distort the results, the data in the regions of SNR values below 50 were removed from both the Hyperion dataset as well as the reference signature created.
Ilmenite has unique spectral characteristics, which is explicitly seen even in compounds consisting of ilmenite. This can be clearly seen when comparing the spectral signatures of various types of minerals available in the USGS spectral library (Kokaly et al., 2017). Hence, in the process of obtaining a training set of pixels for ilmenite deposits in the region, first, the pixels containing soil were isolated using the classification algorithm  discussed in the earlier section of this paper. Then, a correlation coefficient analysis was carried out between the created reference spectral signature and the pure soil pixels classified by the aforementioned algorithm, using equation (15).
Ilmenite has unique spectral characteristics, which is explicitly seen even in compounds consisting of ilmenite. This can be clearly seen when comparing the spectral signatures of various types of minerals available in the USGS spectral library (Kokaly et al., 2017). Hence, in the process of obtaining a training set of pixels for ilmenite deposits in the region, first, the pixels containing soil were isolated using the classification algorithm discussed in the earlier section of this paper. Then, a correlation coefficient analysis was carried out between the created reference spectral signature and the pure soil pixels classified by the aforementioned algorithm, using the equation (15). Where,

18
Ilmenite has unique spectral characteristics, which is explicitly seen even in compounds consisting of ilmenite. This can be clearly seen when comparing the spectral signatures of various types of minerals available in the USGS spectral library (Kokaly et al., 2017). Hence, in the process of obtaining a training set of pixels for ilmenite deposits in the region, first, the pixels containing soil were isolated using the classification algorithm discussed in the earlier section of this paper. Then, a correlation coefficient analysis was carried out between the created reference spectral signature and the pure soil pixels classified by the aforementioned algorithm, using the equation (15). = normalised matrix of which the rows correspond to the spectral data of the soil pixel Ilmenite has unique spectral characteristics, which is explicitly seen even in compounds consisting of ilmenite. This can be clearly seen when comparing the spectral signatures of various types of minerals available in the USGS spectral library (Kokaly et al., 2017). Hence, in the process of obtaining a training set of pixels for ilmenite deposits in the region, first, the pixels containing soil were isolated using the classification algorithm discussed in the earlier section of this paper. Then, a correlation coefficient analysis was carried out between the created reference spectral signature and the pure soil pixels classified by the aforementioned algorithm, using the equation (15). = correlation coefficient between the reference spectrum and the spectrum of the i th soil pixel Ilmenite has unique spectral characteristics, which is explicitly seen even in compounds consisting of ilmenite. This can be clearly seen when comparing the spectral signatures of various types of minerals available in the USGS spectral library (Kokaly et al., 2017). Hence, in the process of obtaining a training set of pixels for ilmenite deposits in the region, first, the pixels containing soil were isolated using the classification algorithm discussed in the earlier section of this paper. Then, a correlation coefficient analysis was carried out between the created reference spectral signature and the pure soil pixels classified by the aforementioned algorithm, using the equation (15).

= normalised reference spectrum
Ilmenite has unique spectral characteristics, which is explicitly seen even in compounds consisting of ilmenite. This can be clearly seen when comparing the spectral signatures of various types of minerals available in the USGS spectral library (Kokaly et al., 2017). Hence, in the process of obtaining a training set of pixels for ilmenite deposits in the region, first, the pixels containing soil were isolated using the classification algorithm discussed in the earlier section of this paper. Then, a correlation coefficient analysis was carried out between the created reference spectral signature and the pure soil pixels classified by the aforementioned algorithm, using the equation (15).

= mean and standard deviation of the i th soil pixel
Ilmenite has unique spectral characteristics, which is explicitly seen even in compounds consisting of ilmenite. This can be clearly seen when comparing the spectral signatures of various types of minerals available in the USGS spectral library (Kokaly et al., 2017). Hence, in the process of obtaining a training set of pixels for ilmenite deposits in the region, first, the pixels containing soil were isolated using the classification algorithm discussed in the earlier section of this paper. Then, a correlation coefficient analysis was carried out between the created reference spectral signature and the pure soil pixels classified by the aforementioned algorithm, using the equation (15).

= mean of the reference spectrum
Ilmenite has unique spectral characteristics, which is explicitly seen even in compounds consisting of ilmenite. This can be clearly seen when comparing the spectral signatures of various types of minerals available in the USGS spectral library (Kokaly et al., 2017). Hence, in the process of obtaining a training set of pixels for ilmenite deposits in the region, first, the pixels containing soil were isolated using the classification algorithm discussed in the earlier section of this paper. Then, a correlation coefficient analysis was carried out between the created reference spectral signature and the pure soil pixels classified by the aforementioned algorithm, using the equation (15). = standard deviation of the reference spectrum K = index of the spectral bands n = number of spectral bands Thereafter, based on the correlation values obtained, 20 soil pixels that were highly correlated with the reference spectral signature were taken as the training set of ilmenite deposits, and the 20 least correlated pixels were taken as the training set for non-ilmenite deposits. The spectral signatures of the training sets are shown in Figure 4, and it could be seen that the training sets of ilmenite deposits and non-ilmenite soil pixels are distinct.

Transforming to a new space
Increasing the separability of classes would ease classification in the process of identifying the probable ilmenite deposits. Hence, Fisher's discriminant analysis was used to transpose the pixels on to a new space in which the separability of the two classes -ilmenite and non-ilmenite, is increased. Fisher' discriminant analysis (Imani & Ghassemian, 2015;Sugiyama, 2016; was applied on the aforesaid training set using equations (16) to (19). The idea of using Fisher's discriminant analysis is to transform the dataset into a new space where the scatter between the two classes is maximised while the scatter within a class is minimised. The equations below were used to construct the Fisher's transformation matrix. ...(17) ...(18) Where, S b = between-class scatter matrix,  Where, S b = between-class scatter matrix μ = vector representing mean spectral information of all pixels μ i = vector representing mean spectral information of the pixels of class i n i = number of training pixels in class i l = number of classes S w = within-class scatter matrix x j = vector representing the spectral information of a pixel in the training class i x i = vector representing the spectral information of a pixel in the training set n = total number of training pixels Since it is required to minimise the within-class scatter and increase the between-class scatter, the transformation matrix was constructed by considering the eigenvalues and eigenvectors of the Fisher matrix given by equation (20). Using these eigenvectors, data were transformed into a new space with reduced dimensionality using equations (21) and (22). In order to classify the pixels into the two classes, the spectral similarity of each pixel with the mean spectral signature of the training set of each class was calculated using equation (24).
Where, x represents a pixel vector and |x| represents its magnitude.
Where, �� = spectral similarity of the i th pixel's spectral signature and the k th class (k = 1 for ilmenite and k = 2 for non-ilmenite), _ � ( , : ) = the normalized spectral signature of the i th pixel, _ � = mean reference spectral signature of the training set of the k th class.
Finally, each pixel was labeled as ilmenite or non-ilmenite based on the spectral similarity measure, . If the spectral similarity between a certain pixel and the mean reference spectral signature of ilmenite class was higher than the spectral similarity between that particular pixel and the mean reference spectral signature of non-ilmenite class, then that pixel was classified under 'ilmenite', and vice-versa.

Probability-based approach
In the Pulmudai region, ilmenite has a larger probability to be found in the coastal area. Statistically, there is a larger probability within 600m from the coast, and from there onwards, the probability of finding ilmenite falls gradually. Thus, a probability density function was approximated in such a way that the probability of finding ilmenite was constant until 600m. From there onwards, the probability was approximated to fall exponentially as shown in Figure 5. The spatial information used to generate the probability density function can be found in Overstreet (1972).
= spectral similarity of the i th pixel's spectral signature and the k th class (k = 1 for ilmenite and k = 2 for non-ilmenite) Where, �� = spectral similarity of the i th pixel's spectral signature and for ilmenite and k = 2 for non-ilmenite), _ � ( , : ) = the normalized spectral signature of the i th pixel, _ � = mean reference spectral signature of the training set of th Finally, each pixel was labeled as ilmenite or non-ilmenite based on the spect . If the spectral similarity between a certain pixel and the mean reference ilmenite class was higher than the spectral similarity between that particula reference spectral signature of non-ilmenite class, then that pixel was classifie vice-versa.

Probability-based approach
In the Pulmudai region, ilmenite has a larger probability to be found in the coa there is a larger probability within 600m from the coast, and from there onwa finding ilmenite falls gradually. Thus, a probability density function was appro that the probability of finding ilmenite was constant until 600m. From there on was approximated to fall exponentially as shown in Figure 5. The spatial inform the probability density function can be found in Overstreet (1972).
= the normalized spectral signature of the i th pixel Where, �� = spectral similarity of the i th pixel's spectral signature and the for ilmenite and k = 2 for non-ilmenite), _ � ( , : ) = the normalized spectral signature of the i th pixel, _ � = mean reference spectral signature of the training set of the k Finally, each pixel was labeled as ilmenite or non-ilmenite based on the spectral . If the spectral similarity between a certain pixel and the mean reference sp ilmenite class was higher than the spectral similarity between that particular p reference spectral signature of non-ilmenite class, then that pixel was classified u vice-versa.

Probability-based approach
In the Pulmudai region, ilmenite has a larger probability to be found in the coasta there is a larger probability within 600m from the coast, and from there onwards finding ilmenite falls gradually. Thus, a probability density function was approxim that the probability of finding ilmenite was constant until 600m. From there onwa was approximated to fall exponentially as shown in Figure 5. The spatial informati the probability density function can be found in Overstreet (1972).
= mean reference spectral signature of the training set of the k th class Figure 7: Block diagram of steps followed to extract ilmenite from the soil samples The remaining soil samples were inserted to the FRANTZ Magnetic Separator Model L-1 (S. G. FRANTZ CO, 2018) while the current was controlled at 0.4A. Ilmenite was distinctly isolated by the separator.
A hand magnet was used to separate magnetite mineral from the obtained soil samples, as the magnetic separator used for ilmenite extraction at the next stage cannot be used with soil samples containing magnetite mineral.
After drying the samples, they were passed through a sieve to remove the larger soil particles.
The soil samples were thoroughly washed with distilled water to remove the impurities.
Journal of the National Science Foundation of Sri Lanka 47 (3) September 2019 Finally, each pixel was labelled as ilmenite or nonilmenite based on the spectral similarity measure, α. If the spectral similarity between a certain pixel and the mean reference spectral signature of ilmenite class was higher than the spectral similarity between that particular pixel and the mean reference spectral signature of nonilmenite class, then that pixel was classified under 'ilmenite', and vice versa.

Probability-based approach
In the Pulmudai region, ilmenite has a larger probability to be found in the coastal area. Statistically, there is a larger probability within 600 m from the coast, and from there onwards, the probability of finding ilmenite falls gradually. Thus, a probability density function was approximated in such a way that the probability of finding ilmenite was constant until 600m. From there onwards, the probability was approximated to fall exponentially as shown in Figure 5. The spatial information used to generate the probability density function can be found in Overstreet (1972).
Using this novel probability-based method, the classification was done while considering the probability of finding ilmenite in each pixel as well. Thus, it is expected to obtain results with more bias towards the coastal region of the geographical area and less bias as the distance increases from the coast.
In order to classify the pixels in the region into the two classes, using this novel probability-based approach, a similar method was used as earlier, but with a slight modification. This time, the probability to find ilmenite at each pixel was also considered. First, as in equation (24), the spectral similarity measure, α was calculated. Then equation (25) was used to calculate the novel joint measure.

23
Figure 5 -Probability density function of the probability of finding ilmenite against the distance measured from the coast Using this novel probability-based method, the classification was done while considering the probability of finding ilmenite in each pixel as well. Thus, it is expected to obtain results with more bias towards the coastal region of the geographical area and less bias as the distance increases from the coast.
Using this novel probability-based approach, in order to classify the pixels in the region into the two classes, a similar method was used as earlier, but with a slight modification. This time, the probability to find Ilmenite at each pixel was also considered. First, as in equation (24), the spectral similarity measure, α was calculated. Then equation (25) was used to calculate the novel joint measure.
Where, β �� = A joint measure of the spectral similarity of the i th pixel's spectral signature and the k th class (k = 1 for ilmenite and k = 2 for non-ilmenite) and the probability of finding Ilmenite on the region denoted by the i th pixel ... (25) Where,

23
Figure 5 -Probability density function of the probability of finding ilmenite against the distance measured from the coast Using this novel probability-based method, the classification was done while considering the probability of finding ilmenite in each pixel as well. Thus, it is expected to obtain results with more bias towards the coastal region of the geographical area and less bias as the distance increases from the coast.
Using this novel probability-based approach, in order to classify the pixels in the region into the two classes, a similar method was used as earlier, but with a slight modification. This time, the probability to find Ilmenite at each pixel was also considered. First, as in equation (24), the spectral similarity measure, α was calculated. Then equation (25) was used to calculate the novel joint measure.
Where, β �� = A joint measure of the spectral similarity of the i th pixel's spectral signature and the k th class (k = 1 for ilmenite and k = 2 for non-ilmenite) and the probability of finding Ilmenite on the region denoted by the i th pixel = A joint measure of the spectral similarity of the i th pixel's spectral signature and the k th class (k = 1 for ilmenite and k = 2 for nonilmenite) and the probability of finding ilmenite on the region denoted by the i th pixel Figure 5 -Probability density function of the probability of finding ilmenite against the distance measured from the coast Using this novel probability-based method, the classification was done while considering the probability of finding ilmenite in each pixel as well. Thus, it is expected to obtain results with more bias towards the coastal region of the geographical area and less bias as the distance increases from the coast.
Using this novel probability-based approach, in order to classify the pixels in the region into the two classes, a similar method was used as earlier, but with a slight modification. This time, the probability to find Ilmenite at each pixel was also considered. First, as in equation (24), the spectral similarity measure, α was calculated. Then equation (25) was used to calculate the novel joint measure.
Where, β �� = A joint measure of the spectral similarity of the i th pixel's spectral signature and the k th class (k = 1 for ilmenite and k = 2 for non-ilmenite) and the probability of finding Ilmenite on the = spectral similarity of the i th pixel's spectral signature and the k th class (k = 1 for ilmenite and k = 2 for non-ilmenite) to find Ilmenite at each pixel was also considered. First, as in equatio measure, α was calculated. Then equation (25) Where, β �� = A joint measure of the spectral similarity of the i th pixel's spectra = 1 for ilmenite and k = 2 for non-ilmenite) and the probabili region denoted by the i th pixel = Probability of finding ilmenite on the region denoted by the i th pixel Finally, each pixel was labelled as ilmenite or nonilmenite based on the joint measure, β. If the joint measure between a certain pixel and the mean reference spectral signature of the class, ilmenite was higher than the joint measure between that particular pixel and the mean reference spectral signature of the class, nonilmenite, then that pixel was classified under 'ilmenite', and vice-versa.

RESULTS AND DISCUSSION
The results generated using spectral similarity measure, α was mapped as shown in Figure 6(a). The results generated using the novel probability-based algorithm considering the joint measure, β was mapped as shown in Figure 6(b). As expected, the probability-based approach based on the spatial distribution of the ilmenite deposit has given more bias towards the coastal region of the geographical area and less bias as the distance increases from the coast. Thus, the results obtained from the probability-based approach were taken into consideration for validation.
In order to check whether the identified pixels actually contained ilmenite, soil samples were collected from a few locations recognised by the probability-based algorithm. A total of eight locations were selected from the map in Figure 6(b), and their soil samples were taken for lab testing. The steps depicted in Figure 7 were followed during the testing process.
The weight measurements taken during the testing were used to calculate the ilmenite percentage of each sample. The results are shown in Table 3.
The areas, which were not identified to exist ilmenite from the probability-based algorithm showed no visual observations of existence of ilmenite. Thus, soil samples from those areas were not taken for laboratory testing.
Hyperspectral imaging is an emerging technique used for feature extraction and classification using hyperspectral image data. This stands out from other image processing techniques due to its ability to identify fine features of an image, which is not detectable by other image processing techniques. This is mainly used in land cover mapping, facial recognition, agriculture, and military applications.

September 2019
Journal of the National Science Foundation of Sri Lanka 47 (3) In this paper, the main focus was to develop a universal algorithm to detect ilmenite mineral based on the unique spectral characteristics of the areas where the particular mineral is located. Based on that, an algorithm was built and the lab testing proves that the locations detected by the novel algorithm actually contained ilmenite.
The classification algorithm used in classifying the pixels containing soil was a combination of PCA and comparison of affinity based on Euclidean distance. The classification process resulted in an image similar to the true RGB image. It could further be concluded that the part of the image of a strip in Northeastern Sri Lanka is mainly covered by soil, followed by foliage. Water bodies are much less in the region of consideration.
In the second part of the algorithm, the soil only pixels were solely taken into consideration. A reference spectral signature for ilmenite deposits was created using lab data and it was compared with the pixel signatures of the image. Using a correlation coefficient analysis, a training set was formed by including the highest correlated and the lowest correlated soil only pixels. Then Fisherʼs discriminant analysis was performed to further separate the two classes -ilmenite, and non-ilmenite and the classification was done by comparing the spectral similarity of each pixel with the mean spectral signatures of the two classes. In order to further enhance the results, a probability-based approach was utilised. A probability density function was designed to determine the probability of finding ilmenite with the distance Journal of the National Science Foundation of Sri Lanka 47 (3) September 2019 measured from the coast. Results were generated taking the probability of finding ilmenite into account as well.
Eight locations were chosen from those results and soil samples were collected for lab testing, which ultimately proved that all those eight locations contained ilmenite.
A mineral sands deposit, which is suitable for mineral extraction typically has a heavy mineral grade (HM grade) ranging from 0.5 % to above 20 %. It was observed that all eight locations considered for testing contained ilmenite within that range. Some locations contained very high relative percentages of ilmenite (sample no. 3 -12.27 % and sample no. 4 -10.73 % as seen in Table 3). These amounts are suitable for the extraction of the mineral for commercial use. Thus, this indicates that the algorithm established gives accurate results in finding probable ilmenite deposits, which could be used for commercial purposes.
Prior evidence suggested of an ilmenite deposit in the Pulmudai region of Sri Lanka. Thus, the study was based on that particular region only. In addition, due to the lack of hyperspectral databases, the number of satellite hyperspectral images covering the areas in Sri Lanka are minimal. This prevented us from looking into the adjacent areas for the possibility of the existence of ilmenite.

CONCLUSION
The objective of developing an algorithm to detect ilmenite deposits in Pulmudai region using hyperspectral images is successful. This novel algorithm, which is based on feature reduction and classification methods, signal processing techniques and statistical and mathematical tools, proves to give highly accurate results. Lab testing performed on the collected soil samples further validates the results.