A multivariate approach for developing a dichotomous key for identification and differentiation of Puntius (Osteichthyes: Cyprinidae) species in Sri Lanka

Sri Lanka is a global biodiversity hotspot with a rich freshwater fish fauna. Of a total of 82 freshwater fish species, 16 (19.5%) belong to the genus Puntius. Ambiguities exist in the taxonomic identification of the different Puntius species. Hence, in this study a dichotomous key was developed using morphometric and meristic characters to identify and differentiate the Puntius species. Altogether 421 specimens representing different Sri Lankan Puntius species were collected from 38 sites at four different altitude ranges in five major river basins in Sri Lanka. Fifteen meristic characters, four coded variables and twenty-three morphometric characters were recorded for each specimen, and the characters were analysed using principal component analysis (PCA). Six principal components, which explained 81.5% of the cumulative variance in the dataset, were extracted for meristic characters and coded variables. Two meristic characters (number of transverse scales and number of post dorsal scales) and four coded variables (nature of the lateral line, position of mouth, number of barbels and nature of dorsal fin spines) contributed most to the variance of the six principal components identified. These six characters were sufficient in isolation to develop a dichotomous key for all but two species. Two principal components extracted for morphometric characters alone were also able to differentiate Puntius species, but not to the same level as meristic characters, and they therefore contributed less to the dichotomous key developed here. Based on this approach, 15 Puntius species could be differentiated unambiguously.

Keywords: Meristic, morphometric, principal component analysis, Puntius.

doi: 10.4038/jnsfsr.v38i1.1722
J. Natn. Sci. Foundation Sri Lanka 2010 38 (1): 15-27


RESEARCH ARTICLE
and clarified the taxonomic status of some Puntius species where previously ambiguities were present in their identification 4,22-24.
However, there are ambiguities in the identification and differentiation of Puntius spp. Currently, identification of Puntius spp. is based on several characters that incorporate external morphology, morphometric and meristic characters 3,4,19,21. In most instances identification at the species level is based on a few specimens that may not adequately represent all the intra-specific variation present. Some characters (i.e. osteological characters) are difficult to obtain in a short time period, and recording them damages the specimen. Colour patterns and markings on the body may fade or may not be clearly seen in preserved specimens. These problems have led to misidentification of Puntius species; hence, the identification of a rigorous set of characters free of these shortcomings would improve the taxonomy of this important group of fishes. Adopting an approach that screens morphological and meristic characters using large sample sizes and employing appropriate statistical analyses should assist in the identification and discrimination of extant species.
The current study therefore aimed to develop a dichotomous key for Sri Lankan Puntius species, to identify and discriminate species using a set of characters with easily identifiable, non overlapping scores that could be recorded within a short time period with high precision.

METHODS AND MATERIALS
A total of 421 fish from 15 described species of Puntius (of the 16 species recorded in Sri Lanka) were sampled from 38 sites in 5 major river basins from March 2004 to November 2006 (Figure 1). Fish were caught using gape nets, cast nets and scoop nets. Where particular species were considered to be highly endangered or rare, only 2 or 3 individuals were taken for analysis. Specimens were identified in the field to species level using external morphological characters (colour patterns, specific morphological traits and body shape) 4. Samples were then preserved in 70% alcohol. Additional identification was undertaken in the laboratory using standard fish keys and guides 3,4,19,21. A total of 42 characters, of which 23 represented morphometric measurements (Figure 2), 15 represented meristic traits and 4 represented coded variables, were scored for all 15 species (Tables 2a & 2b). Particular measurements were made by a single individual to minimize scoring errors. Linear measurements were made using vernier calipers to the nearest 0.01 mm. A stereo microscope (Wild M5A) and a hand lens were used to determine meristic counts and to score coded variables.

Table fragment (caption, column headings and leading rows not recovered):
(species name not recovered); 2; Common *
Puntius cumingii; 9; + +; Common
Puntius dorsalis; 10; Common
Puntius singhala; 11; Common
Puntius martenstyni; 7; + +; Rare
Puntius nigrofasciatus; 9; + +; Not yet rare
Puntius pleurotaenia; 8; + +; Common
Puntius sarana; 2; ¬; Common
Puntius srilankensis; 12; + +; Very rare
Puntius ticto; 2; Common
Puntius titteya; 13; + +; Common
Puntius vittatus; 14; Common
Analyses were carried out separately for morphometric and meristic characters as these variables differ from both statistical (morphometric characters are continuous and meristic characters are discrete) and biological (morphometric characters can be susceptible to environmental factors, while most meristic characters are fixed early during development) perspectives 25. Coded characters were converted to a discrete form and included with the meristic characters in the analysis (Table 2b).
As body measurements are often strongly correlated with body length, all morphometric variables were standardized for individual size, for each species separately, using the following equation 26.
Where Y′_i = size corrected morphometric variable value for the i-th fish, Y_i = original value, L_i = standard length of the i-th fish, X = mean standard length for that group of fish and b = slope of the regression of log10 Y on log10 L_i for the fish group (species) considered. Effectiveness of the standardization was checked by correlation analysis of each variable with standard length, and no correlation was observed with individual length.
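The size correction can be illustrated with a short sketch. It assumes the widely used allometric form Y′_i = Y_i (X / L_i)^b, which matches the symbol definitions given here; that form is an assumption and is not quoted from the cited source. The function names and the measurements are hypothetical and serve only as an illustration.

```python
import math


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def size_correct(values, lengths):
    """Assumed allometric correction: Y'_i = Y_i * (mean(L) / L_i) ** b,
    where b is the slope of log10(Y) regressed on log10(L) within one species."""
    log_l = [math.log10(v) for v in lengths]
    log_y = [math.log10(v) for v in values]
    b = ols_slope(log_l, log_y)
    mean_length = sum(lengths) / len(lengths)
    return [y * (mean_length / l) ** b for y, l in zip(values, lengths)]


# Hypothetical standard lengths and body depths (mm) for one species.
standard_length = [30.0, 35.0, 42.0, 50.0, 61.0, 75.0]
body_depth = [9.1, 10.4, 12.9, 15.2, 18.0, 23.1]

corrected = size_correct(body_depth, standard_length)

# On the log scale the corrected values are uncorrelated with length.
residual_r = pearson([math.log10(v) for v in standard_length],
                     [math.log10(v) for v in corrected])
```

Because b is a least-squares slope fitted on the log scale, the corrected values are uncorrelated with standard length on that scale by construction, which is the property the correlation check in the text relies on.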
Meristic characters are commonly determined early during development and have often been reported as being independent of individual size 27-29. The relationship of each meristic character with total length of the individual was analysed separately for each species, and the results showed that the majority of characters were not significantly correlated. A few characters did show significant correlation with total individual length, but these characters varied across species. As the present study was focused on inter-specific variation, intra-specific correlations of a small number of variables were not considered to be important.
The general aim of the current study was to identify sets of characters that could differentiate individual Puntius species. As a univariate approach cannot address any joint effect (interactions) of variables, each individual was considered to be a single multivariate observation in the analyses 30,31. Data used in the analyses were assessed using Principal Component Analysis (PCA) and consisted of pooled data from 42 variables on 15 species. Raw meristic data together with coded character data, and size adjusted morphometric data, were analysed separately according to the methodological and statistical steps suggested by George and Mallery 32. Component loadings were obtained by Varimax rotation with normalization. Variables with no variance, and those that contributed comparatively little variance to the extracted principal components (PCs), were excluded from the analysis. Maximum, minimum, mean and standard deviation of each meristic and morphometric variable were obtained for each species. All calculations were carried out using the statistical software SPSS version 16.
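The extraction step can be sketched as follows: eigenvalues of a correlation matrix are computed, components with eigenvalues above 1 are retained, and the share of variance each retained component explains is reported. This is a minimal illustration under stated assumptions and not the SPSS procedure used in the study: the eigenvalues come from a plain Jacobi rotation routine, Varimax rotation is not implemented, and the 3 x 3 correlation matrix is hypothetical.

```python
import math


def jacobi_eigenvalues(matrix, tol=1e-18, max_sweeps=50):
    """Eigenvalues of a symmetric matrix by cyclic Jacobi rotations."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) off-diagonal entry.
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


def retain_components(eigenvalues):
    """Keep components with eigenvalue > 1 and report % variance explained."""
    total = sum(eigenvalues)
    kept = [ev for ev in eigenvalues if ev > 1.0]
    return kept, [100.0 * ev / total for ev in kept]


# Hypothetical correlation matrix for three equally correlated characters.
correlation = [
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
]

eigenvalues = jacobi_eigenvalues(correlation)
kept, percent_explained = retain_components(eigenvalues)
```

For a correlation matrix the eigenvalues sum to the number of variables, so dividing each eigenvalue by that sum gives the proportion of variance the component explains.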

RESULTS
PCA was performed using 23 standardized morphometric measurements for each individual/species. The first two PCs, which possessed eigenvalues above 1, explained 99% of the cumulative variance. All morphometric variables had positive loadings on the first and second principal components (PC1 and PC2), which explained 94% and 5% of the variance, respectively (Table 3). According to previous studies 33, a component whose loadings (coefficients) all have the same sign is indicative of size variation, whereas a component with both positive and negative loadings is indicative of shape variation. As both PCs had positive component loadings in the present study, it could be concluded that they accounted for the size variation among species. High scores for PC1 were associated with position and size of the eye, maximum depth of the body, position of the anal fin, pelvic fin and ventral fin, length and depth of the caudal peduncle, spread of the caudal fin and length of the caudal fin. Total length and dorsal fin length provided the highest contribution to PC2.
Component scores of each individual fish obtained for PC1 and PC2 separated the 15 species in a two-dimensional matrix (Figure 3). P. nigrofasciatus, P. bimaculatus and P. chola individuals showed negative component scores for both PC1 and PC2, and their plots were highly separated from the rest of the sampled species, occupying three different positions in the plot. P. pleurotaenia and P. martenstyni had positive component scores for both PC1 and PC2, which also separated them from other species but grouped them closely together. P. titteya, P. ticto, P. bandula and P. vittatus also clustered in close proximity and formed a separate group.
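The sign rule applied above to the component loadings can be written as a small helper. This is a sketch of the rule of thumb only; the function name and the loading values are hypothetical and are not taken from Table 3.

```python
def classify_component(loadings):
    """Label a component 'size' if all non-zero loadings share one sign,
    otherwise 'shape' (mixed positive and negative loadings)."""
    signs = {value > 0 for value in loadings if value != 0}
    return "size" if len(signs) == 1 else "shape"


# Hypothetical loadings for illustration only.
all_positive = [0.91, 0.88, 0.95, 0.79]
mixed_signs = [0.62, -0.48, 0.15, -0.33]

labels = [classify_component(all_positive), classify_component(mixed_signs)]
```

Under this rule a component with uniformly negative loadings is also read as a size component, since the overall sign of a principal component is arbitrary.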

Journal of the National Science Foundation of Sri Lanka 38 (1), March 2010

Separation of species in the biplot was mainly influenced by the length and depth of the body. Individuals possessing small values for maximum body depth (MBW) and large values for total length (TL), width of the caudal fin when fully spread (CSPR), caudal peduncle height (HCPD) and distance from the end of the dorsal fin to the end of the caudal peduncle (LCPD) represented slender/longer bodied species and grouped above the PC1 axis. Individuals with high values for MBW and small values for TL, CSPR, HCPD and LCPD represented deeper/shorter bodied species and grouped below the PC1 axis (Figure 3). A Spearman rank correlation between component scores (for PC1 and PC2) and the standardized total length of each individual also indicated that the separation of species was highly dependent on the individual size of fish (r = 0.799, p < 0.001 and r = -0.631, p < 0.001 for PC1 and PC2, respectively). This separation was not sufficient to discriminate between most species. A comparative analysis of the maximum, minimum and mean values of morphometric variables among species (not shown) showed that the values overlapped in most species.
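A Spearman rank correlation such as the one reported here is a Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below illustrates the calculation in plain Python; the function names and the scores are hypothetical and do not reproduce the study's data.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical component scores and standardized total lengths.
pc1_scores = [-1.2, -0.4, 0.1, 0.9, 1.6]
total_length = [31.0, 38.5, 44.0, 52.5, 60.0]

rho = spearman(pc1_scores, total_length)
```

The coefficient is bounded between -1 and 1, and its sign gives the direction of the monotonic relationship between component score and length.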
A similar analysis was performed with meristic characters. The number of pelvic fin spines was removed from the analysis because it did not vary among species, while the number of pelvic fin rays (pfr), number of dorsal fin rays (dfr) and number of anal fin spines (afs) were removed due to comparatively low variance contributions.
Six PCs (eigenvalues above 1) were obtained, with 81.5% of the cumulative variance explained (Table 4). PC1 explained 34% of the total variance. The variables that contributed most to PC1 were the number of lateral line scales (lls), number of barbels (nb), number of pre dorsal scales (prds) and number of post dorsal scales (psds). PC2 explained 14.5% of the variance, with nature of lateral line (nll) showing a negative component loading and number of dorsal fin spines (dfs), number of scales around the caudal peduncle (cped) and number of dorsal fin scales (dfsc) showing positive loadings. The remaining four principal components contributed the remaining 33% of the variance.
A plot of component scores obtained for each individual for PC1 and PC2 showed marked separation of the fifteen species in two-dimensional space. Number of lateral line scales determined the grouping of species on the negative and positive sides of the plot along the PC1 axis (Figure 4). An incomplete lateral line characterised the species grouped in the lower quarter of the left side of the biplot. Number of barbels contributed to PC1 and determined the grouping of species between the left and right sides above the PC1 axis. Accordingly, P. titteya, P. vittatus and P. cumingii possessed an incomplete lateral line and lower scores for lls, psds and prds, and were grouped in the lower negative quarter of the biplot. P. nigrofasciatus, P. srilankensis, P. chola and P. dorsalis, which possessed one pair of barbels, lower scores for lls and a complete lateral line, were grouped in the upper quarter of the left side (Figure 4). P. bandula and P. ticto were also grouped with them, but these two species possess an incomplete lateral line. P. sarana, P. pleurotaenia and P. martenstyni possessed two pairs of barbels, a terminal mouth position, a complete lateral line and high scores for lls, psds and prds, and were grouped in the upper quarter on the positive side of the biplot. P. bimaculatus and P. asoka, which also had high scores for lls, a complete lateral line and high psds, were placed in the lower quarter on the positive side due to the sub terminal position of the mouth and a single pair of barbels. The positions of P. dorsalis, P. sarana and P. singhala were scattered due to high variation in some meristic characters (Figure 4).
Comparison of descriptive statistics for meristic characters (maximum, minimum, mean and standard deviation) and coded variables showed that there were noticeable differences for certain variables among the fifteen species sampled (Tables 5 and 6). Variables with clear differences between species contributed the major part of the variance in determining the six principal components (Table 5). This outcome was not evident, however, for morphometric variables, with most characters showing overlapping values among species. Number of barbels showed a factor loading of 0.86 for PC1, while psds and number of transverse scales (tr) showed factor loadings of 0.82 and 0.53 for PC1 and PC5 and explained 34% and 14.5% of the variance, respectively. Nature of lateral line (nll), position of mouth (pom) and nature of dorsal fin spines (ndfs) possessed comparatively high factor loadings for the second, fourth and fifth principal components, respectively. These six characters were therefore considered important for the differentiation of Puntius species. Number of lateral line scales (lls), number of scales around the caudal peduncle (cped), number of pre dorsal scales (prds), number of pre dorsal fin scales (dfsc), number of ventral fin rays (vfr), number of anal fin rays (afr) and number of caudal fin rays (cfr) also showed high factor loadings but were of only limited utility for developing the taxonomic key, as different species possessed overlapping values (Table 6). Length ratios (pre dorsal length/standard length, DFL/SL, and head length/standard length, HL/SL) also showed some utility for differentiating specific species where meristic characters were either similar or overlapped between species, which limited their use (Table 7).

with coded characters can be more effective than morphometric characters for differentiating 15 Puntius species (Figures 3 and 4). As meristic counts are discrete in nature, they were efficient for developing a dichotomous key for Puntius species in Sri Lanka as they gave sharp demarcations between individual species. Some meristic characters overlapped among species, however, and were therefore of limited use for distinguishing the species.
Of the 19 meristic characters included in the PCA (inclusive of coded characters), only six characters (nll, pom, ndfs, tr, nb and psds) were used in developing the dichotomous key, and these characters could differentiate the 15 Sri Lankan Puntius species successfully. The characters can be scored easily, are distinct and had non-overlapping ranges among species (Table 5). Two species (P. bimaculatus and P. dorsalis), however, could not be fully differentiated using meristic characters in isolation. Combination with the diagnostic morphological characters permitted full separation of all species. Length ratios were employed to remove individual size effects 37 and, in combination with the meristic characters, distinguished all species, and so were incorporated in the key (Table 7).
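A dichotomous key of this kind can be represented as a binary tree of couplets, each testing one character state. The sketch below shows the structure only: the couplets, character states and taxon labels ("Taxon A" to "Taxon D") are hypothetical placeholders and do not reproduce the key in Table 8. Only the character abbreviations nll, nb and ndfs are taken from the text.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Couplet:
    """One step of a dichotomous key: a yes/no test with two outcomes."""
    question: str
    test: Callable[[dict], bool]
    if_true: Union["Couplet", str]
    if_false: Union["Couplet", str]


def identify(specimen, key):
    """Walk the key until a taxon label is reached; return (label, steps)."""
    node, steps = key, 0
    while isinstance(node, Couplet):
        steps += 1
        node = node.if_true if node.test(specimen) else node.if_false
    return node, steps


# Hypothetical two-level key using character abbreviations from the text.
KEY = Couplet(
    "Lateral line complete?",
    lambda s: s["nll"] == "complete",
    Couplet("Two pairs of barbels?", lambda s: s["nb"] == 2,
            "Taxon A", "Taxon B"),
    Couplet("Dorsal fin spine serrated?", lambda s: s["ndfs"] == "serrated",
            "Taxon C", "Taxon D"),
)

specimen = {"nll": "complete", "nb": 2, "ndfs": "smooth"}
taxon, steps_used = identify(specimen, KEY)
```

Counting the couplets visited gives the number of steps needed for an identification, and a character state that varies within a species can be handled by letting the same taxon label appear at more than one leaf.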
In step 11 of Table 8, separation of P. bimaculatus and P. dorsalis was based on two morphological characters (DFL/SL and HL/SL) and one meristic character (number of transverse scales). HL/SL overlaps in the range 0.26-0.28 and DFL/SL in the range 0.14-0.16 in these two species (Table 7). A transverse scale count of 3.5/2.5 was recorded in three individuals of P. bimaculatus (N = 63) and six individuals of P. dorsalis (N = 53). A fish with a transverse scale count of 3.5/2.5 and overlapping scores for both DFL/SL and HL/SL therefore cannot be assigned to either species. Analysis of the data sets of P. bimaculatus and P. dorsalis indicates that the possibility of this overlap is low, because fish with 3.5/2.5 transverse scales recorded non-overlapping scores for HL/SL and/or post dorsal length/standard length (PDL/SL). Similarly, a fish with overlapping values for HL/SL and/or PDL/SL can be differentiated on the basis of non-overlapping scores for transverse scales. In general, therefore, overlapping values for all three characters are unlikely, and these characters, individually or in combination, can be used to separate the two species.
Formal description of new species is generally based on data from only a few specimens and hence is not able to represent all intra-specific variation. Intra-specific variation associated with geographical and environmental diversity is well documented in fishes 38,39. The comparatively large sample sizes per species (except for highly threatened or rare species) used here collected
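The reasoning in this paragraph amounts to a simple check: a specimen is ambiguous only if all three characters fall inside the shared ranges. The sketch below encodes the overlap ranges quoted in the text (HL/SL 0.26-0.28, DFL/SL 0.14-0.16, transverse scales 3.5/2.5). The function names and the specimen values are hypothetical, and the sketch reports only which characters are informative, not which species they indicate, since the species-specific ranges are in Table 7.

```python
# Shared (overlapping) ranges for P. bimaculatus and P. dorsalis, as quoted
# in the text; scores inside these ranges do not separate the two species.
HL_SL_OVERLAP = (0.26, 0.28)
DFL_SL_OVERLAP = (0.14, 0.16)
TR_OVERLAP = "3.5/2.5"


def in_range(value, bounds):
    """True if value lies within the closed interval bounds = (low, high)."""
    low, high = bounds
    return low <= value <= high


def informative_characters(hl_sl, dfl_sl, transverse_scales):
    """List the characters whose scores fall outside the shared ranges."""
    informative = []
    if not in_range(hl_sl, HL_SL_OVERLAP):
        informative.append("HL/SL")
    if not in_range(dfl_sl, DFL_SL_OVERLAP):
        informative.append("DFL/SL")
    if transverse_scales != TR_OVERLAP:
        informative.append("tr")
    return informative


def is_ambiguous(hl_sl, dfl_sl, transverse_scales):
    """A specimen is ambiguous only when no character is informative."""
    return not informative_characters(hl_sl, dfl_sl, transverse_scales)


# Hypothetical specimens for illustration.
ambiguous_case = is_ambiguous(0.27, 0.15, "3.5/2.5")
resolved_by = informative_characters(0.30, 0.15, "3.5/2.5")
```

A specimen needs only one informative character to be keyed out at this step, which is why joint overlap in all three is the only failure case.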

DISCUSSION
Previous studies 34,35 have shown that morphometric characters are often more suitable than meristic characters for describing intra-specific differences. In another study Ihssen et al. 36

Data for meristic and coded characters obtained for the 15 Puntius species examined in the present study were broadly comparable with the majority of earlier studies 3,4,19,21 in the literature. A difference was evident, however, in counts for P. dorsalis, P. pleurotaenia and P. sarana (Table 8). This difference may result from intra-specific geographical variability across the distribution of the species or from the presence of sub species. Variation in morphological characters in Puntius species has been recorded with altitudinal differences in Sri Lanka 40. The presence of sub species in Puntius species has also been reported 3. The description of P. singhala in the present study differed from earlier reports in showing that this species had a terminal mouth (Table 2b). Previous reports suggested that P. singhala had a sub terminal mouth 3,22. The diet of P. singhala consists of filamentous algae, crustaceans and diatoms 41, which are generally found in the water column, and column feeders are often characterised by a terminal mouth. Descriptions of mouth position can be subjective, however, so this may have contributed to the apparent inconsistency.
In the present study, P. singhala individuals were found with no barbels or with a single pair of barbels. According to previous studies 22, this species possesses a single pair of barbels, but the buccal area also contains many papillae. Therefore, individuals identified as possessing no barbels may possess a pair of rudimentary barbels that were concealed in the papillated area. In the present study P. martenstyni was recorded as possessing a serrated dorsal fin spine during its younger stages and a smooth dorsal fin spine when mature. Pethiyagoda 4 has also recorded this difference. Therefore, to avoid any misclassification, this variation was considered when developing the key, and there are two identification points each for P. singhala and P. martenstyni, marked with * in Table 8.
Apart from the 15 Puntius species considered in the present study, another species, P. amphibius, has been recorded in Sri Lanka. P. amphibius was not included in the current study, however, as specimens of this species were not found at any of the 38 sites sampled. P. amphibius was first recorded in 1912 by Duncker and was listed as a freshwater species in Sri Lanka 4,19,21. According to recent studies 23, P. amphibius is not found in Sri Lanka but has been misidentified by different authors because it possesses similar morphology to other Puntius species.
The dichotomous key developed in this study shows similarities with a key developed by Deraniyagala 19 to identify Puntius species in Sri Lanka. The endemic species P. bandula, P. srilankensis, P. martenstyni and P. asoka had not been recorded at that time. Nature of lateral line, ndfs, pom and tr were the main characters used by Deraniyagala 19 to develop his key. In the present study, these characters were among the main characters that contributed to the principal components and to separating taxa, and were therefore important for developing the new key. In addition, markings (bands and spots of different shapes and sizes) on the body were also considered by Deraniyagala 19. These characters, though important in the identification of fresh or live specimens, can be lost or modified when specimens are preserved, and they were not considered here.
In multivariate morphological comparisons there are two independent components, namely size and shape. Species grouped in general in the left half of the PCA biplot possessed comparatively deeper and shorter bodies, and species that grouped in the right half possessed slender and longer bodies. The two different size morphs showed (Figure 4) two different body forms: fusiform (slender and long) and ovate (deeper and short). This shape variation may result from adaptation to the different aquatic habitats they occupy 42-44 and also to their feeding habits 4,45.
Although morphometric variables had less power to differentiate the Puntius species when compared with meristic characters, they could differentiate the 15 species to a considerable level (Figure 3). Accordingly, P. pleurotaenia, P. martenstyni and P. bimaculatus grouped above the PC1 axis and formed the slender and long bodied group. P. ticto, P. titteya, P. vittatus, P. nigrofasciatus and P. chola grouped below the PC1 axis and formed the deep and short bodied group. The remaining species possessed intermediate morphology and could not be differentiated using these characters.
The results show that meristic characters with coded variables are more effective than morphometric characters for discriminating the 15 Puntius species. Identifying a Puntius individual at the species level using the key developed here requires only 2 to 6 steps. The steps need to be followed in a precise manner. In general, this key can assist accurate quantification and assessment of the genus Puntius in Sri Lanka and contribute to its long term conservation.

Figure 2: Morphometric characters measured in this study Total length, TL; Standard length, SL; Fork length, FL; Maximum body depth, MBW; Head length, HL; Eye diameter, ED; Distance between pair of nostrils, IND; Inter orbital distance, IOW; Post orbital length, POL; Dorsal fin length, DFL; Pre dorsal length, PDL; Post dorsal length, PODL; Anal fin length, AFL; Pre anal length, PAL; Post anal length, POAL; Pre ventral length, PVL; Post ventral length, POVL; Pre pelvic length, PPL; Post pelvic length, POPL; Caudal fin length, CFL; Width of the caudal fin when fully spread, CSPR; Caudal peduncle height, HCPD; end of the dorsal fin to end of the caudal peduncle length, LCPD

Figure 3: Scatter plot showing individual component scores obtained for PC1 and PC2 for morphometric characters

Table 2b: Coded variables scored on Puntius species. *Characters were quantified as 0, 1 and 2 on a nominal scale and this number was used in the analysis.

Table 5: Variability of characters used in developing the key of 15 Puntius species. min-minimum; max-maximum; std-standard deviation

Table 6: Maximum, minimum, mean and standard deviation of meristic characters of 15 Puntius species. min-minimum; max-maximum; std-standard deviation

Table 7: Descriptive statistics of P. dorsalis (N = 54) and P. bimaculatus (N = 63) (Highlighted values were used in developing the key)

Table 8: Dichotomous key for the separation of 15 Puntius species in Sri Lanka

from 38 different sites in five major rivers covered a broad geographical range and represented the majority of variation present in the characters assessed.

stated that the discrete nature of meristic data contributed to low ability to discriminate among Halobatrachus didactylus populations. The present study focused on discrimination among species and has shown that variation in meristic characters combined