A multilevel Bayesian analysis of university entrance eligibility for selected districts in Sri Lanka : methods and application to educational data

Multilevel data structures are becoming a commonly encountered phenomenon in educational research. This type of data generates a number of statistical problems, of which clustering is particularly important. To solve the problems inherent in these, special statistical techniques are required. This study aimed to determine the factors affecting the university entrance eligibility of students from some selected districts in Sri Lanka, whilst capturing the layered structure of this educational data into pupil and school levels and determining how these layers interact and impact the dependent variable of interest. This study used university entrance eligibility of General Certificate of Education: Advanced Level (G.C.E) (A/L) student records in 3 districts of Sri Lanka. The response variable is university entrance eligibility of students, which is a binary variable. Thus a two level binary logistic model was fitted using the Bayesian Markov Chain Monte Carlo (MCMC) method as this method has some advantages over other classical statistical methods. When determining the eligibility for university entrance, GCE A/L students find Science subjects more competitive than Arts and Commerce subjects. Students with a higher IQ level (as given with the data) and students with higher English ability stand a better chance. The chance is higher for students from national schools compared to provincial and private schools, and girls show more potential than boys. Students studying in English medium have a higher chance while those studying in Tamil medium have a lower chance compared to the students studying in Sinhala medium.


INTRODUCTION
Multilevel data are structures that consist of multiple units of analysis, one nested within the other. Hierarchies are found in many subject areas, and the flexibility of multilevel models is reflected in a variety of applications in the medical, biological and social sciences, including educational and political science. The goal of multilevel analysis is to account for variance in a dependent variable that is measured at the lowest level of analysis by considering information from all levels of analysis. Statistical modelling of multilevel data has been discussed for many years and various developments have been reported (Aitkin & Longford, 1986; Goldstein, 1986; Hedeker & Gibbons, 2003). However, most of the early developments concentrated on continuous response variables. Hence multilevel modelling for discrete categorical responses is a relatively new approach (Goldstein, 1992; Rashbash et al., 2004; Fielding & Yang, 2005). Interest in multilevel models for binary outcome variables is likewise a relatively recent development in sociological analysis, where researchers are often interested in explaining and predicting phenomena that can be characterized by a binary variable. This paper is based on the modelling of a binary response in the presence of a multilevel data structure.
The Sri Lankan education system is becoming increasingly competitive, and as such the Advanced Level (G.C.E) (A/L) examination, which is considered a prerequisite for entering a state university, has also become very competitive. The prime objective of this study was to determine the factors affecting the university entrance eligibility of students from some selected districts in Sri Lanka, whilst capturing the layered structure of this educational data into pupil and school levels and determining how these layers interact and impact on the dependent variable of interest. Multilevel modelling techniques allow the variation in a dependent variable to be assessed at several levels simultaneously, and address how much the university entrance eligibility varies between schools compared with the extent of variation in university entrance eligibility for pupils within schools. Hence they avoid the misleading results due to biased estimates and misleading standard errors that can occur when the clustered structure of the population is ignored. Bayesian methods are used in the fitting of the multilevel model because of their advantages, which will be explained later. The data for this study were provided by the University Grants Commission of Sri Lanka. All the students who sat for the GCE A/L in the districts of Colombo, Gampaha and Moneragala in the years 2006 and 2007 were considered. However, before the analysis could begin, the original dataset had to be modified to deal with the modelling difficulties and the convergence problems encountered with the MLwiN 2.19 software used in this study. After a heavy data manipulation process, the initial dataset was reduced to 41997 student records within 90 schools. Table 1 gives the variables and their respective categories for each level in this study.

Revised: 25 June 2013; Accepted: 19 July 2013

REVIEW OF LITERATURE
Hierarchical data modelling attracted great interest in the mid 1980s, after the influential work of Aitkin et al. (1981) showed that aggregating over individual observations may lead to misleading results because it violates the strong assumption of independence. This prompted many researchers to develop systematic approaches to deal with this type of data. The schooling system presents an obvious example of a hierarchical structure, so it is very important to account for the multilevel structure of the data when measuring educational performance, in order to explore the hierarchy of pupils, classes, schools and sometimes also local education authorities. The existence of such hierarchies cannot be ignored since, once groupings are established, even if at random, the groups tend to differentiate (Steedman, 1983). Ignoring this relationship risks overlooking the importance of group effects and may even render invalid many of the traditional statistical methods used to study school effectiveness and the quality of school systems (Goldstein, 1995). Tinkalin (2005) presented the relationships among social background, gender, attainment and a range of other factors in a secondary analysis of the Scottish School Leavers Survey. Multilevel modelling was used as it allows an assessment of the relationships between different factors and high attainment, as well as an assessment of the variation between schools in their ability to produce high attainers when other potentially influential factors are held constant. Furthermore, initial multilevel techniques were mainly confined to continuous responses; the early 1990s, however, saw the extension of multilevel theory and its implementation in software to handle different types of outcomes such as binary, nominal multi-categorical and ordered categorical responses. Notable methods among these were an improved second order approximation proposed by Goldstein (1995), Gauss-Hermite quadrature approximations to maximum likelihood proposed by Hedeker and Gibbons (1994) and Pinheiro and Bates (1995), and a higher order Laplace approximation proposed by Raudenbush et al. (2000). Browne (1998) discussed the varied approaches for fitting multilevel models. With the availability of powerful, high speed computers, Bayesian methods have become computationally feasible through the development of Markov Chain Monte Carlo (MCMC) methods, especially Gibbs sampling (Zeger & Karim, 1991). The advantage of this method of estimation is that, in small samples, it takes account of the uncertainty associated with the estimates of the random parameters and can provide exact measures of uncertainty. Maximum likelihood methods tend to overestimate precision because they ignore this uncertainty. In small samples this is especially important when obtaining 'posterior' estimates for residuals (Goldstein, 2003). These improvements, together with the development of more sophisticated software such as MLwiN and STATA, have brought multilevel modelling to a new phase of application.

Univariate analysis using the Zhang and Boos test
Prior to fitting any statistical model, it is important to test the nature and the strength of the relationships between the explanatory variables and the response. Several methods for assessing these relationships exist, and they vary according to the nature of the variables in question. However, as noted previously, most traditional techniques for assessing the relationship among categorical variables are not flexible enough to handle stratified data in a multilevel situation. Therefore, traditional univariate techniques such as chi-squared tests will not suffice in the present scenario, and a more suitable technique for assessing the relationship among the variables had to be identified. The generalized Cochran-Mantel-Haenszel tests for correlated categorical data (Zhang & Boos, 1997) provide three different test statistics, which can be used in place of the traditional chi-squared test for testing associations between stratified categorical variables. That work presented a detailed discussion of the suitability of each of the statistics ($T_P$, $T_U$ and $T_{EL}$). According to the simulation study conducted by Zhang and Boos (1997), $T_P$ was found to be the most suitable statistic for unequal strata sizes, as is the case in this study.

March 2014
Journal of the National Science Foundation of Sri Lanka 42(1)

Multilevel model for a binary response variable
This section presents the theory behind multilevel models that have a binary response. In many situations the response variable is not continuous but binary. For example, the interest may be in whether or not a student is university eligible, giving a response variable coded 1 = eligible, 0 = not eligible.
Similarly, one could be interested in the variation in university eligibility between schools. For example, university eligibility may be associated with a student's own characteristics and/or with the characteristics of the school they attended.

Structure of the model
A general (one level) logistic model for binary response data is given by Agresti (1990).
Consider a two-level logistic regression model (Goldstein, 2003) with binary response $Y_{ij}$, which equals 1 if student $i$ in school $j$ was university eligible and 0 if not. Let $\pi_{ij}$ be the probability of student $i$ in school $j$ being eligible for university entry, that is, $\pi_{ij} = \Pr(Y_{ij} = 1)$.
We begin with a random intercept or variance components model that allows the overall probability of university eligibility to vary across schools. If we have a single explanatory variable, $x_{ij}$, measured at the student level, then the two-level random intercept model is as follows:

$$\operatorname{logit}(\pi_{ij}) = \beta_{0j} + \beta_1 x_{ij}, \qquad \beta_{0j} = \beta_0 + u_{0j}, \qquad u_{0j} \sim N(0, \sigma_u^2) \qquad (1)$$

For a random intercept model for a binary response, the intercept consists of two terms: a fixed component $\beta_0$ and a school-specific component, the random effect $u_{0j}$. Here it is assumed that the $u_{0j}$ follow a normal distribution with mean zero and variance $\sigma_u^2$ (Rashbash et al., 2004).
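To make model (1) concrete, the following minimal sketch simulates records from a two-level random intercept logistic model. The parameter values, the numbers of schools and pupils, the binary covariate and the function names are illustrative assumptions, not quantities taken from this study.

```python
import math
import random


def inv_logit(eta):
    """Map a linear predictor on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def simulate_two_level(n_schools, n_pupils, beta0, beta1, sigma_u, seed=0):
    """Simulate (school, x, y) records from
    logit(pi_ij) = beta0 + u_0j + beta1 * x_ij,  u_0j ~ N(0, sigma_u^2).

    All argument values are hypothetical; x_ij is an illustrative
    binary pupil-level covariate.
    """
    rng = random.Random(seed)
    records = []
    for j in range(n_schools):
        u0j = rng.gauss(0.0, sigma_u)          # school-specific random effect
        for _ in range(n_pupils):
            x = rng.randint(0, 1)              # pupil-level covariate
            p = inv_logit(beta0 + u0j + beta1 * x)
            y = 1 if rng.random() < p else 0   # 1 = eligible, 0 = not eligible
            records.append((j, x, y))
    return records


# Hypothetical settings, for illustration only.
data = simulate_two_level(n_schools=5, n_pupils=20,
                          beta0=-0.5, beta1=0.8, sigma_u=0.4)
```

Pupils in the same school share one draw of $u_{0j}$, which is what induces the within-school correlation that a single-level logistic model would ignore.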

Fitting the model
In fitting model (1), the model parameters were first estimated using the iterative generalized least squares (IGLS) method, followed by the Markov Chain Monte Carlo (MCMC) estimation method. The IGLS method uses either marginal quasi-likelihood (MQL) or penalized quasi-likelihood (PQL) approximations to fit binary response models. The PQL method produces more accurate estimates than the MQL method (Rashbash et al., 2004); therefore PQL (II), which denotes the PQL method with a second order Taylor series approximation, was used as the estimation procedure. A model is estimated by MCMC after setting its monitoring chain length and thinning to the required amounts (Gelman & Rubin, 1992).
The Bayesian approach to statistics can be thought of as a sequential learning approach in which prior beliefs are combined with the data collected to produce posterior beliefs about a problem. The posterior beliefs then act as prior knowledge in the next step, and values are simulated from the joint posterior distribution using a sampler such as the Gibbs sampler. For example, for an unknown parameter $\theta$, the prior beliefs are condensed into a prior distribution $p(\theta)$, and the collected data $y$ (with a distributional assumption) produce a likelihood function $L(y \mid \theta)$, which is the function that maximum likelihood methods maximize. The prior distribution and the likelihood function are then combined to produce the posterior distribution for $\theta$, $p(\theta \mid y) \propto p(\theta) L(y \mid \theta)$, which can be used to reach inferences about $\theta$. However, calculating the proportionality constant in this approach is often difficult. MCMC methods circumvent this problem because they do not calculate the exact form of the posterior distribution but instead produce simulated draws from it.
MCMC methods are more general in that they can be used to fit many statistical models. MCMC is a simulation based procedure: rather than simply producing point estimates, the methods are run for many iterations, and at each iteration an estimate for each unknown parameter is produced. These estimates are not independent, as at each iteration the estimates from the last iteration are used to produce new estimates. To choose the starting values, IGLS or RIGLS is run before the MCMC estimation; the process is then repeated many times, using the previously generated set of parameter values to generate the next set. The chain of values generated by this sampling procedure is known as a Markov chain, as every new value generated for a parameter depends on its previous values only through the last value generated. The field of MCMC convergence diagnostics is concerned with determining when a chain has converged to its equilibrium distribution. In MLwiN the burn-in period is set to 500 iterations by default (Browne, 2012). MLwiN 2.19 uses the Metropolis Gibbs hybrid estimation method with univariate updates (Browne, 1998). This is briefly described here. By extending the model defined in equation (3) to include several explanatory variables, the form of the model for the $i$th individual in the $j$th cluster can be expressed as given in equation (4). Here $l$ indexes the explanatory variables.
The joint posterior distribution of the $\beta_l$'s is then given by a product in which the first part is the joint prior of the $\beta_l$'s and the second part is the likelihood function of the data. Here $\pi_{ij}$ can be substituted from the model. This product is proportional to the marginal posterior distribution of $\beta_l$ conditional on the data.
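A form consistent with this description can be written down under two assumptions: each response is treated as a Bernoulli outcome, and the linear predictor is the several-covariate extension of model (1). The symbol $n_j$ (the number of pupils in school $j$) is introduced here for illustration only.

```latex
p(\beta \mid y, u_0) \;\propto\; p(\beta)\,
  \prod_{j=1}^{k}\prod_{i=1}^{n_j}
  \pi_{ij}^{\,y_{ij}}\,(1-\pi_{ij})^{\,1-y_{ij}},
\qquad
\operatorname{logit}(\pi_{ij}) = \beta_0 + \sum_{l}\beta_l x_{lij} + u_{0j}.
```

The first factor is the joint prior of the coefficients and the double product is the likelihood, matching the two parts named in the text.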

(b) Second level residuals, $u_{0j}$
$u_{0j} \sim N(0, \sigma_u^2)$, and let the vector containing $u_{0j}$ for $j = 1, \ldots, k$ be denoted by $u_0$. Here $k$ is the number of second level units (schools).
The marginal posterior distribution of $u_{0j}$ is then proportional to the product of this normal prior and the likelihood contribution of school $j$. Browne (1998) considers a model with several random effect terms, resulting in a matrix $V$ consisting of many second level variance terms, and takes a general inverse Wishart prior for $V$, that is, $V \sim IW$. However, our model contains only one random effect, $u_{0j}$, resulting in a single second level variance term, $\sigma_u^2$; the corresponding prior is defined in terms of $\chi^2(\nu)$, the chi-square distribution with $\nu$ degrees of freedom, where $\nu = k - 2$ and $k$ is the number of second level units (schools).
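The univariate update of a school residual can be sketched as a generic random-walk Metropolis step on the conditional posterior (normal prior times the Bernoulli likelihood of that school's pupils). This is an illustrative sketch and not MLwiN's implementation: the proposal standard deviation, the data values and all function names are assumptions.

```python
import math
import random


def log_cond_posterior_u(u, ys, etas_fixed, sigma2_u):
    """Log of (N(0, sigma2_u) prior) x (Bernoulli likelihood of one school),
    up to an additive constant.

    ys: 0/1 responses of the pupils in the school
    etas_fixed: fixed-part linear predictors for the same pupils
    """
    log_prior = -0.5 * u * u / sigma2_u
    log_lik = 0.0
    for y, eta in zip(ys, etas_fixed):
        lin = eta + u
        # y*log(pi) + (1-y)*log(1-pi) with logit(pi) = lin
        log_lik += y * lin - math.log1p(math.exp(lin))
    return log_prior + log_lik


def metropolis_step_u(u, ys, etas_fixed, sigma2_u, step, rng):
    """One random-walk Metropolis update of a school residual."""
    proposal = u + rng.gauss(0.0, step)
    log_ratio = (log_cond_posterior_u(proposal, ys, etas_fixed, sigma2_u)
                 - log_cond_posterior_u(u, ys, etas_fixed, sigma2_u))
    # 1 - random() lies in (0, 1], so the logarithm is always defined.
    if math.log(1.0 - rng.random()) < log_ratio:
        return proposal
    return u


# Hypothetical data for a single school, for illustration only.
rng = random.Random(1)
ys = [1, 0, 1, 1, 0, 1]
etas = [0.2, -0.1, 0.4, 0.0, -0.3, 0.1]
u, draws = 0.0, []
for _ in range(2000):
    u = metropolis_step_u(u, ys, etas, sigma2_u=0.25, step=0.5, rng=rng)
    draws.append(u)
kept = draws[500:]                     # discard a burn-in of 500 iterations
posterior_mean_u = sum(kept) / len(kept)
```

Only the ratio of posterior densities enters the acceptance rule, which is why the proportionality constant never has to be computed.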

Variable Selection
Although many techniques are available for selecting variables for the model, in this study the Bayesian variable selection method was used together with the Wald statistic (Agresti, 1990; Polit, 1996). While Bayesian approaches have been known in ecology and the environmental sciences for some time, applying them was virtually impossible until recently. The advent of cheap computing has fostered the development of algorithms that provide precise numerical approximations for most problems, making the routine application of Bayes theorem a practical option.
In order to determine the best model, a forward selection procedure was adopted. At each stage the Wald statistic was calculated together with the deviance information criterion (DIC) value, to assess the significance of the added factors and to evaluate the fit of the model, respectively. Since the estimation technique used here is not maximum likelihood, the well-known likelihood ratio statistic is not applicable in this scenario. Several criteria have been proposed for use in model comparison and selection. Many of them have a component that quantifies the goodness of model fit, along with a component that penalizes model complexity; examples are the Akaike information criterion (AIC), the Bayesian information criterion (BIC) and the deviance information criterion (DIC). In the context of a Bayesian hierarchical model, the number of independent parameters included in the model is difficult to determine, which makes the use of AIC or BIC problematic. DIC has been proposed for model comparison in this context. As with the other criteria, a lower value of DIC is preferred over a higher value. Spiegelhalter et al. (2002) have offered guidelines for using DIC to compare competing models.
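As a rough illustration of how DIC is obtained from MCMC output, the sketch below uses the definition of Spiegelhalter et al. (2002): the posterior mean deviance plus the effective number of parameters $p_D$, where $p_D$ is the mean deviance minus the deviance at the posterior mean of the parameters. For brevity it assumes a single-level logistic model with one covariate and omits the random effects; the function names and data are hypothetical.

```python
import math


def fitted_probs(beta, xs):
    """Fitted probabilities for logit(pi) = b0 + b1 * x."""
    b0, b1 = beta
    return [1.0 / (1.0 + math.exp(-(b0 + b1 * x))) for x in xs]


def deviance(ys, probs):
    """Bernoulli deviance: -2 * log-likelihood."""
    return -2.0 * sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(ys, probs)
    )


def dic(ys, xs, beta_draws):
    """Return (DIC, pD) from a list of (b0, b1) MCMC draws."""
    devs = [deviance(ys, fitted_probs(b, xs)) for b in beta_draws]
    mean_dev = sum(devs) / len(devs)                        # D-bar
    beta_bar = tuple(sum(c) / len(beta_draws) for c in zip(*beta_draws))
    dev_at_mean = deviance(ys, fitted_probs(beta_bar, xs))  # D(theta-bar)
    p_d = mean_dev - dev_at_mean        # effective number of parameters
    return mean_dev + p_d, p_d


# Hypothetical responses, covariate values and draws, for illustration only.
ys = [1, 0, 1, 1, 0]
xs = [1, 0, 1, 0, 0]
draws = [(-0.4, 1.1), (-0.6, 0.9), (-0.5, 1.3), (-0.3, 1.0)]
dic_value, p_d = dic(ys, xs, draws)
```

Because $p_D$ is estimated from the draws themselves, no count of independent parameters is needed, which is the property that makes DIC usable for hierarchical models.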

Parameter interpretation for a 2 level model with cross level interaction terms
Cross level interaction refers to the interaction between higher level and lower level variables, that is, the modification of the effects of lower level variables by characteristics of the higher level units to which the lower level units belong (or vice versa). For example, if the relation between a particular student's level of IQ and English proficiency differs by school characteristics (that is, school and individual level variables interact), there is said to be a cross level interaction. In multilevel models, whenever group specific estimates of the effect of a lower level variable are modelled as a function of higher level variables, a cross level interaction appears in the final model. However, when the numbers of variables at the different levels are large, there is a vast number of possible cross-level interactions. The odds ratios associated with cross level interactions can be calculated as any other odds ratio (Agresti, 1990).
Since the residual analyses do not differ for different multilevel models, the theory associated with them will be presented for the most basic multilevel model with a continuous response (Rashbash et al., 2004).

RESULTS
As explained previously, the dataset in this study takes a hierarchical form, with students clustered within schools. 'Schools' can be considered as the stratification factor according to which the students are clustered. The response variable, termed 'Eligibility', is a binary variable indicating whether or not a particular student is eligible. The dataset contains six explanatory variables at the student level, namely 'Gender', 'Subject Stream', 'IQ Score', 'English Grade', 'Year' and 'Medium', and two at the school level, 'School Class Setting' and 'School Sector'. All eight variables are categorical, with English grade and IQ score being ordinal categorical. Although originally there were 74755 students in the dataset, after removing observations with missing values it was reduced to 41997.

Univariate analysis for identifying student level factor impact on the response
The univariate analysis was carried out for the student level variables with the intention of identifying the effect of the explanatory variables on the response. Since the dataset in this study takes a hierarchical form, the generalized Cochran-Mantel-Haenszel test for correlated categorical data was used in place of the traditional chi-squared techniques, using schools as the stratification factor. The purpose of doing a univariate analysis prior to the model fitting is to gain important information about the variables to be included in the model. The adjusted univariate technique used is based on the $T_P$ statistic proposed by Zhang and Boos (1997). The required calculations were carried out using the R-macro proposed by De Silva and Sooriyarachchi (2012). Table 2 presents the results of the adjusted univariate test carried out to check the significance of the student level covariates with school as the stratification factor.
According to the results in Table 2, all the student level variables significantly affect the response variable at the 5 % level of significance. Of these, the highest significance is observed for the factors Stream and IQ Score.

Fitting a two level random intercept model
Before applying the modelling techniques, some variables were re-categorized in order to deal with the sparseness of the data, which gave rise to zero observations. In the modelling phase all factors were considered, as all were significant in the univariate phase. MLwiN version 2.19 was used for model fitting; by default it sets the level coded with the lowest value as the base category and assigns zero to the coefficient of the base category. Each student level factor was fitted separately, and the model was first estimated by the IGLS; PQL (II) method, followed by the MCMC method, to obtain the Wald statistic and the DIC value, respectively. The Wald statistic was calculated for each parameter coefficient, and the p values of the statistic were compared with the 5 % significance level in order to assess the significance of the coefficient. The starting variable was thus decided based on both the Wald statistic and the DIC value. Since the factors have unequal numbers of categories, using only the Wald statistic in factor selection is difficult; hence an additional tool is needed to select the most significant factor at each stage. The procedure is continued by adding a second student level term to the factor selected above, and so on, until the lowest DIC value is attained. It should be noted that, according to the forward selection framework, once a variable is selected into the model it is not removed at any later stage. After selecting the student level variables that should be included in the model, the next step is to focus on the school level factors that should be added to the selected model. A similar procedure is carried out in selecting the school level factors. The model that results in a lower DIC value together with satisfactory Wald statistics is selected as the best model. According to the significance levels of the coefficients and the DIC values, the addition of school level variables to the model with student level variables resulted in a slight increase in the DIC value. The Wald test also showed several levels of those school level variables to be non-significant at the 5 % level of significance. Then, for the selected model, cross level interaction terms were added separately using the forward selection criteria. According to the significance levels of the coefficients and the DIC values, the addition of cross level interaction terms resulted in a decreased DIC value. The final model includes all the student level factors, namely stream, IQ score, English grade, medium, gender and year, and cross level interactions between stream and class setting, IQ and school sector, and IQ and class setting. Table 3 gives the parameter estimates, standard errors of the estimates and p values associated with the parameters for the fitted model. Also given are the DIC and the parameter estimates and standard errors of the estimates of the fixed and random coefficients.
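The Wald check applied to each coefficient can be sketched as follows. The sketch assumes a single coefficient tested against zero, with the squared ratio of estimate to standard error referred to a chi-square distribution with one degree of freedom; the function name and the numerical values are hypothetical and are not taken from Table 3.

```python
import math


def wald_test(estimate, std_error):
    """Return (Wald statistic, two-sided p value) for H0: coefficient = 0.

    The statistic (estimate / std_error)^2 is compared with a chi-square
    distribution with 1 degree of freedom, which is equivalent to a
    two-sided standard normal test on estimate / std_error.
    """
    z = estimate / std_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z * z, p_value


# Hypothetical coefficient and standard error, for illustration only.
statistic, p_value = wald_test(estimate=0.45, std_error=0.18)
significant_at_5_percent = p_value < 0.05
```

A factor with several categories yields one such test per non-base category, which is why the Wald statistic alone does not rank factors with unequal numbers of categories and a model-level criterion such as DIC is used alongside it.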

School level variance component analysis -for the model with interactions
In order to justify fitting a multilevel model, it is advisable to first look at the significance of the school level variance component. This can be checked by the following hypothesis.
$H_0$: School level residual variance is zero ($\sigma_u^2 = 0$)
According to the 2.5th percentile value of 0.087 and the 97.5th percentile value of 0.162 for the school level variance, the interval does not include the value zero; thus we reject $H_0$ at the 5 % level of significance and conclude that the school-level residual variance component is not equal to zero. Thus, the use of a two level model taking school as the second level is justifiable.
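The percentile interval used above can be computed directly from the monitored MCMC draws of the variance. The following sketch assumes linear interpolation between order statistics; the draws shown are hypothetical and are not the chain from this study.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of an ascending list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def credible_interval(draws, level=95.0):
    """Equal-tailed interval from MCMC draws, e.g. the 2.5th and 97.5th
    percentiles for level = 95."""
    s = sorted(draws)
    tail = (100.0 - level) / 2.0
    return percentile(s, tail), percentile(s, 100.0 - tail)


# Hypothetical draws of the school-level variance, for illustration only.
variance_draws = [0.09, 0.11, 0.12, 0.10, 0.14, 0.13, 0.12, 0.15, 0.11, 0.12]
lower, upper = credible_interval(variance_draws)
interval_excludes_zero = lower > 0.0
```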

Residual analysis and classification power of the final model
In order to evaluate the fit of the model, the school level residuals obtained for the final fitted model were analyzed using two graphical tools, namely the caterpillar plot and the normal plot.

Graphical techniques in checking model adequacy:
According to the caterpillar plot in Figure 1, most school level residuals contain zero within their 95 % confidence interval. This implies that these schools do not show significant differences in university eligibility from the overall mean predicted by the fixed part of the model. However, some schools have residuals that show large positive deviations and others have residuals that show large negative deviations. It can therefore be concluded that these schools have the largest effect on the response. The normal probability plot in Figure 2 shows an approximately straight line, indicating that the residuals are approximately normal.

Distributional Assumptions: Anderson Darling normality test results
In order to confirm the results of Figure 2, an Anderson Darling normality test was carried out.
The Anderson Darling test for the estimated school-level residuals tests the hypotheses
$H_0$: The data are normally distributed
$H_1$: The data are not normally distributed
If the p value obtained for the Anderson Darling statistic is less than 0.05, we reject $H_0$ at the 5 % level of significance and conclude that the data are not normally distributed. When the Anderson Darling test was carried out, a p value of 0.702 was obtained. Hence we do not reject $H_0$ at the 5 % level of significance, and conclude that the school-level residuals in the fitted model with interactions are consistent with a normal distribution.
Binomial assumption of the data: When the response is the number of times an event occurs out of a fixed number of 'trials', the distribution is typically taken to be binomial. Thus in this study the logistic model was used. The assumption of the binomial distribution can be evaluated using the 'extra binomial' variation (Goldstein, 2003). As a rule of thumb, if the extra binomial parameter (EBP) is close to 1, the model exhibits no over or under dispersion. This model gives an EBP value of 0.907, which is close to 1, and thus it can be concluded that the binomial assumption is valid.

Interpretation and calculation of the parameter estimates
Having fitted the final model it is of importance to interpret the results obtained.
Consider first the binary response, Qual, where $\text{Qual} \sim \text{Binomial}(\text{denom}, \pi_{ij})$ and

$$\text{Qual} = \begin{cases} 1, & \text{if a student is eligible} \\ 0, & \text{if a student is not eligible} \end{cases}$$

To interpret a particular parameter of interest, the odds ratios are calculated and presented in Table 5. The following section discusses the impact on the response variable of all significant terms in the final fitted model.
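The conversion from fitted coefficients to odds ratios can be sketched as below. An odds ratio is the exponential of the relevant log-odds difference; when a cross level interaction is involved, that difference is the sum of the main effect and the interaction coefficient. The function names and coefficient values are hypothetical and are not taken from Table 3 or Table 5.

```python
import math


def odds_ratio(*coefficients):
    """Odds ratio for a contrast whose log-odds difference is the sum of
    the given coefficients (main effect plus any interaction terms)."""
    return math.exp(sum(coefficients))


def odds_ratio_interval(estimate, std_error, z=1.96):
    """Approximate 95 % interval for exp(estimate), built on the logit scale."""
    return (math.exp(estimate - z * std_error),
            math.exp(estimate + z * std_error))


# Hypothetical coefficients, for illustration only.
beta_stream = 0.90          # main effect of a stream category
beta_stream_by_school = 0.35  # its interaction with a school-type category
or_base_school = odds_ratio(beta_stream)
or_other_school = odds_ratio(beta_stream, beta_stream_by_school)
```

With an interaction in the model, the odds ratio for a stream therefore differs by school type, which is why Table 5 reports separate values for girls', boys' and mixed schools.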
The results of Table 5 indicate the following important conclusions for the 3 districts studied. Students following Commerce in girls' schools have 11 times the odds of entering a university compared with Bio Science students from girls' schools, while students following the Arts stream in girls' schools have 6 times the odds of entering a state university compared with Bio Science students from girls' schools. However, students following Combined Mathematics in girls' schools have lower odds (0.69 times) of entering a university than those who followed Bio Science in girls' schools. Students following Commerce and Arts in boys' schools have 3.5 and 1.5 times the odds, respectively, of being university eligible compared with students following Bio Science in boys' schools. Students following Combined Mathematics in boys' schools have lower odds (0.56 times) of entering a university than those who followed Bio Science in boys' schools. These results may arise because both female and male students find Science subjects more competitive and challenging than Commerce and Arts subjects. Moreover, students from girls' schools following Combined Mathematics or Arts have lower odds (0.57 and 0.6 times, respectively) of being eligible to enter a university than those from mixed schools. Students from boys' schools following Combined Mathematics have 1.5 times the odds of being eligible to enter a university compared with those from mixed schools, whereas students from boys' schools following Arts have lower odds (0.5 times) than those from mixed schools. Furthermore, the odds of eligibility for national schools compared to provincial schools and private schools are higher for all significant IQ score bands. This may be due to national schools receiving more government facilities and support than other types of schools, i.e. A/L examination based model papers and educational seminars, and having more scholarship students. The odds can be quantified as before by using Table 5.
Students in the higher IQ score bands [marks (55-64) and marks (65-74)] in girls' schools are 1.37 times more likely to be eligible to enter a university than those in mixed schools. Students in the IQ score bands of marks 35-54 and 65-74 in boys' schools are 2.4 and 1.4 times, respectively, more likely to be eligible to enter a university than those in mixed schools.
Students in provincial girls' schools, provincial boys' schools, private girls' schools and private boys' schools with higher IQ score bands [marks (35-54), marks (55-64) and marks (65-74)] are more likely to be eligible, in increasing order, than those with an IQ score band of marks 0-34 and those absent for the examination. The odds can be quantified as before by using Table 5.
Accordingly, students who gained A, B, C and S grades in the English examination have respectively 5, 3, 2 and 1.5 times the odds of being eligible to enter a university compared with those who failed (F). This may be because students who are more knowledgeable in English tend to use additional reading materials and publications, so their ability to acquire further knowledge may be reflected in an increased chance of entering a university. Furthermore, students who took their examination in English medium have 1.15 times the odds of being eligible to enter a university compared with students who took their examination in Sinhala medium. However, students who took their examination in Tamil medium have lower odds (0.57 times) of being eligible to enter a university than students who took their examination in Sinhala medium.
Female students have 2.2 times the odds of being eligible to enter a university compared with male students. Students who sat for their Advanced Level examination in the year 2007 have 1.09 times the odds of being eligible for university compared to those who sat in 2006.
Particular difficulties faced during the model fitting involved non-convergence of estimates and the estimated variance converging to negative values. The latter phenomenon is commonly encountered in multilevel modelling, as suggested by Brown and Prescott (2012), but the negative variance problem had a major impact on identifying confidence intervals. The model selection phase also involved identifying significant variables with unequal numbers of levels, so that using only the Wald statistic resulted in indecisive situations. Thus, another measure was needed together with the Wald test. The Bayesian deviance information criterion (DIC) was found to be a satisfactory measure in this case; however, obtaining this statistic required the model to be refitted under the MCMC method, which took considerable time to reach a converged result.
Despite the limitations discussed, the findings of this multilevel study showed promising results, which one should take into consideration to evaluate university entrance eligibility for the selected group of Sri Lankan students.

Further research
The dataset used in this study was restricted to three districts for reasons of confidentiality. It is recommended to apply the techniques described here to a sample of students drawn from all districts in Sri Lanka in order to generalize the findings to the entire Sri Lankan context. This will then be a 3 level study with districts making up the third level.
This research can also be considered as a basis for gaining insight into the need for a sound multilevel goodness-of-fit technique. As discussed in this study, the goodness-of-fit tests available in the multilevel context are mostly basic graphical techniques.

Figure 1. Estimated school-level residuals for the final full model
Figure 2. Normal probability plot of the school-level residuals

Table 1: Description of the data and its abbreviations
** school level (2nd level) variables
* student level (1st level) variables

Similar to any modelling procedure, multilevel modelling also requires a comprehensive residual analysis in order to validate model assumptions as well as the fit of the model. Even though the model specifications are different for different types of response variables, the definition and the analysis of multilevel residuals are common to all models. It is also important to note that the area of residual analysis and diagnostic testing of multilevel models has not been addressed thoroughly by researchers to date. Hence only the available analytical techniques are used for the purpose of validating model assumptions and the fit of the model.

Table 2: Generalized Cochran-Mantel-Haenszel test ($T_P$) results on student level factors with the response
* Significant at 5 % level

Table 3: Final model parameters with interactions
* Significant at 5 % level

Table 4 gives the accuracy according to the fitted model based on classification.

Table 4: Classification table

Table 5: Odds ratios for each of the significant main effects and interaction terms
* Non-significant odds ratios at 5 % level