Estimation of parameters of the 3-component mixture of Pareto distributions using type-I right censoring under Bayesian paradigm

Compared with simple probability models, a mixture model of suitable lifetime distributions may be more capable of capturing the heterogeneity of nature. In this study, a 3-component mixture of Pareto distributions was investigated by considering the type-I right censoring scheme to obtain data from a heterogeneous population. First, considering a Bayesian structure, some mathematical properties of the 3-component mixture of Pareto distributions are discussed. These mathematical properties include Bayes estimators and posterior risks for the unknown component and proportion parameters using the uninformative (uniform and Jeffreys’) and informative (gamma) priors under the squared error loss function (SELF) and the DeGroot loss function (DLF). Then, the performance of the Bayes estimators for different sample sizes and test termination times under different loss functions was examined. In addition, limiting expressions of the Bayes estimators and posterior risks are derived. Finally, the superiority of the Bayes estimators was established through a simulation study and a real life example.


INTRODUCTION
In the current computational age, experts are able to explain estimates and to predict and infer about the complicated structure of interest. In many practical studies, it is observed that the Pareto distribution can be used quite effectively in place of other lifetime distributions. The Pareto distribution is often used for modelling many practical phenomena, including city population sizes and incomes. All et al. (2003) presented the geometrical properties of the Pareto distribution and Ismail (2004) discussed a simple estimator for the shape parameter of the Pareto distribution. Sankaran and Nair (2005) studied it from the income analysis perspective. Nadarajah and Kotz (2005) discussed the information matrix for a mixture of two Pareto distributions. The importance of the Pareto distribution in modelling different real life phenomena is evident from the above mentioned studies.
In different practical applications, many types of data, including simple and grouped data, censored data, progressively censored data and record values, are analysed. Censoring is an important and valuable aspect of lifetime data. Due to time and cost constraints, it is often impossible to continue testing until the last observation fails, so observations exceeding the test termination time are taken as censored observations. A valuable account of censoring has been given by Romeu (2004), Gijbels (2010) and many others.
Mixture models have proved to be of considerable interest both in terms of their methodological development and practical applications. Mixture models play a dynamic role in many practical applications. According to Mendenhall and Hader (1958), for practical purposes the engineer may divide the failures of a system or a device into two or more different types of causes. In order to know the proportion of failures due to a certain cause and to improve the manufacturing process, Acheson and McElwee (1952) divided electronic tube failures into gaseous defects, mechanical defects and normal deterioration of the cathode.

September 2016
Journal of the National Science Foundation of Sri Lanka 44 (3)
Saleem et al. (2010) presented the Bayesian analysis of a 2-component mixture of the power distribution using complete and censored samples. Kazmi et al. (2012) described the Bayesian analysis for a 2-component mixture of the Maxwell distributions. Ali et al. (2013) compared informative priors for the scale parameter of a mixture of the Laplace distribution under different loss functions. Feroze and Aslam (2014) presented the Bayesian estimation procedure for analysing lifetime data under doubly censored sampling using a 2-component mixture of the Weibull distribution.
Motivated by the above mentioned applications of mixture models, the current study considers Bayesian estimation of the parameters of a 3-component mixture of Pareto distributions. All the parameters of the mixture are assumed unknown. The Bayesian analysis was performed by considering different priors and loss functions using direct application of mixture models. In addition, ordinary type-I right censoring was applied.

3-Component mixture of Pareto distributions
A random variable $Y$ is said to follow a finite mixture distribution with $q$ components if the density function of $Y$ can be written in the form $f(y) = \sum_{l=1}^{q} p_l f_l(y)$, where $f_l$ is the density of the $l$th component and the mixing proportions satisfy $0 < p_l < 1$ and $\sum_{l=1}^{q} p_l = 1$.
An engineering system is composed of different subsystems, which may be homogeneous or heterogeneous. Single probability models are not capable of capturing the heterogeneity of such systems. However, this heterogeneity can be captured through mixture models.
Mixture models of suitable probability distributions are receiving attention because, when a population is supposed to comprise a number of subpopulations mixed in an unknown proportion, the commonly available distributions are no longer relevant. For example, a population of the lifetimes of certain electrical elements or medicines may be divided into a number of subpopulations depending upon the possible cause of failure.
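The mixture density defined above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes each component is a unit-scale Pareto distribution with density $\alpha_l y^{-(\alpha_l+1)}$ for $y \ge 1$ (the symbol $\alpha_l$ and the unit scale are assumptions, chosen to match the support implied by the transformation $y = \exp(x)$ used later), and the parameter values are purely illustrative.

```python
import numpy as np

def pareto_pdf(y, alpha):
    """Unit-scale Pareto density alpha * y**-(alpha + 1) on y >= 1 (assumed form)."""
    y = np.asarray(y, dtype=float)
    safe_y = np.maximum(y, 1.0)  # avoids invalid powers below the support
    return np.where(y >= 1.0, alpha * safe_y ** (-(alpha + 1.0)), 0.0)

def pareto_cdf(y, alpha):
    """Unit-scale Pareto distribution function 1 - y**-alpha on y >= 1."""
    y = np.asarray(y, dtype=float)
    return 1.0 - np.maximum(y, 1.0) ** (-alpha)

def mixture_pdf(y, alphas, p1, p2):
    """3-component mixture density; the third proportion is 1 - p1 - p2."""
    weights = (p1, p2, 1.0 - p1 - p2)
    return sum(w * pareto_pdf(y, a) for w, a in zip(weights, alphas))

def mixture_cdf(y, alphas, p1, p2):
    """3-component mixture distribution function."""
    weights = (p1, p2, 1.0 - p1 - p2)
    return sum(w * pareto_cdf(y, a) for w, a in zip(weights, alphas))

alphas = (1.0, 2.0, 3.0)  # illustrative component parameters, not from the paper
print(float(mixture_pdf(1.0, alphas, 0.5, 0.3)))
print(float(mixture_cdf(10.0, alphas, 0.5, 0.3)))
```

Only two proportions are passed because the third is determined by the constraint that the proportions sum to one.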
The use of mixture models in situations where data are available only from the overall mixture distribution is known as direct application of mixture models. Direct applications of mixture models are found mostly in medicine, botany, zoology, paleoanthropology, agriculture, economics, life testing, reliability and survival analysis. Li (1983) and Li and Sedransk (1988) discussed different features of two classes of mixture models, namely type-I and type-II mixture models. A mixture of probability density functions from the same (different) family is known as a type-I (type-II) mixture model. Mixture models have been successfully applied in many areas such as engineering and the physical, chemical and biological sciences. Several authors have applied mixture modelling in practice, for example using mixture distributions to model crime and justice data. Kanji (1985) described wind shear data using mixture distributions, while Jones and McLachlan (1990) applied a mixture of a normal distribution and a Laplace distribution to model wind shear data.
Most researchers have focused on the Bayesian and classical analysis of 2-component mixture models. Rider (1961) used the method of moments to obtain the estimates of parameters of a mixture of two exponential distributions. Sinha (1998) used the Bayesian approach to estimate the parameters of the 2-component mixture model considered by Mendenhall and Hader (1958).
Under type-I right censoring, $r = r_1 + r_2 + r_3$ observations are uncensored and the remaining $n - r$ observations are censored, giving no information as to which subpopulation they belong. Let $y_{lk}$, $0 < y_{lk} \le t$, be the failure time of the $k$th ($k = 1, 2, \ldots, r_l$) unit belonging to the $l$th ($l = 1, 2, 3$) subpopulation. The likelihood function for the 3-component mixture model can be written as: $L \propto \left\{ \prod_{l=1}^{3} \prod_{k=1}^{r_l} p_l f_l(y_{lk}) \right\} [1 - F(t)]^{n-r}$, where $p_3 = 1 - p_1 - p_2$, $f_l$ is the density of the $l$th component and $F$ is the distribution function of the mixture.
Situations exist where no prior information on the parameter of interest is available. In such situations, one has to use an uninformative prior distribution. Jeffreys (1946) suggested a method based on the square root of the Fisher information to determine an uninformative prior. Later on, Geisser (1984) proposed some techniques to determine an uninformative prior. Bernardo (1979) argued that an uninformative prior should be regarded as a reference prior, i.e., a prior that is convenient to use as a standard when analysing statistical data. The most commonly used uninformative priors are the uniform prior (UP) and the Jeffreys' prior (JP). Both priors are used only when no formal prior information is available. That is why, in this study, we assumed the uniform and Jeffreys' priors as the uninformative prior distributions.
The use of an informative prior along with the sample information is usually thought of as updating the current information, which helps in reducing the posterior risks of the Bayes estimators. Prior information on the component parameters was assumed to be available in the form of gamma distributions, while a bivariate beta prior distribution was assumed for the mixing proportions.
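To make the type-I censored sampling scheme of this section concrete, the log-likelihood can be sketched as follows. This is an illustration under stated assumptions rather than the paper's own expression: the components are taken to be unit-scale Pareto densities $\alpha_l y^{-(\alpha_l+1)}$, $y \ge 1$, and the function name `censored_mixture_loglik` and its argument layout are hypothetical.

```python
import numpy as np

def censored_mixture_loglik(params, y1, y2, y3, n, t):
    """Log-likelihood of a 3-component Pareto mixture under type-I right censoring.

    params = (alpha1, alpha2, alpha3, p1, p2); y1, y2, y3 hold the observed
    failure times of each subpopulation; n is the number of units on test and
    t is the test termination time. Unit-scale Pareto components are assumed.
    """
    a1, a2, a3, p1, p2 = params
    alphas = (a1, a2, a3)
    props = (p1, p2, 1.0 - p1 - p2)
    if min(alphas) <= 0 or min(props) <= 0:
        return -np.inf  # outside the parameter space

    r = len(y1) + len(y2) + len(y3)  # number of uncensored observations
    loglik = 0.0
    for alpha, p, y in zip(alphas, props, (y1, y2, y3)):
        y = np.asarray(y, dtype=float)
        # each failure contributes log(p_l) + log(alpha_l) - (alpha_l + 1) log(y)
        loglik += y.size * (np.log(p) + np.log(alpha)) - (alpha + 1.0) * np.log(y).sum()

    # each censored unit contributes the log mixture survival probability at t
    survival_t = sum(p * t ** (-alpha) for alpha, p in zip(alphas, props))
    loglik += (n - r) * np.log(survival_t)
    return loglik

print(censored_mixture_loglik((1.0, 2.0, 3.0, 0.5, 0.3), [2.0], [3.0], [4.0], n=5, t=5.0))
```

The censored units enter only through the mixture survival function at $t$, which reflects the fact that they carry no information about subpopulation membership.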

Posterior distribution using the uninformative priors
In this study, the improper UP, which is proportional to a constant, and the JP were each assumed for the unknown component parameters.
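As a worked illustration of the square-root-of-Fisher-information rule mentioned earlier, the Jeffreys' prior for a single Pareto component can be derived in a few lines. The density form and the symbol $\alpha$ are assumptions made for this sketch (unit-scale Pareto with shape $\alpha$); the paper's own expression may be written differently.

```latex
\begin{align*}
f(y \mid \alpha) &= \alpha\, y^{-(\alpha+1)}, \qquad y \ge 1,\ \alpha > 0
  \quad \text{(assumed component density)} \\
\ln f(y \mid \alpha) &= \ln \alpha - (\alpha + 1) \ln y \\
\frac{\partial^2 \ln f(y \mid \alpha)}{\partial \alpha^2} &= -\frac{1}{\alpha^2} \\
I(\alpha) &= -E\!\left[\frac{\partial^2 \ln f(Y \mid \alpha)}{\partial \alpha^2}\right] = \frac{1}{\alpha^2} \\
\pi_J(\alpha) &\propto \sqrt{I(\alpha)} = \frac{1}{\alpha}
\end{align*}
```

Under this assumption the JP is improper, like the UP, and differs from it only by the factor $1/\alpha$.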

Posterior distribution using the informative prior
As an informative prior (IP), gamma distributions were assumed to be the prior distributions for the three component parameters and the bivariate beta distribution was assumed to be the prior distribution for the proportion parameters $p_1$ and $p_2$. The joint prior distribution of the component parameters, $p_1$ and $p_2$ using the IP is the product of these prior densities, and combining it with the likelihood gives the joint posterior distribution of the parameters.
Bayes estimators and posterior risks under DLF: the Bayes estimators and posterior risks of the component parameters, $p_1$ and $p_2$ under DLF are obtained from the respective marginal posterior distributions, in the same way as under SELF.

Marginal posterior distributions
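The marginal posterior distributions of the component parameters, $p_1$ and $p_2$ using the UP, the JP and the IP given data $\mathbf{y}$ are obtained by integrating the joint posterior distribution over the remaining parameters. The resulting expressions involve the usual beta function, and their constants take different values for the UP, for the JP and for the IP.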

Bayesian estimation under loss functions
In this section, we focus on the derivation of the Bayes estimators and posterior risks using the UP, the JP and the IP under the squared error loss function (SELF) and the DeGroot loss function (DLF).
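The two loss functions lead to standard closed forms in terms of the first two posterior moments: under SELF the Bayes estimator is the posterior mean and the posterior risk is the posterior variance, while under DLF, with loss $((\theta - d)/d)^2$, the Bayes estimator is $E(\theta^2 \mid \mathbf{y}) / E(\theta \mid \mathbf{y})$ and the posterior risk is $1 - [E(\theta \mid \mathbf{y})]^2 / E(\theta^2 \mid \mathbf{y})$. The sketch below encodes these general results; the function names are hypothetical, and the gamma posterior used in the example is an illustrative assumption, not a result taken from the paper.

```python
def bayes_under_self(m1, m2):
    """Bayes estimator and posterior risk under squared error loss.

    m1 and m2 are the first and second posterior moments of the parameter.
    """
    return m1, m2 - m1 ** 2

def bayes_under_dlf(m1, m2):
    """Bayes estimator and posterior risk under the DeGroot loss function."""
    return m2 / m1, 1.0 - m1 ** 2 / m2

# Illustration with a Gamma(shape=A, rate=B) posterior (assumed for the example):
# E(theta) = A / B and E(theta^2) = A (A + 1) / B^2.
A, B = 20.0, 10.0
m1, m2 = A / B, A * (A + 1.0) / B ** 2
print(bayes_under_self(m1, m2))
print(bayes_under_dlf(m1, m2))
```

Because the two posterior risks are measured on different scales (absolute for SELF, relative for DLF), comparisons between loss functions should be read with that in mind.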

Elicitation of hyperparameters
Elicitation is a process used to quantify a person's professional belief and knowledge about the subject matter. From the Bayesian perspective, elicitation most often arises as a method of specifying the prior distribution of the random parameter(s). Elicitation simply expresses prior knowledge about the parameter(s) in a form that can then be combined with the likelihood to obtain the posterior distribution for further statistical analysis. In this study, we adopted the prior predictive method based on predictive probabilities suggested by Aslam (2003). For eliciting the hyperparameters, the prior predictive distribution (PPD) was used. The PPD using the IP for a random variable $Y$ is given in equation (26); on substituting equations (1) and (11) in equation (26) and then simplifying, we get equation (27). Using the prior predictive distribution given in equation (27), we consider nine intervals (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 9) and (9, 10) with respective probabilities 0.45, 0.10, 0.05, 0.03, 0.025, 0.02, 0.015, 0.01 and 0.008 as an expert's belief about these intervals. Using equation (27), the nine equations in (28) are solved simultaneously in the Mathematica package to elicit the hyperparameters $a_1, b_1, a_2, b_2, a_3, b_3, a, b$ and $c$.
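The elicitation step can be sketched in code. Several assumptions are made here and are not taken from the paper: the components are unit-scale Pareto densities, the gamma priors are parameterised by shape $a_l$ and rate $b_l$, and the bivariate beta prior has kernel $p_1^{a-1} p_2^{b-1} (1 - p_1 - p_2)^{c-1}$. Under these assumptions the prior predictive distribution function has the closed form $\sum_l w_l \{1 - [b_l / (b_l + \ln y)]^{a_l}\}$, where the weights $w_l$ are the prior means of the proportions. The hyperparameter values below are hypothetical placeholders, not elicited values.

```python
import numpy as np

def ppd_cdf(y, gamma_hyper, beta_hyper):
    """Prior predictive distribution function at y >= 1 (assumed closed form).

    gamma_hyper: three (shape, rate) pairs for the component parameters.
    beta_hyper:  (a, b, c) of the bivariate beta prior on the proportions.
    """
    a, b, c = beta_hyper
    weights = np.array([a, b, c], dtype=float) / (a + b + c)  # prior means of p_l
    log_y = np.log(y)
    return sum(w * (1.0 - (bl / (bl + log_y)) ** al)
               for w, (al, bl) in zip(weights, gamma_hyper))

def interval_probabilities(gamma_hyper, beta_hyper, edges=range(1, 11)):
    """Prior predictive probabilities of (1, 2), (2, 3), ..., (9, 10)."""
    cdf = [ppd_cdf(e, gamma_hyper, beta_hyper) for e in edges]
    return np.diff(cdf)

# Expert probabilities quoted in the text for the nine intervals.
expert = np.array([0.45, 0.10, 0.05, 0.03, 0.025, 0.02, 0.015, 0.01, 0.008])

# Hypothetical trial hyperparameters; elicitation searches for values that
# drive all nine residuals to zero.
gamma_hyper = [(2.0, 1.0), (3.0, 1.5), (4.0, 2.0)]
beta_hyper = (2.0, 2.0, 2.0)
residuals = interval_probabilities(gamma_hyper, beta_hyper) - expert
print(residuals)
```

A root finder or least-squares routine applied to `residuals` would play the role that the simultaneous solution in Mathematica plays in the study.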

Limiting expressions for complete data set
When the test termination time $t$ tends to $\infty$, the number of uncensored observations $r$ tends to the sample size $n$ and $r_l$ tends to $n_l$ ($l = 1, 2, 3$), so that all the observations that were censored become uncensored and are incorporated in our sample. The information contained in the sample is thus increased and, consequently, the posterior risks of the Bayes estimators are reduced. The limiting expressions for the Bayes estimators and posterior risks using the UP, the JP and the IP under SELF and DLF are given in Tables 1 to 4.
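To indicate what such limiting forms look like, the following derivation works through one component parameter under the IP with complete data. It is a sketch under assumptions that are not confirmed by the text: a unit-scale Pareto density with shape $\alpha_l$, a gamma prior with shape $a_l$ and rate $b_l$, and known subpopulation membership for all $n_l$ observations of component $l$. The notation need not coincide with that of Tables 1 to 4.

```latex
\begin{align*}
\pi(\alpha_l \mid \mathbf{y})
  &\propto \alpha_l^{\,a_l - 1} e^{-b_l \alpha_l}
     \prod_{k=1}^{n_l} \alpha_l\, y_{lk}^{-(\alpha_l + 1)}
   \;\propto\; \alpha_l^{\,A_l - 1} e^{-B_l \alpha_l}, \\
A_l &= a_l + n_l, \qquad B_l = b_l + \sum_{k=1}^{n_l} \ln y_{lk}, \\[4pt]
\text{SELF:}\quad
  \hat{\alpha}_l &= E(\alpha_l \mid \mathbf{y}) = \frac{A_l}{B_l}, \qquad
  \rho(\hat{\alpha}_l) = \operatorname{Var}(\alpha_l \mid \mathbf{y}) = \frac{A_l}{B_l^{2}}, \\[4pt]
\text{DLF:}\quad
  \hat{\alpha}_l &= \frac{E(\alpha_l^{2} \mid \mathbf{y})}{E(\alpha_l \mid \mathbf{y})} = \frac{A_l + 1}{B_l}, \qquad
  \rho(\hat{\alpha}_l) = 1 - \frac{[E(\alpha_l \mid \mathbf{y})]^{2}}{E(\alpha_l^{2} \mid \mathbf{y})} = \frac{1}{A_l + 1}.
\end{align*}
```

In this sketch both posterior risks shrink as $n_l$ grows, which agrees with the observation that incorporating all observations reduces the posterior risks.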

RESULTS
From the simulation results, the extent of under-estimation (over-estimation) of the component and proportion parameters (through Bayes estimators) using the UP, JP and IP under SELF and DLF is smaller for a larger test termination time (sample size) than for a smaller test termination time (sample size) at different sample sizes (test termination times). Similarly, the extent of over-estimation (under-estimation) of the component and proportion parameters is greater for smaller values of component parameters than for larger values of component parameters at different test termination times and sample sizes. Also, the difference of the Bayes estimates from the assumed parameters reduces to zero with an increase in the sample size. The same is the case with a larger test termination time as compared to a smaller test termination time for varying sample sizes.
It is also observed that the posterior risks of Bayes estimators using the UP, JP and IP under SELF and DLF are reduced with an increase in sample size at different test termination times. For a smaller test termination time, the posterior risks of Bayes estimators are larger than the posterior risks for a larger test termination time, irrespective of the prior, loss function and sample size. Also, the posterior risks of Bayes estimators of component parameters are smaller (larger) for smaller component parametric values under SELF (DLF) for each sample size and test termination time considered in the simulation study. However, the posterior risks of Bayes estimators of proportion parameters are larger for smaller component parametric values under SELF and DLF for each sample size and test termination time.
As far as the problem of selecting a suitable prior is concerned, it can be seen that the IP, having the smallest associated posterior risk for a given loss function, emerges as the best prior amongst the uninformative and informative priors considered in this study. On the other hand, the DLF is observed to perform better than SELF for estimating the component parameters, whereas for estimating the proportion parameters SELF is observed to be superior to DLF. It should be noted that the selection of the best prior (loss function) for a given loss function (prior) is made on the basis of the associated posterior risks. Also, the selection of the best prior and loss function does not depend on the sample size and test termination time.
Davis (1952) reported mixture data on lifetimes (in thousand hours) of many components used in aircraft radar sets. To illustrate the proposed methodology, we take the data on three components, namely, V805 Transmitter Tube, Transmitter Tube and V600 Indicator Tube. Davis showed that the data $x$ can be modelled by a mixture of exponential distributions. The transformation $y = \exp(x)$ of exponential random data ($x$) yields Pareto random data ($y$). This transformation allows us to use the Davis mixture data for applying the proposed Bayesian analysis. It is unknown which component fails until a failure (of a radar set) occurs at or before the test termination time of 0.6 hour. A total of 1340 tests were conducted. The Bayes estimates and their posterior risks, evaluated from the summary of these data assuming the UP, JP and the IP under SELF and DLF, are shown in Table 5.
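The transformation $y = \exp(x)$ can be checked numerically: if $X$ is exponential with rate $\alpha$, then $P(Y > y) = P(X > \ln y) = y^{-\alpha}$ for $y \ge 1$, which is the survival function of a unit-scale Pareto distribution with shape $\alpha$. The sketch below compares empirical and theoretical survival probabilities on simulated data; the rate value, seed and function name are arbitrary choices for illustration and no real data are used.

```python
import numpy as np

def survival_comparison(alpha, points, size=100_000, seed=2016):
    """Compare the empirical survival of exp(X), X ~ Exp(rate=alpha),
    with the unit-scale Pareto survival function y**-alpha."""
    rng = np.random.default_rng(seed)
    x = rng.exponential(scale=1.0 / alpha, size=size)  # exponential lifetimes
    y = np.exp(x)                                      # transformed data, y >= 1
    return [(q, float(np.mean(y > q)), q ** (-alpha)) for q in points]

for q, empirical, theoretical in survival_comparison(2.5, (1.5, 2.0, 4.0)):
    print(f"y = {q}: empirical {empirical:.4f}, Pareto {theoretical:.4f}")
```

The same relation implies that $\ln y$ recovers the original exponential data, so summary statistics based on $\sum \ln y$ can be computed directly from the untransformed lifetimes.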

A real life example
From Table 5, it is observed that the results obtained through real life data are compatible with the simulation results. Table 5 also reveals that the performance of the IP is better than that of the UP and JP. Moreover, the results are relatively more precise under the UP (JP) than under the JP (UP) with DLF (SELF). Also, it is observed that SELF (DLF) performs better than DLF (SELF) for estimating the proportion (component) parameters.

CONCLUSION
In this study, we have proposed a 3-component mixture of Pareto distributions to study a lifetime model. We have considered the Bayesian estimation of the 3-component mixture of Pareto distributions using the uninformative (uniform and Jeffreys') and informative (gamma) priors under SELF and DLF. We conducted a comprehensive simulation and a real life study to judge the relative performance of the Bayes estimators and also to deal with the problems of selecting the priors and loss functions at varying sample sizes and test termination times. The numerical results revealed that an increase in sample size or test termination time provides improved (in terms of closeness) and reliable (in terms of posterior risk) Bayes estimators. The extent of over-estimation (under-estimation) of the Bayes estimators of parameters is relatively smaller (larger) with relatively larger (smaller) test termination times (sample sizes) at different sample sizes (test termination times).

Sultan et al. (2007) investigated the properties of the 2-component mixture of inverse Weibull distributions. Saleem and Aslam (2009) discussed the use of informative and non-informative priors for Bayesian analysis of a 2-component mixture of the Rayleigh distribution. Further Bayesian analyses of 2-component mixtures include Saleem et al. (2010).
Suppose that $n$ units are used in a life testing experiment with a fixed test termination time $t$, and that the lifetime of the units follows a 3-component mixture of Pareto distributions. The experiment is performed and it is observed that $r$ units out of $n$ fail by the test termination time $t$ and the remaining $n - r$ units are still functioning. It may be pointed out that, out of the $r$ failures, $r_1$, $r_2$ and $r_3$ failures belong to subpopulation-I, subpopulation-II and subpopulation-III, respectively, depending upon the reasons of failure. So the number of uncensored observations is $r = r_1 + r_2 + r_3$.

Bayesian estimation proceeds by considering different loss functions and searching for the minimum posterior risk. Two different loss functions, namely the SELF and the DLF, were used to obtain the Bayes estimators and their posterior risks.
Bayes estimators and posterior risks under SELF: the SELF is the loss function used to develop the least squares theory. Under SELF, the respective marginal posterior distributions yield the Bayes estimators and posterior risks of the component and proportion parameters assuming the UP, the JP and the IP. Under the IP, these expressions involve the hyperparameters $a_1, b_1, a_2, b_2, a_3, b_3, a, b$ and $c$.

Table 2: Limiting expressions for posterior risks as $t \to \infty$ under SELF

Table 1: Limiting expressions for Bayes estimators as $t \to \infty$ under SELF

Table 3: Limiting expressions for Bayes estimators as $t \to \infty$ under DLF

Table 4: Limiting expressions for posterior risks as $t \to \infty$ under DLF
A fixed test termination time $t$ was selected in order to evaluate the impact of the test termination time on the Bayes estimates, and observations exceeding the test termination time $t$ were taken as censored ones. The choice of the test termination time was made in such a way that the censoring rate in the resulting sample was approximately 10 % to 25 %. On the basis of the generated sample, Bayes estimates and posterior risks were computed using the Mathematica package. The whole procedure was iterated 1000 times and the simulated results were then averaged over the 1000 values. The simulated Bayes estimates and posterior risks using the UP, JP and IP under SELF and DLF are showcased in Tables I to VI.
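The data-generation part of the simulation design described above can be sketched as follows. This is not the authors' Mathematica code: it assumes unit-scale Pareto components, picks the test termination time by bisection so that the mixture survival probability equals a target censoring rate, and uses hypothetical function names and illustrative parameter values.

```python
import numpy as np

def mixture_survival(t, alphas, props):
    """P(Y > t) for a mixture of unit-scale Pareto components (assumed form)."""
    return sum(p * t ** (-a) for a, p in zip(alphas, props))

def termination_time(target_rate, alphas, props, lo=1.0, hi=1e12, iters=200):
    """Test termination time whose expected censoring rate is target_rate."""
    for _ in range(iters):
        mid = np.sqrt(lo * hi)  # geometric midpoint suits the heavy tail
        if mixture_survival(mid, alphas, props) > target_rate:
            lo = mid            # too much censoring, move t to the right
        else:
            hi = mid
    return np.sqrt(lo * hi)

def simulate_censored(n, alphas, props, t, rng):
    """Type-I censored sample: observed failures grouped by subpopulation, and r."""
    labels = rng.choice(3, size=n, p=props)              # subpopulation of each unit
    u = rng.random(n)
    y = (1.0 - u) ** (-1.0 / np.asarray(alphas)[labels]) # inverse-CDF Pareto draws
    observed = y <= t                                    # failures seen before t
    groups = [y[observed & (labels == l)] for l in range(3)]
    return groups, int(observed.sum())

alphas, props = (1.0, 2.0, 3.0), (0.5, 0.3, 0.2)  # illustrative values only
t = termination_time(0.20, alphas, props)
groups, r = simulate_censored(200, alphas, props, t, np.random.default_rng(1))
print(round(float(t), 3), r, [g.size for g in groups])
```

Repeating `simulate_censored` many times and averaging the resulting estimates mirrors the 1000-replication scheme described in the text, with the Bayes estimators themselves computed separately.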

Table 5: Bayes estimates (BE) and posterior risks (PR) using the UP, the JP and the IP under SELF and DLF with Davis (1952) real life mixture data
Also, the extent of over-estimation (under-estimation) of the Bayes estimators of parameters is less for relatively larger values of component parameters. The posterior risks of Bayes estimators of component (proportion) parameters are smaller (larger) for smaller component parametric values under SELF (SELF and DLF). Moreover, as the sample size (test termination time) decreases (increases), the posterior risks of Bayes estimators of parameters increase (decrease). The DLF (SELF) is observed to be a preferable choice for estimating the component (proportion) parameters. Finally, we conclude that the IP is more suitable for estimating the component and proportion parameters. When DLF is selected, the UP is relatively more precise than the JP for the component parameters. Furthermore, the same pattern is observed for real life data.

Table I :
Bayes estimate (BE) and posterior risk (PR) using the UP with 1

Table II :
Bayes estimate (BE) and posterior risk (PR) using the JP with 1

Table III :
Bayes estimate (BE) and posterior risk (PR) using the IP with 1

Table IV :
Bayes estimate (BE) and posterior risk (PR) using the UP with 1

Table V :
Bayes estimate (BE) and posterior risk (PR) using the JP with 1

Table VI :
Bayes estimate (BE) and posterior risk (PR) using the IP with 1