The odd Chen generator of distributions: properties and estimation methods with applications in medicine and engineering

This article is published under the Creative Commons CC-BY-ND License (http://creativecommons.org/licenses/by-nd/4.0/). This license permits use, distribution and reproduction, commercial and non-commercial, provided that the original work is properly cited and is not changed in any way.

Abstract: This paper introduces a new flexible univariate generator of distributions called the odd Chen-G family and derives some of its statistical properties. Two special models of the proposed generator are provided. The model parameters are estimated using six estimation methods, namely, maximum likelihood estimators, least squares estimators, weighted least squares estimators, maximum product of spacings estimators, Cramér-von Mises estimators and percentile based estimators. Further, simulations are performed to compare their performance for both small and large samples. Finally, two real datasets are used to illustrate the flexibility of the special models of the proposed family.


INTRODUCTION
Numerous classical distributions have been extensively used over the past decades for modelling data in several areas such as medical sciences, life testing problems, biological studies, demography, engineering, actuarial science, environmental science and economics. However, in many applied areas such as insurance, survival and reliability theory, there is a clear need for extended forms of these distributions because, in many practical situations, classical distributions do not provide adequate fits to real data. Therefore, there has been an increased interest in developing more flexible distributions; for example, El-

June 2020
Journal of the National Science Foundation of Sri Lanka 48 (2)

among others. Due to the importance of the Ch distribution, we introduce a new class called the odd Chen-G (OCh-G) family of distributions.

METHODOLOGY
Assume T is the lifetime of a system following the Ch CDF in equation (1). If the random variable X represents the odds ratio, the risk that this system will not be working at time x is given by , where represents the CDF of a baseline model with parameter vector .
, where the odds ratio satisfies the following conditions: 1.

Let , and , respectively, denote the PDF, CDF and RF of a baseline model with parameter vector . Then, the CDF of the OCh-G family is given by
... (3)
The PDF of the OCh-G family is given as follows
... (4)
Using the power series for the exponential function and the generalised binomial expansion, equations (3) ... (8)

The motivations for introducing the OCh-G family are as follows:
i. To construct heavy-tailed distributions for modelling real data.
ii. To define special models with all types of the HRF.
iii. To make the kurtosis more flexible compared to the baseline model.
iv. To provide consistently better fits than other generated models under the same baseline distribution.
v. To generate distributions with symmetric, left-skewed and right-skewed shapes.
vi. To study which method is best for estimating the model parameters.
Furthermore, we are also motivated to illustrate how different estimators of the special sub models of the OCh-G class perform for various sample sizes and parameter combinations, and to develop a guideline for determining the best method of estimation, which is important for applied statisticians.
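The construction described above, in which the Ch CDF is evaluated at the baseline odds, can be sketched numerically. This is a minimal sketch and not the authors' implementation: because equation (1) is not reproduced here, it assumes the usual two-parameter form of the Ch CDF, 1 - exp{lam(1 - exp(t^beta))}, and the names `lam`, `beta`, `base_cdf`, `base_pdf` and the unit exponential baseline are illustrative choices. The PDF follows from the chain rule.

```python
import math


def och_g_cdf(x, lam, beta, base_cdf):
    """Assumed OCh-G CDF: Chen CDF evaluated at the baseline odds G / (1 - G)."""
    g = base_cdf(x)
    if g <= 0.0:
        return 0.0
    if g >= 1.0:
        return 1.0
    p = (g / (1.0 - g)) ** beta      # odds raised to the power beta
    if p > 700.0:                    # exp(p) would overflow; the CDF equals 1 numerically
        return 1.0
    # 1 - exp(lam * (1 - exp(p))), written with expm1 for accuracy near zero
    return -math.expm1(-lam * math.expm1(p))


def och_g_pdf(x, lam, beta, base_cdf, base_pdf):
    """Derivative of och_g_cdf with respect to x (chain rule through the odds)."""
    g = base_cdf(x)
    if g <= 0.0 or g >= 1.0:
        return 0.0
    odds = g / (1.0 - g)
    p = odds ** beta
    if p > 700.0:
        return 0.0
    chen_pdf = lam * beta * odds ** (beta - 1.0) * math.exp(p - lam * math.expm1(p))
    return chen_pdf * base_pdf(x) / (1.0 - g) ** 2   # d(odds)/dx = g(x) / (1 - G(x))^2


# Illustrative baseline (an assumption): the unit exponential distribution.
def exp_cdf(x):
    return -math.expm1(-x) if x > 0.0 else 0.0


def exp_pdf(x):
    return math.exp(-x) if x > 0.0 else 0.0


value = och_g_cdf(0.5, 1.5, 0.8, exp_cdf)
```

Any baseline with a callable CDF and PDF can be substituted for the exponential one used here.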

Let
, then the asymptotics of the CDF, PDF and HRF when are, respectively, given by and Further, the asymptotics of the CDF, PDF and HRF when are, respectively, given by ,

Quantile function
Assume X ⁓ OCh-G family. For any , the quantile function, say , of X is the solution of ; then
... (9)
where represents the baseline quantile function. Evaluating equation (9) at probability 0.5 gives the median of X.
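The inversion behind the quantile function can be illustrated as follows. The sketch assumes the same two-parameter Ch form as before, 1 - exp{lam(1 - exp(t^beta))}, so that the Ch quantile gives the baseline odds, which are then mapped back through the baseline quantile function. The names `och_g_quantile`, `och_g_sample` and the unit exponential baseline are hypothetical choices for illustration.

```python
import math
import random


def och_g_quantile(u, lam, beta, base_quantile):
    """Invert the assumed OCh-G CDF for 0 < u < 1."""
    # Chen quantile: t solves 1 - exp(lam * (1 - exp(t**beta))) = u
    t = math.log1p(-math.log1p(-u) / lam) ** (1.0 / beta)
    # t is the baseline odds G / (1 - G); convert to a probability, then to x
    return base_quantile(t / (1.0 + t))


def exp_quantile(p):
    """Quantile of the unit exponential baseline (illustrative choice)."""
    return -math.log1p(-p)


def och_g_sample(n, lam, beta, base_quantile, rng):
    """Inverse-transform sampling: push uniform draws through the quantile function."""
    return [och_g_quantile(rng.random(), lam, beta, base_quantile) for _ in range(n)]


median = och_g_quantile(0.5, 1.5, 0.8, exp_quantile)
draws = och_g_sample(5, 1.5, 0.8, exp_quantile, random.Random(1))
```

The same function supports simulation studies, since a sample of any size is obtained by transforming uniform random numbers.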

Moments, skewness, kurtosis and mean deviations
If X ⁓ OCh-G family, the rth moment of X is given by
... (10)
where with power parameter . Setting r = 1 in equation (10), we obtain the mean of X. Moreover, we can derive an important measure in survival analysis called the mean time to failure (MTTF), where MTTF = . This measure can be used to design and manufacture a maintainable system.
On the other hand, the skewness and the kurtosis can be calculated, respectively, as follows:
... (11)
and
... (12)
Further, the incomplete moments play an important role in measuring inequality. For example, the first incomplete moment can be used to obtain the formulas of the Lorenz and Bonferroni curves. The qth incomplete moment of X can be expressed as follows:
... (13)
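As a numerical cross-check on such expressions, moments can also be approximated directly from the quantile function, using the identity E[X^r] = integral over (0, 1) of Q(u)^r du. The sketch below uses this swapped-in technique (midpoint-rule integration) rather than the series representation of equation (10). It assumes the Ch CDF 1 - exp{lam(1 - exp(t^beta))} and a Weibull baseline of the form 1 - exp(-a x^b); both forms and all function names are assumptions.

```python
import math


def raw_moment(quantile, r, n=20000):
    """Approximate E[X**r] as the integral of Q(u)**r over (0, 1), midpoint rule."""
    return sum(quantile((i + 0.5) / n) ** r for i in range(n)) / n


def skewness_kurtosis(quantile, n=20000):
    """Moment-based skewness and kurtosis from the first four raw moments."""
    m1, m2, m3, m4 = (raw_moment(quantile, r, n) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4
    return mu3 / var ** 1.5, mu4 / var ** 2


def ochw_quantile(u, lam, beta, a, b):
    """Quantile of an OCh-Weibull model under the assumed parametrisations."""
    t = math.log1p(-math.log1p(-u) / lam) ** (1.0 / beta)  # Chen quantile = baseline odds
    return (math.log1p(t) / a) ** (1.0 / b)                # Weibull quantile at t / (1 + t)


def q(u):
    # Illustrative parameter values only
    return ochw_quantile(u, 1.0, 1.0, 2.0, 0.5)


mean = raw_moment(q, 1)
skew, kurt = skewness_kurtosis(q)
```

Under these assumptions, `mean` also plays the role of the MTTF, since it is the first raw moment of the lifetime.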

where . On the other hand, the mean deviations about the mean and the median can be represented as and respectively.

Bonferroni and Lorenz curves
Bonferroni (1930) presented the Bonferroni and Lorenz curves. These curves have applications in reliability, medicine, demography, economics and insurance. If X ⁓ OCh-G family, then the Bonferroni curve is given by
... (14)
where is the Bonferroni curve of the Exp-G family with power parameter and denotes the average. The Lorenz curve can be expressed as
... (15)
where is the Lorenz curve of the Exp-G family with power parameter .
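Both curves can be approximated numerically from a quantile function through the standard definitions L(p) = (1/mu) * integral over (0, p) of Q(u) du and B(p) = L(p)/p. The sketch below uses these definitions rather than the Exp-G representation of equations (14) and (15). The OCh-G quantile with a unit exponential baseline, the assumed Ch form 1 - exp{lam(1 - exp(t^beta))} and the default parameter values are illustrative assumptions.

```python
import math


def lorenz_bonferroni(quantile, p, n=20000):
    """Return (L(p), B(p)) from a quantile function using the midpoint rule."""
    mean = sum(quantile((i + 0.5) / n) for i in range(n)) / n
    # integral of Q(u) over (0, p), obtained by rescaling the grid to (0, p)
    partial = p * sum(quantile(p * (i + 0.5) / n) for i in range(n)) / n
    lorenz = partial / mean
    return lorenz, lorenz / p


def och_exp_quantile(u, lam=1.5, beta=0.8):
    """Assumed OCh-G quantile with a unit exponential baseline (illustrative)."""
    t = math.log1p(-math.log1p(-u) / lam) ** (1.0 / beta)
    return math.log1p(t)   # exponential quantile evaluated at t / (1 + t)


lorenz_half, bonferroni_half = lorenz_bonferroni(och_exp_quantile, 0.5)
```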

Moments of the residual and past lifetimes
For describing different maintenance strategies, we must calculate two important times, namely, the mean residual lifetime (MRL) and the mean past lifetime (MPL). The moment of the residual lifetime (RL) is given as . Therefore, if T ⁓ OCh-G family, then
... (16)
where . Setting n = 1 in equation (16), we get the MRL.
The moment of the past lifetime (PL) (also known as the moment of the waiting time) can be expressed as
... (17)
where . Setting n = 1 in equation (17), we get the MPL.
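The MRL and MPL can be approximated numerically as conditional means, E[X - t | X > t] and E[t - X | X <= t], by averaging the quantile function over the relevant probability range. The sketch below takes this route instead of equations (16) and (17). The OCh-G CDF and quantile with a unit exponential baseline rely on the assumed Ch form 1 - exp{lam(1 - exp(t^beta))}; every name and parameter value is illustrative.

```python
import math


def mean_residual_life(t, cdf, quantile, n=20000):
    """E[X - t | X > t]: average of Q(u) - t over u in (F(t), 1)."""
    f_t = cdf(t)
    grid = (f_t + (1.0 - f_t) * (i + 0.5) / n for i in range(n))
    return sum(quantile(u) for u in grid) / n - t


def mean_past_life(t, cdf, quantile, n=20000):
    """E[t - X | X <= t]: average of t - Q(u) over u in (0, F(t))."""
    f_t = cdf(t)
    grid = (f_t * (i + 0.5) / n for i in range(n))
    return t - sum(quantile(u) for u in grid) / n


def och_exp_cdf(x, lam=1.5, beta=0.8):
    """Assumed OCh-G CDF with a unit exponential baseline; the odds are exp(x) - 1."""
    return -math.expm1(-lam * math.expm1(math.expm1(x) ** beta))


def och_exp_quantile(u, lam=1.5, beta=0.8):
    t = math.log1p(-math.log1p(-u) / lam) ** (1.0 / beta)
    return math.log1p(t)


mrl = mean_residual_life(0.5, och_exp_cdf, och_exp_quantile)
mpl = mean_past_life(0.5, och_exp_cdf, och_exp_quantile)
```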

The OCh-Weibull (OChW) distribution
Consider the CDF of the Weibull distribution with positive parameters a and b given by .
Then, the CDF of the OChW distribution is
... (19)
Figure 3 shows the PDF and HRF plots of the OChW distribution for various values of the parameters. It can be seen that the HRF can be increasing or bathtub shaped. Further, the skewness and kurtosis of the OChW distribution for a = 2, b = 0.5 and as a function of are displayed in Figure 4.
It is clear from Figure 4 that the shapes of the OChW model depend strongly on the values of and .
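To explore hazard shapes numerically, the reliability function and HRF of an OCh-Weibull model can be evaluated on a grid. The sketch below is written under two assumptions, since neither formula is reproduced in this section: the Ch CDF is taken as 1 - exp{lam(1 - exp(t^beta))} and the Weibull CDF as 1 - exp(-a x^b), so that the baseline odds are exp(a x^b) - 1. The function names and parameter values are hypothetical.

```python
import math


def ochw_survival(x, lam, beta, a, b):
    """Reliability function under the assumed forms: exp(-lam * (exp(odds**beta) - 1))."""
    odds = math.expm1(a * x ** b)
    return math.exp(-lam * math.expm1(odds ** beta))


def ochw_hrf(x, lam, beta, a, b):
    """Hazard rate = PDF / RF, which simplifies because the RF factor cancels."""
    z = a * x ** b
    odds = math.expm1(z)
    d_odds = a * b * x ** (b - 1.0) * math.exp(z)        # derivative of the odds in x
    return lam * beta * odds ** (beta - 1.0) * math.exp(odds ** beta) * d_odds


# Evaluate the hazard on a moderate grid to inspect its shape (illustrative values).
grid = [0.05 * k for k in range(1, 21)]
hazard = [ochw_hrf(x, 1.0, 0.5, 2.0, 0.5) for x in grid]
```

The functions are intended for moderate values of x; very large arguments overflow the nested exponentials.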

Estimation methods
In this section, six methods of estimation are used to estimate the unknown parameters of the OChFr and OChW models, to illustrate how different estimators of these distributions perform for various sample sizes and parameter combinations, and to develop a guideline for determining the best method of estimation, which is important for applied statisticians.
These estimation methods are: the maximum likelihood estimators, least squares estimators, weighted least squares estimators, maximum product of spacings estimators, Cramér-von Mises estimators and percentile based estimators. Similar studies for other models have been proposed by many authors (Nassar et al., 2018; Cordeiro et al., 2019).

Maximum likelihood estimators
Let be a random sample of size n from . Then, the maximum likelihood estimators (MLEs) ... Note that the solution of for can be obtained numerically.

The weighted least squares estimators (WLSEs) and can be obtained by minimizing the following equation: Further, the WLSEs can also be derived by solving the non-linear equations defined by where and are provided in equation (21).
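The objective functions behind these estimators can be sketched for the OChW model. The code below is a hedged illustration and not the authors' implementation: it assumes the Ch CDF 1 - exp{lam(1 - exp(t^beta))}, the Weibull CDF 1 - exp(-a x^b), and the usual least squares formulation with plotting positions i/(n+1) and weights (n+1)^2 (n+2) / (i (n-i+1)). The parameter order `theta = (lam, beta, a, b)` and all function names are assumptions.

```python
import math


def ochw_cdf(x, lam, beta, a, b):
    """Assumed OChW CDF, guarded against overflow of the nested exponentials."""
    z = a * x ** b if x > 0.0 else 0.0
    if z <= 0.0:
        return 0.0
    if z > 700.0:
        return 1.0
    log_p = beta * math.log(math.expm1(z))     # log of odds**beta
    if log_p > 6.5:
        return 1.0
    return -math.expm1(-lam * math.expm1(math.exp(log_p)))


def ochw_log_pdf(x, lam, beta, a, b):
    """Log-density under the same assumptions; -inf where it underflows."""
    z = a * x ** b
    if z > 700.0:
        return -math.inf
    odds = math.expm1(z)
    log_p = beta * math.log(odds)
    if log_p > 6.5:
        return -math.inf
    p = math.exp(log_p)
    return (math.log(lam * beta * a * b) + (b - 1.0) * math.log(x) + z
            + (beta - 1.0) * math.log(odds) + p - lam * math.expm1(p))


def log_likelihood(theta, data):
    """Objective maximised by the MLEs."""
    return sum(ochw_log_pdf(x, *theta) for x in data)


def weighted_least_squares(theta, data, weighted=True):
    """Objective minimised by the WLSEs (or the LSEs when weighted=False)."""
    xs = sorted(data)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs, start=1):
        w = (n + 1) ** 2 * (n + 2) / (i * (n - i + 1)) if weighted else 1.0
        total += w * (ochw_cdf(x, *theta) - i / (n + 1)) ** 2
    return total
```

Either objective can be handed to a general-purpose numerical optimiser; the sketch stops short of the optimisation step so that it depends only on the standard library.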

Maximum product of spacings estimators
The maximum product of spacings method is proposed as a good alternative to the MLE method. For , let , be the uniform spacings of a random sample from the OChFr distribution or OChW distribution, where and . The maximum product of spacings estimators (MPSEs) for , , and can be obtained by maximizing the geometric mean of the spacings , with respect to and , or, equivalently, by maximizing the logarithm of the geometric mean of the sample spacings. The MPSEs can also be derived by solving the non-linear equations defined by where , , and are defined in equation (21).
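The spacings criterion can be sketched as follows, with the spacings taken as differences of the fitted CDF at consecutive order statistics and the end points fixed at 0 and 1. The OChW CDF relies on the assumed forms 1 - exp{lam(1 - exp(t^beta))} for the Ch CDF and 1 - exp(-a x^b) for the Weibull CDF; `theta = (lam, beta, a, b)` and the function names are hypothetical.

```python
import math


def ochw_cdf(x, lam, beta, a, b):
    """Assumed OChW CDF, guarded against overflow of the nested exponentials."""
    z = a * x ** b if x > 0.0 else 0.0
    if z <= 0.0:
        return 0.0
    if z > 700.0:
        return 1.0
    log_p = beta * math.log(math.expm1(z))     # log of odds**beta
    if log_p > 6.5:
        return 1.0
    return -math.expm1(-lam * math.expm1(math.exp(log_p)))


def mps_objective(theta, data):
    """Mean log-spacing, i.e. the log of the geometric mean of the n + 1 spacings."""
    xs = sorted(data)
    cdf_vals = [0.0] + [ochw_cdf(x, *theta) for x in xs] + [1.0]
    spacings = [hi - lo for lo, hi in zip(cdf_vals, cdf_vals[1:])]
    if min(spacings) <= 0.0:       # tied or saturated CDF values give a zero spacing
        return -math.inf
    return sum(math.log(d) for d in spacings) / len(spacings)
```

The MPSEs are the parameter values that maximise `mps_objective`; returning minus infinity for degenerate spacings keeps an optimiser away from parameter values under which the fitted CDF saturates.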

Cramér-von Mises minimum distance estimators
Cramér-von Mises estimators (CVMEs) are a type of minimum distance estimator and have less bias than the other minimum distance estimators. The CVMEs are obtained based on the difference between the estimates of the CDF and the empirical distribution function. The CVMEs of the OChFr and OChW parameters are obtained by minimizing with respect to and . Also, the CVMEs follow by solving the non-linear equations where , , and are defined in equation (21).
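A minimal sketch of the Cramér-von Mises criterion is given below, using its usual form 1/(12n) plus the sum of squared differences between the fitted CDF at the ith order statistic and (2i - 1)/(2n). As elsewhere, the OChW CDF is built from the assumed forms 1 - exp{lam(1 - exp(t^beta))} and 1 - exp(-a x^b), and `theta = (lam, beta, a, b)` is an illustrative parameter order.

```python
import math


def ochw_cdf(x, lam, beta, a, b):
    """Assumed OChW CDF, guarded against overflow of the nested exponentials."""
    z = a * x ** b if x > 0.0 else 0.0
    if z <= 0.0:
        return 0.0
    if z > 700.0:
        return 1.0
    log_p = beta * math.log(math.expm1(z))     # log of odds**beta
    if log_p > 6.5:
        return 1.0
    return -math.expm1(-lam * math.expm1(math.exp(log_p)))


def cvm_objective(theta, data):
    """Cramér-von Mises distance between the fitted and empirical CDFs."""
    xs = sorted(data)
    n = len(xs)
    return 1.0 / (12.0 * n) + sum(
        (ochw_cdf(x, *theta) - (2.0 * i - 1.0) / (2.0 * n)) ** 2
        for i, x in enumerate(xs, start=1)
    )
```

The criterion cannot fall below 1/(12n), a value attained only when the fitted CDF passes exactly through the points (2i - 1)/(2n).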

Percentile based estimators
Let be an unbiased estimator of . Hence, the percentile estimators (PCEs) of the OChFr and OChW parameters can be obtained by minimizing with respect to and , where is the quantile function of the OChFr and OChW distributions which are given, respectively, by using equation (9).
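The percentile criterion can be sketched by comparing each order statistic with the model quantile at a plotting position. The sketch takes p_i = i/(n+1) as the unbiased estimator of the CDF at the ith order statistic, which is a common choice and an assumption here. The OChW quantile follows from the assumed forms 1 - exp{lam(1 - exp(t^beta))} and 1 - exp(-a x^b); `theta = (lam, beta, a, b)` and the function names are hypothetical.

```python
import math


def ochw_quantile(u, lam, beta, a, b):
    """Assumed OChW quantile: Chen quantile gives the odds, then invert the Weibull CDF."""
    t = math.log1p(-math.log1p(-u) / lam) ** (1.0 / beta)
    return (math.log1p(t) / a) ** (1.0 / b)


def pce_objective(theta, data):
    """Sum of squared gaps between order statistics and model quantiles at i / (n + 1)."""
    xs = sorted(data)
    n = len(xs)
    return sum(
        (x - ochw_quantile(i / (n + 1.0), *theta)) ** 2
        for i, x in enumerate(xs, start=1)
    )
```

Because the criterion is expressed on the scale of the data rather than of the CDF, it needs only the quantile function, which is available in closed form under these assumptions.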

Compute the average values of estimates (AEs) and mean-squared errors (MSEs).
The empirical results are given in Tables 1 and 2, respectively.
Regarding Tables 1 and 2, the following observations can be made:
1. The magnitude of the bias always decreases to zero as n → ∞.
2. The MSEs decrease as the sample size increases, as expected under first-order asymptotic theory.
3. Judging by the MSEs, the MLE, LSE, WLSE, MPSE, CVME and PCE methods all perform quite well for estimating the OChFr and OChW parameters. However, the MLE, WLSE and CVME methods outperform the LSE, MPSE and PCE methods and are therefore the best.
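The computation of AEs and MSEs in a simulation study can be illustrated with a deliberately simplified experiment. In the sketch below only the Weibull parameter `a` of the OChW model is estimated, by a grid-search MLE with the remaining parameters held at their true values; this simplification, the grid, the sample size, the number of replications and the assumed forms 1 - exp{lam(1 - exp(t^beta))} and 1 - exp(-a x^b) are all assumptions made for illustration and do not reproduce the design behind Tables 1 and 2.

```python
import math
import random


def ochw_quantile(u, lam, beta, a, b):
    t = math.log1p(-math.log1p(-u) / lam) ** (1.0 / beta)
    return (math.log1p(t) / a) ** (1.0 / b)


def ochw_log_pdf(x, lam, beta, a, b):
    z = a * x ** b
    if z > 700.0:
        return -math.inf
    odds = math.expm1(z)
    log_p = beta * math.log(odds)
    if log_p > 6.5:                       # density underflows to zero
        return -math.inf
    p = math.exp(log_p)
    return (math.log(lam * beta * a * b) + (b - 1.0) * math.log(x) + z
            + (beta - 1.0) * math.log(odds) + p - lam * math.expm1(p))


def mle_a_on_grid(data, lam, beta, b, grid):
    """Grid-search MLE of `a` alone, with the other parameters treated as known."""
    return max(grid, key=lambda a: sum(ochw_log_pdf(x, lam, beta, a, b) for x in data))


def simulate(true_a, n, reps, rng, lam=1.0, beta=1.0, b=1.0):
    """Return (AE, bias, MSE) of the grid MLE of `a` over `reps` simulated samples."""
    grid = [0.2 + 0.02 * k for k in range(141)]
    estimates = []
    for _ in range(reps):
        # clamp the uniforms away from 0 and 1 to keep the logarithms finite
        us = (min(max(rng.random(), 1e-12), 1.0 - 1e-12) for _ in range(n))
        data = [ochw_quantile(u, lam, beta, true_a, b) for u in us]
        estimates.append(mle_a_on_grid(data, lam, beta, b, grid))
    ae = sum(estimates) / reps
    mse = sum((e - true_a) ** 2 for e in estimates) / reps
    return ae, ae - true_a, mse


ae, bias, mse = simulate(true_a=1.0, n=40, reps=40, rng=random.Random(2020))
```

Repeating the call for increasing `n` gives the kind of comparison summarised in the observations above; a full study would estimate all parameters jointly with each of the six methods.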

RESULTS AND DISCUSSION: DATA ANALYSIS
In this section, we illustrate the empirical importance of the OChFr and OChW distributions using two applications to real data.
The first dataset (I) represents the survival times, in weeks, of 33 patients suffering from acute myelogenous leukaemia (Feigl & Zelen, 1965). For dataset I, we compare the fits of the OChFr distribution with some competitive models listed in Table 3.
The second dataset (II) represents 40 observations of the time-to-failure (10³ h) of the turbocharger of one type of engine (Xu et al., 2003). This dataset is used to compare the fits of the OChW model with some competitive models provided in Table 4.
The fitted distributions are compared using several criteria, namely, minus twice the maximized log-likelihood (-2L), the Akaike information criterion (AIC), the Cramér-von Mises (W*) statistic, the Anderson-Darling (A*) statistic, and the Kolmogorov-Smirnov (KS) statistic with its p value. Tables 5 and 7 list the MLEs with their corresponding standard errors (SEs) (in parentheses) for datasets I and II, respectively, whereas Tables 6 and 8 provide the values of the goodness-of-fit measures for datasets I and II, respectively.

The values in Tables 6 and 8 show that the OChFr and OChW distributions have the lowest values of the -2L, AIC, W*, A* and KS measures and thus provide the best fits to both datasets. Furthermore, the KS p values for the OChFr and OChW models are the largest among all models. Hence, the OChFr and OChW distributions yield a better fit to these datasets than the other distributions.

Table 8: Goodness-of-fit statistics for dataset II

Now, the different methods of estimation mentioned previously will be used to estimate the unknown parameters of the OChFr and OChW models. The KS statistic and its p value are provided to verify the best estimators. Tables 9 and 10 report the estimates of the unknown parameters using five estimation methods and the values of KS with the corresponding p value for datasets I and II, respectively.

Tables 9 and 10 illustrate that all estimation methods work quite well. However, the MLE method gives the best estimates of the model parameters, and consequently we recommend using it for estimating the model parameters for both datasets. Figures 11, 12 and 13 show the fitted PDFs, estimated CDFs and P-P plots for both datasets using the estimators in Tables 9 and 10.
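Two of the comparison criteria are simple enough to sketch directly: the AIC, computed as 2k - 2L for a model with k parameters and maximized log-likelihood L, and the KS statistic, the largest gap between the fitted CDF and the empirical distribution function. The function names are illustrative and the numbers in the example are toy values, not the datasets analysed here; the W*, A* and KS p value computations are not covered.

```python
def aic(log_lik, k):
    """Akaike information criterion for a model with k estimated parameters."""
    return 2.0 * k - 2.0 * log_lik


def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between a fitted CDF and the empirical CDF."""
    xs = sorted(data)
    n = len(xs)
    return max(
        max(i / n - cdf(x), cdf(x) - (i - 1) / n)
        for i, x in enumerate(xs, start=1)
    )


# Toy illustration: three made-up points compared with the uniform CDF on (0, 1).
toy_ks = ks_statistic([0.1, 0.4, 0.7], lambda x: x)
toy_aic = aic(-10.0, 4)
```

Smaller values of both quantities indicate a better fit, which is the sense in which the criteria are read in Tables 6 and 8.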
Tables 11 and 12 list some numerical values of some reliability concepts for datasets I and II.
It is seen, from Table 11, that the RF and HRF decrease and the MRL increases with t → ∞, whereas the RF and MRL decrease and the HRF increases with t → ∞ as seen from

CONCLUSIONS
In this study, we proposed a new generator of distributions called the odd Chen-G (OCh-G) family. Several of its statistical properties have been derived. The special sub models of the OCh-G family are capable of modelling symmetric, positively skewed and negatively skewed datasets. Moreover, these sub models provide a wide variation in the shape of the hazard rate, including decreasing, increasing, unimodal and bathtub shapes, and consequently the generated model can be used in modelling various types of data. Two special cases of the OCh-G family, called the OCh-Fréchet and OCh-Weibull models, were studied. The model parameters were estimated using six different estimation methods, namely, the maximum likelihood estimators, least squares estimators, weighted least squares estimators, maximum product of spacings estimators, Cramér-von Mises estimators and percentile based estimators. Maximum likelihood estimation gives the best estimators for both the OCh-Fréchet and OCh-Weibull models. Finally, the two special cases of the OCh-G class were applied to two real datasets from the fields of medicine and engineering to illustrate the flexibility of the proposed family.