Bivariate Gompertz generator of distributions: statistical properties and estimation with application to model football data

MS Eliwa, ZA Alhussain, EA Ahmed, MM Salah, HH Ahmed and M El-Morshedy
1 Department of Mathematics, College of Science, Majmaah University, Majmaah 11952, Saudi Arabia.
2 Department of Mathematics and Statistics, Faculty of Science, Mansoura University, Mansoura 35516, Egypt.
3 Department of Administrative and Financial Sciences, Taibah University, Community College of Khyber 41941, Saudi Arabia.
4 Department of Mathematics, Sohag University, Sohag 82524, Egypt.
5 Department of Basic Science, Preparatory Year Deanship, King Faisal University, Hofuf, Al-Ahsa, 31982, Saudi Arabia.
6 Department of Mathematics, College of Science and Humanities in Al-Kharj, Prince Sattam bin Abdulaziz University, Al-Kharj 11942, Saudi Arabia.
7 Department of Mathematics, Faculty of Science, Mansoura University, Mansoura 35516, Egypt.


INTRODUCTION
Several classes of distributions have been developed and applied to describe various phenomena in different areas such as engineering, biological studies, economics, actuarial science, environmental science, lifetime analysis and Olympic games, among others.
However, in many applied areas such as lifetime analysis, describing the pattern of adult deaths, Olympic games and insurance, there is a clear need for extended forms of these classes to model such data. For this reason, many classes have been proposed and studied in the statistical literature, for example, the transformed-transformer (T-X) family by Alzaatreh et al. (2013); the generating T-Y family by Aljarrah et al. (2014); the exponentiated half-logistic family by Cordeiro et al. (2014); the Kumaraswamy Marshall-Olkin family by Alizadeh et al. (2015); a new Weibull-G family by Tahir et al. (2016); the Gompertz-G family by Alizadeh et al. (2017) and its discrete version by ; the exponentiated Gompertz generated family by Cordeiro et al. (2016); the odd Chen-G family by El-Morshedy et al. (2020a); the exponentiated odd Chen-G family by Eliwa et al. (2020b); the odd flexible Weibull-H family by El-Morshedy and ; the odd log-logistic Lindley-G family by  and the discrete Gompertz family by , among others.
June 2020 Journal of the National Science Foundation of Sri Lanka 48 (2)
In many practical situations, it is important to consider different bivariate families that could be used to model bivariate data. The bivariate data could be exchange rates in two time periods, strength components, results of two teams in Olympic games, etc. Therefore, many bivariate distributions have been proposed in the literature, for example, the bivariate generalised exponential distribution by Kundu and Gupta (2009); the bivariate generalised linear failure rate distribution by Sarhan et al. (2011); the Marshall-Olkin bivariate Weibull distribution by Kundu and Gupta (2013); the bivariate Kumaraswamy distribution by Barreto-Souza and Lemonte (2013); the bivariate exponential distribution by Balakrishnan and Shiji (2014); Eliwa and El-Morshedy (2020b); and the bivariate Burr X generator by El-Morshedy et al. (2020c), among others. However, in many practical situations, classical bivariate distributions do not provide adequate fits to real data. Therefore, there has been an increased interest in developing more flexible distributions. Thus, in this paper, we introduce a flexible bivariate family based on the Marshall-Olkin shock model (Marshall & Olkin, 1967), the so-called bivariate Gompertz-H (BGo-H) family.
Alizadeh et al. (2017) proposed and studied a flexible univariate family of distributions, the so-called Gompertz-G (Go-H) family. The random variable Y is said to have the Go-H family if its CDF is given by
$$F(y; \theta, \alpha, \eta) = 1 - \exp\left\{\frac{\theta}{\alpha}\left[1 - \left(1 - H(y; \eta)\right)^{-\alpha}\right]\right\}, \tag{1}$$
where θ > 0 and α > 0 are two additional shape parameters, and η is a vector of parameters (1 × k; k = 1, 2, 3, ...). Also, $H(y; \eta)$ is the baseline CDF depending on the parameter vector η. The survival function of the random variable Y is given by
$$S(y; \theta, \alpha, \eta) = 1 - F(y; \theta, \alpha, \eta) = \exp\left\{\frac{\theta}{\alpha}\left[1 - \left(1 - H(y; \eta)\right)^{-\alpha}\right]\right\}.$$
The probability density function (PDF) corresponding to equation (1) is given by
$$f(y; \theta, \alpha, \eta) = \theta\, h(y; \eta)\left(1 - H(y; \eta)\right)^{-\alpha - 1} \exp\left\{\frac{\theta}{\alpha}\left[1 - \left(1 - H(y; \eta)\right)^{-\alpha}\right]\right\},$$
where $h(y; \eta)$ is the baseline PDF. The main reasons for introducing the BGo-H family are:
1. The joint CDFs and joint PDFs should preferably have a closed form representation; at least numerical evaluation should be possible.
2. This class of distributions is an important model that can be used in a variety of problems for modelling bivariate lifetime data.
3. This class contains several special bivariate models.
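The univariate building block can be sketched in a few lines of Python. This is a minimal sketch, assuming the Go-H CDF has the form $F(y) = 1 - \exp\{(\theta/\alpha)[1 - (1 - H(y))^{-\alpha}]\}$ of Alizadeh et al. (2017); the function names and the unit exponential baseline are illustrative choices, not part of the paper.

```python
import math

def go_h_survival(y, theta, alpha, base_cdf):
    """Assumed Go-H survival: exp{(theta/alpha) * [1 - (1 - H(y))^(-alpha)]}."""
    h = base_cdf(y)
    return math.exp((theta / alpha) * (1.0 - (1.0 - h) ** (-alpha)))

def go_h_cdf(y, theta, alpha, base_cdf):
    """Go-H CDF as one minus the survival function."""
    return 1.0 - go_h_survival(y, theta, alpha, base_cdf)

def go_h_pdf(y, theta, alpha, base_cdf, base_pdf):
    """Go-H PDF: theta * h(y) * (1 - H(y))^(-alpha - 1) * survival."""
    h = base_cdf(y)
    return (theta * base_pdf(y) * (1.0 - h) ** (-alpha - 1.0)
            * go_h_survival(y, theta, alpha, base_cdf))

def go_h_quantile(u, theta, alpha, base_quantile):
    """Invert F(y) = u: first solve for H(y), then apply the baseline quantile."""
    h = 1.0 - (1.0 - (alpha / theta) * math.log(1.0 - u)) ** (-1.0 / alpha)
    return base_quantile(h)

# Hypothetical baseline for illustration: the unit exponential distribution.
def exp_cdf(y):
    return 1.0 - math.exp(-y)

def exp_pdf(y):
    return math.exp(-y)

def exp_quantile(p):
    return -math.log(1.0 - p)

# Median of the Go-H model with theta = 2, alpha = 0.5 (illustrative values).
median = go_h_quantile(0.5, 2.0, 0.5, exp_quantile)
print("median:", median, "CDF at median:", go_h_cdf(median, 2.0, 0.5, exp_cdf))
```

With the unit exponential baseline, $(1 - H(y))^{-\alpha} = e^{\alpha y}$, so the sketch reduces to the classical Gompertz distribution, which is a convenient sanity check on the assumed form.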

METHODOLOGY: BGO-H FAMILY
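A random vector $(X_1, X_2)$ follows the Marshall-Olkin shock model if and only if there exist three independent random variables $U_1$, $U_2$ and $U_3$ such that $X_1 = \min\{U_1, U_3\}$ and $X_2 = \min\{U_2, U_3\}$, or $X_1 = \max\{U_1, U_3\}$ and $X_2 = \max\{U_2, U_3\}$. The proposed BGo-H family is constructed from three independent Go-H families using a minimisation process. Assume three mutually independent random variables such that $U_i \sim \text{Go-H}(\theta_i, \alpha, \eta)$, i = 1, 2, 3. Define $X_1 = \min\{U_1, U_3\}$ and $X_2 = \min\{U_2, U_3\}$. So, the bivariate vector $(X_1, X_2)$ has the BGo-H family with parameter vector $(\theta_1, \theta_2, \theta_3, \alpha, \eta)$. The joint survival function of $(X_1, X_2)$ is given as follows
$$S_{X_1,X_2}(x_1, x_2) = S(x_1; \theta_1, \alpha, \eta)\, S(x_2; \theta_2, \alpha, \eta)\, S(z; \theta_3, \alpha, \eta), \qquad z = \max\{x_1, x_2\},$$
where $S(\cdot\,; \theta, \alpha, \eta) = 1 - F(\cdot\,; \theta, \alpha, \eta)$ denotes the Go-H survival function.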
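The minimisation construction can be checked by simulation. The following is a minimal sketch, assuming the Go-H survival function $\exp\{(\theta/\alpha)[1 - (1 - H(x))^{-\alpha}]\}$ with a common α and baseline for the three components; the unit exponential baseline, the parameter values and all function names are hypothetical and chosen only for illustration.

```python
import math
import random

def go_h_survival(x, theta, alpha, base_cdf):
    """Assumed Go-H survival function."""
    h = base_cdf(x)
    return math.exp((theta / alpha) * (1.0 - (1.0 - h) ** (-alpha)))

def go_h_sample(rng, theta, alpha, base_quantile):
    """Inverse-CDF draw from the assumed Go-H model."""
    u = rng.random()
    h = 1.0 - (1.0 - (alpha / theta) * math.log(1.0 - u)) ** (-1.0 / alpha)
    return base_quantile(h)

def bgo_h_sample(rng, thetas, alpha, base_quantile):
    """One BGo-H pair: X1 = min(U1, U3), X2 = min(U2, U3)."""
    u1, u2, u3 = (go_h_sample(rng, t, alpha, base_quantile) for t in thetas)
    return min(u1, u3), min(u2, u3)

def bgo_h_survival(x1, x2, thetas, alpha, base_cdf):
    """Joint survival as a product of three Go-H survival functions."""
    t1, t2, t3 = thetas
    return (go_h_survival(x1, t1, alpha, base_cdf)
            * go_h_survival(x2, t2, alpha, base_cdf)
            * go_h_survival(max(x1, x2), t3, alpha, base_cdf))

# Hypothetical setting: unit exponential baseline, illustrative parameters.
base_cdf = lambda x: 1.0 - math.exp(-x)
base_quantile = lambda p: -math.log(1.0 - p)
thetas, alpha = (1.0, 1.5, 0.5), 0.5

rng = random.Random(0)
pairs = [bgo_h_sample(rng, thetas, alpha, base_quantile) for _ in range(20000)]
empirical = sum(1 for a, b in pairs if a > 0.2 and b > 0.3) / len(pairs)
theoretical = bgo_h_survival(0.2, 0.3, thetas, alpha, base_cdf)
print("empirical:", empirical, "theoretical:", theoretical)
```

The empirical proportion of pairs exceeding (0.2, 0.3) should agree with the product formula up to Monte Carlo error.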

BGo-H family using Marshall-Olkin copula properties
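In this section, we find that the BGo-H family has both a singular part along the line $x_1 = x_2$ with weight $\theta_3/(\theta_1 + \theta_2 + \theta_3)$ and an absolutely continuous part on $x_1 \neq x_2$ with weight $(\theta_1 + \theta_2)/(\theta_1 + \theta_2 + \theta_3)$, similar to Marshall and Olkin's bivariate exponential model. Moreover, the BGo-H family can be obtained by using the Marshall-Olkin copula with the marginals as the Go-H families as follows: for $u = S_{X_1}(x_1)$ and $v = S_{X_2}(x_2)$, we get
$$S_{X_1,X_2}(x_1, x_2) = C_{\delta_1,\delta_2}(u, v) = \min\left\{u^{1-\delta_1} v,\; u\, v^{1-\delta_2}\right\}, \tag{16}$$
where $\delta_1 = \theta_3/(\theta_1 + \theta_3)$ and $\delta_2 = \theta_3/(\theta_2 + \theta_3)$. For more details on Marshall-Olkin copula properties, see Nelsen (1999). Also, we find that
$$S_{X_1,X_2}(x_1, x_2) \geq S_{X_1}(x_1)\, S_{X_2}(x_2). \tag{17}$$
So, if $(X_1, X_2)$ follow the BGo-H family, then they are positive quadrant dependent.
Note: For every pair of increasing functions $g_1(\cdot)$ and $g_2(\cdot)$, we get $\mathrm{Cov}(g_1(X_1), g_2(X_2)) \geq 0$ (Barlow & Proschan, 1975).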
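The copula representation and the positive quadrant dependence can be verified numerically. This is a minimal sketch, assuming Go-H components $U_i$ with survival $\exp\{-\theta_i g(x)\}$, where $g(x) = [(1 - H(x))^{-\alpha} - 1]/\alpha$, and the standard Marshall-Olkin copula $C(u, v) = \min\{u^{1-\delta_1} v, u v^{1-\delta_2}\}$; the Weibull-type baseline and the parameter values are hypothetical.

```python
import math

def g(x, alpha, base_cdf):
    """g(x) = [(1 - H(x))^(-alpha) - 1] / alpha, so S(x; theta) = exp(-theta * g(x))."""
    return ((1.0 - base_cdf(x)) ** (-alpha) - 1.0) / alpha

def joint_survival(x1, x2, thetas, alpha, base_cdf):
    """Joint survival of (min(U1, U3), min(U2, U3)) under the assumed model."""
    t1, t2, t3 = thetas
    g1, g2 = g(x1, alpha, base_cdf), g(x2, alpha, base_cdf)
    return math.exp(-t1 * g1 - t2 * g2 - t3 * max(g1, g2))

def marginal_survival(x, theta_i, theta3, alpha, base_cdf):
    """Marginal survival: Go-H with rate parameter theta_i + theta_3."""
    return math.exp(-(theta_i + theta3) * g(x, alpha, base_cdf))

def mo_copula(u, v, d1, d2):
    """Marshall-Olkin copula."""
    return min(u ** (1.0 - d1) * v, u * v ** (1.0 - d2))

# Hypothetical baseline: Weibull-type CDF with shape 2 and unit scale.
base_cdf = lambda x: 1.0 - math.exp(-x ** 2)
thetas, alpha = (1.0, 1.5, 0.5), 0.5
t1, t2, t3 = thetas
d1, d2 = t3 / (t1 + t3), t3 / (t2 + t3)

grid = [0.1, 0.5, 1.0, 1.5]
max_gap, pqd_ok = 0.0, True
for x1 in grid:
    for x2 in grid:
        s = joint_survival(x1, x2, thetas, alpha, base_cdf)
        u = marginal_survival(x1, t1, t3, alpha, base_cdf)
        v = marginal_survival(x2, t2, t3, alpha, base_cdf)
        max_gap = max(max_gap, abs(s - mo_copula(u, v, d1, d2)))
        pqd_ok = pqd_ok and (s >= u * v - 1e-15)
print("largest copula gap:", max_gap, "PQD holds on grid:", pqd_ok)
```

The check is deterministic: on every grid point the joint survival should coincide with the copula applied to the marginal survivals, and it should never fall below their product.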

Coefficient of median correlation
Assume $M_1$ and $M_2$ denote the medians of $X_1$ and $X_2$, respectively. If $X_1 \sim \text{Go-H}(\theta_1 + \theta_3, \alpha, \eta)$ and $X_2 \sim \text{Go-H}(\theta_2 + \theta_3, \alpha, \eta)$, then ... (18) where U has a uniform U(0, 1) distribution, and $Q_H(\cdot)$ represents the baseline quantile function. Domma (2010) presented the median correlation coefficient in the form $4F_{X_1,X_2}(M_1, M_2) - 1$. So, the coefficient of median correlation between $X_1$ and $X_2$ is given as follows: ... (19)

Moments, product moment and covariance
In this section, we derive the rth moment, the nth central moment and the sth incomplete moment of $X_i$ when $X_i \sim \text{Go-H}(\theta_i + \theta_3, \alpha, \eta)$, such that i = 1, 2. Also, we present the product moment and the covariance of the bivariate distribution. The rth moment of $X_i$, say $\mu'_r$, can be expressed as follows: using equation (10), we get ... (20) where the random variable involved has the exp-H CDF with power parameter (l + 1). The moments of the exp-H distributions are given by Nadarajah and Kotz (2006). Setting r = 1 and r = 2 in equation (20), we get the mean and the variance as ... (21) and ... (22), respectively. Furthermore, the nth central moment of $X_i$, say $\mu_n$, is given by ... (23)
On the other hand, the incomplete moments are very important: the main applications of the first incomplete moment are related to the mean deviations and to the Bonferroni and Lorenz curves. These curves are very useful in demography, economics, medicine, insurance and reliability. The sth incomplete moment of $X_i$, say $\varphi_s(t)$, can be expressed as follows: ... (24) So, the mean deviations about the mean and the median are given by $\delta_1 = 2\mu'_1 F(\mu'_1) - 2\varphi_1(\mu'_1)$ and $\delta_2 = \mu'_1 - 2\varphi_1(M_i)$, respectively. Moreover, the product moment $E(X_1 X_2)$ can be represented as ... (25) So, by using equations (20) and (25) in the relation $\mathrm{Cov}(X_1, X_2) = E(X_1 X_2) - E(X_1)E(X_2)$, the covariance of $X_1$ and $X_2$ is obtained.
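When the series expressions are inconvenient, the marginal moments can be approximated numerically from the quantile function through $E(X^r) = \int_0^1 Q(u)^r\,du$. The following is a minimal sketch, assuming the Go-H quantile obtained by inverting $F(x) = 1 - \exp\{(\theta/\alpha)[1 - (1 - H(x))^{-\alpha}]\}$; the unit exponential baseline, the parameter values and the function names are hypothetical. It is a numerical stand-in, not the series expansion of equation (20).

```python
import math

def go_h_quantile(u, theta, alpha, base_quantile):
    """Quantile of the assumed Go-H model."""
    h = 1.0 - (1.0 - (alpha / theta) * math.log(1.0 - u)) ** (-1.0 / alpha)
    return base_quantile(h)

def go_h_survival(x, theta, alpha, base_cdf):
    """Survival function of the assumed Go-H model."""
    return math.exp((theta / alpha) * (1.0 - (1.0 - base_cdf(x)) ** (-alpha)))

def raw_moment(r, theta, alpha, base_quantile, nodes=100_000):
    """Midpoint-rule approximation of E[X^r] = integral of Q(u)^r over (0, 1)."""
    total = 0.0
    for k in range(nodes):
        u = (k + 0.5) / nodes
        total += go_h_quantile(u, theta, alpha, base_quantile) ** r
    return total / nodes

# Hypothetical setting: X1 ~ Go-H(theta1 + theta3, alpha) with exponential baseline.
base_cdf = lambda x: 1.0 - math.exp(-x)
base_quantile = lambda p: -math.log(1.0 - p)
theta, alpha = 1.5 + 0.5, 0.5

mean = raw_moment(1, theta, alpha, base_quantile)
variance = raw_moment(2, theta, alpha, base_quantile) - mean ** 2
median = go_h_quantile(0.5, theta, alpha, base_quantile)

# Cross-check of the mean via E[X] = integral of the survival function
# (trapezoidal rule on [0, 10]; the survival is negligible beyond that here).
step, upper = 0.001, 10.0
n_steps = int(upper / step)
mean_check = step * (0.5 * go_h_survival(0.0, theta, alpha, base_cdf)
                     + sum(go_h_survival(k * step, theta, alpha, base_cdf)
                           for k in range(1, n_steps))
                     + 0.5 * go_h_survival(upper, theta, alpha, base_cdf))
print("mean:", mean, "check:", mean_check, "variance:", variance, "median:", median)
```

The two independent approximations of the mean should agree to several decimal places, which gives some confidence in the quantile-based moments before using them for the variance or the mean deviations.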

Stress-strength reliability function
There are appliances which survive due to their strength. These appliances receive a certain level of stress (load). The load may be defined as temperature, environment, mechanical load, electric current, etc. However, if a higher level of load is applied, then their strength is unable to sustain it and they break down. Let $X_1$ be a random variable representing the stress, and $X_2$ be a random variable representing the strength; then the reliability function is given as follows:
$$R = P(X_1 < X_2) = \frac{\theta_1}{\theta_1 + \theta_2 + \theta_3}. \tag{27}$$
Similarly, if $X_2$ is a random variable representing the stress, and $X_1$ is a random variable representing the strength, then the reliability function is
$$R = P(X_2 < X_1) = \frac{\theta_2}{\theta_1 + \theta_2 + \theta_3}. \tag{28}$$
It is clear that the stress-strength model does not depend on the baseline function $H(\cdot)$.
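The baseline-free form of the reliability can be illustrated by Monte Carlo. This is a minimal sketch, assuming Go-H components with a common α and the closed form $P(X_1 < X_2) = \theta_1/(\theta_1 + \theta_2 + \theta_3)$. Because every baseline quantile function is increasing, the comparison $X_1 < X_2$ can be made on the baseline-probability scale $H(\cdot)$, so no baseline has to be chosen; the function names and parameter values are hypothetical.

```python
import math
import random

def go_h_baseline_prob(rng, theta, alpha):
    """Draw H(U) for U from the assumed Go-H model (value on the baseline-CDF scale)."""
    u = rng.random()
    return 1.0 - (1.0 - (alpha / theta) * math.log(1.0 - u)) ** (-1.0 / alpha)

def stress_strength_mc(thetas, alpha, n, seed=0):
    """Monte Carlo estimate of P(X1 < X2) with X1 = min(U1, U3), X2 = min(U2, U3)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        h1, h2, h3 = (go_h_baseline_prob(rng, t, alpha) for t in thetas)
        # Ties (X1 == X2) occur when the common shock U3 is the smallest component.
        if min(h1, h3) < min(h2, h3):
            hits += 1
    return hits / n

thetas, alpha = (1.0, 1.5, 0.5), 0.5  # illustrative values
estimate = stress_strength_mc(thetas, alpha, n=50000)
closed_form = thetas[0] / sum(thetas)
print("Monte Carlo:", estimate, "closed form:", closed_form)
```

Changing α leaves the closed form unchanged, which can be checked by rerunning the sketch with a different value.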

Joint hazard rate function and its marginal functions
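Let $(X_1, X_2)$ be a two-dimensional random variable with joint PDF $f_{X_1,X_2}(x_1, x_2)$ and joint reliability function $S_{X_1,X_2}(x_1, x_2)$. Basu (1971) defined the bivariate hazard rate (BHR) function as $h_{X_1,X_2}(x_1, x_2) = f_{X_1,X_2}(x_1, x_2)/S_{X_1,X_2}(x_1, x_2)$.
Figure 1 shows the joint PDF, BHR function and the BRHR function of the BGoLLD for selected parameter values.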

Bivariate Gompertz-Frechet distribution (BGoFD)
Let $H(x; \eta)$, for $x > 0$, be the CDF of the Frechet distribution; then the joint survival function of the BGoFD is given by ... (34) Figure 2 shows the joint PDF, BHR function and the BRHR function of the BGoFD for selected parameter values.

Bivariate Gompertz-Weibull distribution (BGoWD)
Let $H(x; \eta)$, for $x > 0$, be the CDF of the Weibull distribution; then the joint survival function of the BGoWD is given by ... (35) Figure 3 shows the joint PDF, BHR function and the BRHR function of the BGoWD for selected parameter values.
From Figures 1, 2 and 3, we note that the BGo-H family presents different shapes of the joint PDF, BHR function and the BRHR function for different baseline distributions.

Maximum likelihood estimation (MLE)
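In this section, we estimate the unknown parameters of the BGo-H family using the maximum likelihood method. Suppose that $\{(x_{11}, x_{21}), (x_{12}, x_{22}), \dots, (x_{1n}, x_{2n})\}$ is a sample of size $n$ from the BGo-H family. We use the following notation: $I_1 = \{i : x_{1i} < x_{2i}\}$, $I_2 = \{i : x_{1i} > x_{2i}\}$, $I_3 = \{i : x_{1i} = x_{2i} = x_i\}$, and $n_k = |I_k|$ for k = 1, 2, 3. Based on the observations, the likelihood function of this sample is
$$l(\Theta) = \prod_{i \in I_1} f_1(x_{1i}, x_{2i}) \prod_{i \in I_2} f_2(x_{1i}, x_{2i}) \prod_{i \in I_3} f_3(x_i), \tag{36}$$
where $f_1$, $f_2$ and $f_3$ are the branches of the joint PDF in equation (6) on $x_1 < x_2$, $x_1 > x_2$ and $x_1 = x_2$, respectively.
Substituting equation (6) into equation (36), the log-likelihood function can be written as ... (37) The first partial derivatives of equation (37) with respect to the parameters are given in equations (38)-(42), where each derivative of a function A(.) is taken with respect to the indicated parameter. By equating equations (38)-(42) to zero, we get the non-linear normal equations. So, the solution has to be obtained numerically.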

Simulation results
In this section, the MLE method is used to estimate the parameters of the BGoLLD. The population parameters are generated using a software package. The sampling distributions are obtained for different sample sizes n = 50, 250, 600 and 1000 from N = 1000 replications. This study presents an assessment of the properties of the MLE for the parameters in terms of variance (Var) and mean square error (MSE). The following algorithm shows how to generate data from the BGoLLD:
1. Generate three independent random variates from the corresponding Go-H component distributions.
2. Obtain $X_1$ and $X_2$ as the minima defined in the construction of the family.

From Table 1, we note that the Var and the MSE are reduced as the sample size is increased. These results indicate that the BGoLLD works well under the situation where no censoring occurs, and the MLE is a good method to estimate the model parameters.
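The structure of such a simulation study can be sketched as follows. This is a minimal sketch under strong simplifications, not the authors' procedure: it assumes Go-H components with survival $\exp\{-\theta_i g(x)\}$, $g(x) = [(1 - H(x))^{-\alpha} - 1]/\alpha$, treats α and the baseline as known, and replaces the full MLE by a stand-in estimator of $\theta_1 + \theta_2 + \theta_3$ based on $\min(X_1, X_2)$, for which $g(\min(X_1, X_2))$ is exponential with that rate. The exponential baseline, the number of replications (100 rather than 1000, to keep the run short) and all names are hypothetical.

```python
import math
import random
import statistics

def go_h_sample(rng, theta, alpha, base_quantile):
    """Inverse-CDF draw from the assumed Go-H model."""
    u = rng.random()
    h = 1.0 - (1.0 - (alpha / theta) * math.log(1.0 - u)) ** (-1.0 / alpha)
    return base_quantile(h)

def bgo_h_sample(rng, thetas, alpha, base_quantile):
    """One pair (X1, X2) via the minimisation construction."""
    u1, u2, u3 = (go_h_sample(rng, t, alpha, base_quantile) for t in thetas)
    return min(u1, u3), min(u2, u3)

def total_rate_estimate(pairs, alpha, base_cdf):
    """Stand-in estimator of theta1 + theta2 + theta3 (alpha and baseline known)."""
    total = 0.0
    for x1, x2 in pairs:
        m = min(x1, x2)
        total += ((1.0 - base_cdf(m)) ** (-alpha) - 1.0) / alpha
    return len(pairs) / total

def simulate(thetas, alpha, sizes, reps, base_cdf, base_quantile, seed=1):
    """Return {n: (Var, MSE)} of the stand-in estimator over `reps` replications."""
    rng = random.Random(seed)
    target = sum(thetas)
    results = {}
    for n in sizes:
        estimates = []
        for _ in range(reps):
            pairs = [bgo_h_sample(rng, thetas, alpha, base_quantile) for _ in range(n)]
            estimates.append(total_rate_estimate(pairs, alpha, base_cdf))
        var = statistics.pvariance(estimates)
        mse = sum((e - target) ** 2 for e in estimates) / reps
        results[n] = (var, mse)
    return results

base_cdf = lambda x: 1.0 - math.exp(-x)
base_quantile = lambda p: -math.log(1.0 - p)
results = simulate((1.0, 1.5, 0.5), 0.5, sizes=(50, 250, 1000), reps=100,
                   base_cdf=base_cdf, base_quantile=base_quantile)
for n, (var, mse) in results.items():
    print(n, var, mse)
```

Even with this simplified estimator, the Var and MSE columns shrink as n grows, which is the qualitative pattern the study reports for the full MLE.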

RESULTS AND DISCUSSION: REAL DATA ANALYSIS
This data represents football (soccer) data of the UEFA Champion's League (Meintanis, 2007). To compare the competitive models, we use the log-likelihood values (L), Bayesian information criterion (BIC), Akaike information criterion (AIC), corrected Akaike information criterion (CAIC) and Hannan-Quinn information criterion (HQIC). Figure 4 shows the data representation. Table 2 reports -L, the Kolmogorov-Smirnov (K-S) distance and p values for $X_1$ and $X_2$. Based on the p values, it is clear that the GoLLD fits the data for the marginals. Figures 5 and 6 show the estimated CDF and PP plots for the real data, which support our results in Table 2. Now, we fit the BGoLLD to this data. Tables 3 and 4 list the MLEs, L, AIC, CAIC, HQIC and BIC values for the competitive models based on the football data.
From Table 4, it is clear that the BGoLLD provides a better fit than the other competitive models, because it has the smallest values of -L, AIC, CAIC, HQIC and BIC.
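The model-selection criteria used in the comparison can be computed from the maximised log-likelihood L, the number of estimated parameters k and the sample size n. The following is a minimal sketch using the standard definitions (CAIC is taken here as the small-sample corrected AIC); the function name and the numerical inputs are hypothetical and are not values from Tables 3 and 4.

```python
import math

def information_criteria(loglik, k, n):
    """Standard criteria from log-likelihood `loglik`, `k` parameters, sample size `n`."""
    aic = 2.0 * k - 2.0 * loglik
    caic = aic + 2.0 * k * (k + 1) / (n - k - 1)  # corrected AIC
    bic = k * math.log(n) - 2.0 * loglik
    hqic = 2.0 * k * math.log(math.log(n)) - 2.0 * loglik
    return {"AIC": aic, "CAIC": caic, "BIC": bic, "HQIC": hqic}

# Hypothetical inputs for illustration only.
criteria = information_criteria(loglik=-100.0, k=5, n=37)
for name, value in criteria.items():
    print(name, round(value, 3))
```

For all four criteria, smaller values indicate a better trade-off between fit and model complexity, which is the sense in which the tables are read.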

CONCLUSIONS
In this paper, we have presented a new flexible bivariate generator of distributions, the so-called bivariate Gompertz-H (BGo-H) family, whose marginal distributions are Gompertz-H families. The joint CDF and joint PDF of the BGo-H family have simple forms; therefore, this new model can be easily used in practice for modelling bivariate data restricted to a given interval. Some statistical and mathematical properties of the new family have been studied. The simulation results have indicated that the MLE works quite satisfactorily and that it can be used to estimate the model parameters. Also, we have analysed a real dataset and shown through goodness-of-fit tests that the proposed family can be used for modelling the data considered herein.
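A multivariate extension of the Gompertz-H family is presented as a concluding remark. Assume $U_1, U_2, \dots, U_{n+1}$ are independent random variables with $U_i \sim \text{Go-H}(\theta_i, \alpha, \eta)$, such that $i = 1, 2, \dots, n+1$. Define $X_j = \min\{U_j, U_{n+1}\}$, $j = 1, 2, \dots, n$. Hence, the joint survival function of $(X_1, X_2, \dots, X_n)$ is given by
$$S_{X_1,\dots,X_n}(x_1, \dots, x_n) = S(z; \theta_{n+1}, \alpha, \eta) \prod_{j=1}^{n} S(x_j; \theta_j, \alpha, \eta)$$
for all $x_1, \dots, x_n$ in the support of the baseline distribution, where $z = \max\{x_1, \dots, x_n\}$ and $S(\cdot\,; \theta, \alpha, \eta)$ denotes the Go-H survival function. Clearly, the BGo-H family arises from this multivariate Gompertz-H family by taking n = 2. In the future, we will discuss in detail the multivariate extension of the Gompertz-H family, because it has many applications in lifetime analysis, environmental science, economics, engineering and medical sciences.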