YOU WILL BENEFIT FROM USING THE RMD FILES IN THE GITHUB DIRECTORY TO FURTHER EXPLAIN OR UNDERSTAND THE R CODE BEHIND THE RESULTS SHOWN IN THE VIGNETTES.

Fitting an Alternate Binomial or Binomial Mixture distribution is the most crucial part of handling Binomial Outcome Data (BOD). Finding the distribution that most closely resembles the given data is essential. Several measures can be used to compare distributions after they have been fitted, namely

  • Comparing actual frequency and estimated frequency.
  • Using Chi-squared Test statistic and comparing p-values.
  • Comparing actual variance with estimated variances of distributions.
  • Comparing Negative Log Likelihood values of distributions.
  • Comparing AIC values of distributions.
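
The chi-squared comparison in the list above can be reproduced by hand from any pair of observed and expected frequency vectors. The sketch below uses only base R; `chisq_gof` is a hypothetical helper name, and the degrees-of-freedom rule (classes minus one minus the number of estimated parameters) is an assumption that happens to agree with the `df` values printed later in this vignette. The expected frequencies are the rounded Beta-Correlated Binomial values from the output further down, so the statistic only matches the printed one up to rounding.

```r
# Chi-squared goodness-of-fit from observed and expected frequencies.
# n_par: number of parameters estimated from the data (assumed df rule).
chisq_gof <- function(obs, expected, n_par) {
  stat <- sum((obs - expected)^2 / expected)
  df <- length(obs) - 1 - n_par
  p_value <- pchisq(stat, df = df, lower.tail = FALSE)
  list(statistic = stat, df = df, p.value = p_value)
}

# Alcohol data week 2 and the rounded Beta-Correlated Binomial
# expected frequencies copied from the printed output in this vignette.
obs <- c(42, 47, 54, 40, 49, 40, 43, 84)
exp_bcb <- c(42.37, 45.97, 47.05, 46.75, 44.47, 41.68, 45.46, 85.24)

# Three estimated parameters: cov, a and b.
res <- chisq_gof(obs, exp_bcb, n_par = 3)
print(res)
```

A small statistic with a large p-value indicates that the expected frequencies track the observed ones closely.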

The functions given below fit the respective distributions when the BOD and the estimated parameters are given.

  • fitBin - fitting the Binomial distribution.
  • fitTriBin - fitting the Triangular Binomial distribution.
  • fitBetaBin - fitting the Beta-Binomial distribution.
  • fitKumBin - fitting the Kumaraswamy Binomial distribution.
  • fitGHGBB - fitting the Gaussian Hyper-geometric Generalized Beta-Binomial distribution.
  • fitMcGBB - fitting the McDonald Generalized Beta-Binomial distribution.
  • fitGammaBin - fitting the Gamma Binomial distribution.
  • fitGrassiaIIBin - fitting the Grassia II Binomial distribution.
  • fitAddBin - fitting the Additive Binomial distribution.
  • fitBetaCorrBin - fitting the Beta Correlated Binomial distribution.
  • fitCOMPBin - fitting the COM Poisson Binomial distribution.
  • fitCorrBin - fitting the Correlated Binomial distribution.
  • fitMultiBin - fitting the Multiplicative Binomial distribution.
  • fitLMBin - fitting the Lovinson Multiplicative Binomial distribution.

Fitting Alternate Binomial Distributions

All six Alternate Binomial distributions will be fitted to the Alcohol data week 2, and their expected frequencies will be plotted against the actual frequency values. This plot can be used to identify which distribution best suits the Alcohol data week 2.

Alcohol_data
##   Days week1 week2
## 1    0    47    42
## 2    1    54    47
## 3    2    43    54
## 4    3    40    40
## 5    4    40    49
## 6    5    41    40
## 7    6    39    43
## 8    7    95    84
BinRanVar <- Alcohol_data$Days
ActFreq <- Alcohol_data$week2

# Fitting Binomial Distribution
BinFreq <- fitBin(BinRanVar,ActFreq)
## Chi-squared approximation may be doubtful because expected frequency is less than 5
# printing the results of fitting Binomial distribution
print(BinFreq)
## Call: 
## fitBin(x = BinRanVar, obs.freq = ActFreq)
## 
## Chi-squared test for Binomial Distribution 
##  
##       Observed Frequency :  42 47 54 40 49 40 43 84 
##  
##       expected Frequency :  1.66 13.79 49.19 97.48 115.89 82.67 32.76 5.56 
##  
##       estimated probability value : 0.5431436 
##  
##       X-squared : 2265.111   ,df : 6   ,p-value : 0
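
The Binomial fit above can be cross-checked with base R alone. The sketch below assumes the estimated probability is the usual closed-form maximum likelihood estimate, which reproduces the printed value 0.5431436. It also flags the classes with an expected frequency below 5 (the source of the warning) and computes three of the comparison measures listed earlier. The negative log likelihood here includes the binomial coefficients, so it may differ by a constant from values reported by the package; that scale is an assumption.

```r
x <- 0:7                                    # days of drinking, 0 to 7
obs <- c(42, 47, 54, 40, 49, 40, 43, 84)    # Alcohol data week 2
n <- max(x)                                 # binomial trials per person
N <- sum(obs)                               # number of respondents

# Closed-form MLE of p and the implied expected frequencies
p_hat <- sum(x * obs) / (n * N)
expected <- N * dbinom(x, size = n, prob = p_hat)

# Classes whose expected frequency is below 5
small <- x[expected < 5]

# Negative log likelihood and AIC with one estimated parameter
negLL <- -sum(obs * dbinom(x, size = n, prob = p_hat, log = TRUE))
aic <- 2 * 1 + 2 * negLL

# Actual (sample) variance against the variance implied by the Binomial fit
data_mean <- sum(x * obs) / N
data_var <- sum(obs * (x - data_mean)^2) / (N - 1)
binom_var <- n * p_hat * (1 - p_hat)

print(round(expected, 2))
print(c(p_hat = p_hat, negLL = negLL, AIC = aic))
print(c(data_var = data_var, binom_var = binom_var))
```

The actual variance is several times larger than the Binomial variance, which is the over dispersion that the alternative distributions below are meant to capture.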

Additive Binomial Distribution

# Estimating and fitting Additive Binomial distribution
Para_AddBin <- EstMLEAddBin(BinRanVar,ActFreq)

# printing the coefficients and using them
coef(Para_AddBin)
##       p       alpha  
##  0.5466 0.216612
AddBin_p <- Para_AddBin$p
AddBin_alpha <- Para_AddBin$alpha

# Fitting Additive Binomial Distribution
AddBinFreq <- fitAddBin(BinRanVar, ActFreq, AddBin_p, AddBin_alpha)

# printing the results of fitting Additive Binomial Distribution
print(AddBinFreq)
## Call: 
## fitAddBin(x = BinRanVar, obs.freq = ActFreq, p = AddBin_p, alpha = AddBin_alpha)
## 
## Chi-squared test for Additive Binomial Distribution 
##  
##       Observed Frequency :  42 47 54 40 49 40 43 84 
##  
##       expected Frequency :  10.19 47.98 77.94 48.82 30.46 74.95 80.9 27.76 
##  
##       estimated p value : 0.5466  ,estimated alpha parameter : 0.216612 
##  
##       X-squared : 267.5441   ,df : 5   ,p-value : 0

Beta-Correlated Binomial Distribution

# Estimating and fitting Beta Correlated Binomial Distribution
Para_BetaCorrBin <- EstMLEBetaCorrBin(x=BinRanVar, freq=ActFreq,
                                    cov=0.001,a=10,b=10)

# printing the coefficients and using them
coef(Para_BetaCorrBin)
##        cov          a          b 
## 0.06116679 3.30332560 2.70293758
BetaCorrBin_cov <- coef(Para_BetaCorrBin)[1]
BetaCorrBin_a <- coef(Para_BetaCorrBin)[2]
BetaCorrBin_b <- coef(Para_BetaCorrBin)[3]

# Fitting Beta-Correlated Binomial Distribution
BetaCorrBinFreq <- fitBetaCorrBin(BinRanVar,ActFreq,BetaCorrBin_cov,
                                BetaCorrBin_a,BetaCorrBin_b)

# printing the results of fitting Beta-Correlated Binomial Distribution
print(BetaCorrBinFreq)
## Call: 
## fitBetaCorrBin(x = BinRanVar, obs.freq = ActFreq, cov = BetaCorrBin_cov, 
##     a = BetaCorrBin_a, b = BetaCorrBin_b)
## 
## Chi-squared test for Beta-Correlated Binomial Distribution 
##  
##       Observed Frequency :  42 47 54 40 49 40 43 84 
##  
##       expected Frequency :  42.37 45.97 47.05 46.75 44.47 41.68 45.46 85.24 
##  
##       estimated covariance value: 0.06116679 
##  
##       estimated a parameter : 3.303326  , estimated b parameter : 2.702938 
##  
##       X-squared : 2.7079   ,df : 4   ,p-value : 0.6078

COM-Poisson Binomial Distribution

# Estimating and fitting COM Poisson Binomial Distribution
Para_COMPBin <- EstMLECOMPBin(x=BinRanVar, freq=ActFreq,
                            v=12.1,p=0.9)

# printing the coefficients and using them
coef(Para_COMPBin)
##           p           v 
##  0.51295510 -0.09267367
COMPBin_p <- coef(Para_COMPBin)[1]
COMPBin_v <- coef(Para_COMPBin)[2]

# Fitting COM-Poisson Binomial Distribution
COMPBinFreq <- fitCOMPBin(BinRanVar, ActFreq,COMPBin_p,COMPBin_v)

# printing the results of fitting COM-Poisson Binomial Distribution
print(COMPBinFreq)
## Call: 
## fitCOMPBin(x = BinRanVar, obs.freq = ActFreq, p = COMPBin_p, 
##     v = COMPBin_v)
## 
## Chi-squared test for COM Poisson Binomial Distribution 
##  
##       Observed Frequency :  42 47 54 40 49 40 43 84 
##  
##       expected Frequency :  49.9 43.89 41.75 41.93 44.16 48.77 56.87 71.73 
##  
##       estimated p value : 0.5129551  ,estimated v parameter : -0.09267367 
##  
##       X-squared : 12.7434   ,df : 5   ,p-value : 0.0259

Correlated Binomial Distribution

# Estimating and fitting Correlated Binomial Distribution
Para_CorrBin <- EstMLECorrBin(x=BinRanVar, freq=ActFreq,
                            cov=0.0021,p=0.19)

# printing the coefficients and using them
coef(Para_CorrBin)
##         p       cov 
## 0.5466030 0.0536791
CorrBin_p <- coef(Para_CorrBin)[1]
CorrBin_cov <- coef(Para_CorrBin)[2]

# Fitting Correlated Binomial Distribution
CorrBinFreq <- fitCorrBin(BinRanVar, ActFreq,CorrBin_p,CorrBin_cov)

# printing the results of fitting Correlated Binomial Distribution
print(CorrBinFreq)
## Call: 
## fitCorrBin(x = BinRanVar, obs.freq = ActFreq, p = CorrBin_p, 
##     cov = CorrBin_cov)
## 
## Chi-squared test for Correlated Binomial Distribution 
##  
##       Observed Frequency :  42 47 54 40 49 40 43 84 
##  
##       expected Frequency :  10.19 47.97 77.94 48.82 30.46 74.95 80.9 27.76 
##  
##       estimated p value : 0.546603  ,estimated cov value : 0.0536791 
##  
##       X-squared : 267.5437   ,df : 5   ,p-value : 0

Multiplicative Binomial Distribution

# Estimating and fitting Multiplicative Binomial Distribution
Para_MultiBin <- EstMLEMultiBin(x=BinRanVar, freq=ActFreq,
                              theta=21,p=0.19)

# printing the coefficients and using them
coef(Para_MultiBin)
##         p     theta 
## 0.5129813 0.7220349
MultiBin_p <- coef(Para_MultiBin)[1]
MultiBin_theta <- coef(Para_MultiBin)[2]

# Fitting Multiplicative Binomial Distribution
MultiBinFreq <- fitMultiBin(BinRanVar, ActFreq,MultiBin_p,MultiBin_theta)

# printing the results of fitting Multiplicative Binomial Distribution
print(MultiBinFreq)
## Call: 
## fitMultiBin(x = BinRanVar, obs.freq = ActFreq, p = MultiBin_p, 
##     theta = MultiBin_theta)
## 
## Chi-squared test for Multiplicative Binomial Distribution 
##  
##       Observed Frequency :  42 47 54 40 49 40 43 84 
##  
##       expected Frequency :  47.11 49.22 42.27 38.69 40.75 49.4 63.81 67.76 
##  
##       estimated p value : 0.5129813  ,estimated theta parameter : 0.7220349 
##  
##       X-squared : 18.0917   ,df : 5   ,p-value : 0.0028

Lovinson Multiplicative Binomial Distribution

# Estimating and fitting Lovinson Multiplicative Binomial Distribution
Para_LMBin <- EstMLELMBin(x=BinRanVar, freq=ActFreq,
                        phi=21,p=0.19)

# printing the coefficients and using them
coef(Para_LMBin)
##         p       phi 
## 0.5129813 0.7220349
LMBin_p <- coef(Para_LMBin)[1]
LMBin_phi <- coef(Para_LMBin)[2]

# Fitting Lovinson Multiplicative Binomial Distribution
LMBinFreq <- fitLMBin(BinRanVar, ActFreq,LMBin_p,LMBin_phi)

# printing the results of fitting Lovinson Multiplicative Binomial Distribution
print(LMBinFreq)
## Call: 
## fitLMBin(x = BinRanVar, obs.freq = ActFreq, p = LMBin_p, phi = LMBin_phi)
## 
## Chi-squared test for Lovinson Multiplicative Binomial Distribution 
##  
##       Observed Frequency :  42 47 54 40 49 40 43 84 
##  
##       expected Frequency :  47.11 49.22 42.27 38.69 40.75 49.4 63.81 67.76 
##  
##       estimated p value : 0.5129813  ,estimated phi parameter : 0.7220349 
##  
##       X-squared : 18.0917   ,df : 5   ,p-value : 0.0028

Conclusion

The Additive and Correlated Binomial distributions behave very similarly and generate practically the same expected frequencies, but those frequencies are far from the actual ones (X-squared of about 267.54, p-value 0). The expected frequencies of the Binomial distribution differ even more from the actual frequencies (X-squared 2265.111), so the Binomial distribution does not suit the Alcohol data of week 2. The Multiplicative and COM Poisson Binomial distributions give values closer to the actual frequencies. The Multiplicative and Lovinson Multiplicative distributions behave as a pair, producing the same expected frequencies and test results.

Finally, the only distribution left is the Beta Correlated Binomial distribution, which is closest to the actual frequencies (X-squared 2.7079, p-value 0.6078). Therefore the most suitable distribution for the Alcohol data week 2 is the Beta Correlated Binomial distribution; the second choice is the Multiplicative or COM Poisson Binomial distribution, and the final choice is the Correlated or Additive Binomial distribution.
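
The ranking in this conclusion can be laid out as a small table. The sketch below simply copies the X-squared, df and p-value figures from the printed outputs above into a data frame and sorts them. The short model labels are illustrative names, and sorting by p-value first and X-squared second is one reasonable ordering, not a rule taken from the package.

```r
# Test results copied from the printed outputs in this section
abd <- data.frame(
  model = c("Binomial", "AddBin", "BetaCorrBin", "COMPBin",
            "CorrBin", "MultiBin", "LMBin"),
  x_squared = c(2265.111, 267.5441, 2.7079, 12.7434,
                267.5437, 18.0917, 18.0917),
  df = c(6, 5, 4, 5, 5, 5, 5),
  p_value = c(0, 0, 0.6078, 0.0259, 0, 0.0028, 0.0028)
)

# Largest p-value first; ties broken by the smaller X-squared
ranked <- abd[order(-abd$p_value, abd$x_squared), ]
print(ranked, row.names = FALSE)
```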

Fitting Binomial Mixture Distributions

Of the eight Binomial Mixture Distributions (BMD), all except the Uniform Binomial distribution can be fitted to the Alcohol data week 2. As above, a plot was generated to compare the estimated frequencies with the actual frequencies.

Triangular Binomial Distribution

# Estimating and fitting Triangular Binomial distribution
Para_TriBin <- EstMLETriBin(BinRanVar,ActFreq)

# printing the coefficients and using them
coef(Para_TriBin)
##  mode 
##  0.944444
TriBin_c <- Para_TriBin$mode

# Fitting Triangular Binomial Distribution
TriBinFreq <- fitTriBin(BinRanVar, ActFreq, TriBin_c)

# printing the results of fitting Triangular Binomial Distribution
print(TriBinFreq)
## Call: 
## fitTriBin(x = BinRanVar, obs.freq = ActFreq, mode = TriBin_c)
## 
## Chi-squared test for Triangular Binomial Distribution 
##  
##       Observed Frequency :  42 47 54 40 49 40 43 84 
##  
##       expected Frequency :  11.74 23.47 35.21 46.94 58.66 70.2 79.57 73.21 
##  
##       estimated Mode value: 0.944444 
##  
##       X-squared : 145.6196   ,df : 6   ,p-value : 0 
##  
##       over dispersion : 0.2308269

Beta-Binomial Distribution

# Estimating and fitting Beta-Binomial Distribution
Para_BetaBin <- EstMLEBetaBin(x=BinRanVar, freq=ActFreq,
                            a=10,b=10)
# printing the coefficients and using them
coef(Para_BetaBin)
##         a         b 
## 0.8575339 0.7007620
BetaBin_a <- coef(Para_BetaBin)[1]
BetaBin_b <- coef(Para_BetaBin)[2]

# Fitting Beta-Binomial Distribution
BetaBinFreq <- fitBetaBin(BinRanVar, ActFreq,BetaBin_a,BetaBin_b)

# printing the results of fitting Beta-Binomial Distribution
print(BetaBinFreq)
## Call: 
## fitBetaBin(x = BinRanVar, obs.freq = ActFreq, a = BetaBin_a, 
##     b = BetaBin_b)
## 
## Chi-squared test for Beta-Binomial Distribution 
##  
##           Observed Frequency :  42 47 54 40 49 40 43 84 
##  
##           expected Frequency :  47.91 42.92 41.95 42.5 44.3 47.81 54.89 76.73 
##  
##           estimated a parameter : 0.8575339   ,estimated b parameter : 0.700762 
##  
##           X-squared : 9.7641   ,df : 5   ,p-value : 0.0822 
##  
##           over dispersion : 0.3908852
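
The Beta-Binomial figures above can be reproduced from the estimated parameters with base R. This is a minimal sketch that assumes the standard Beta-Binomial probability mass function and assumes that the reported over dispersion is the intra-class correlation 1/(a + b + 1). Both assumptions agree numerically with the printed output, but neither is taken from the package source.

```r
x <- 0:7
obs <- c(42, 47, 54, 40, 49, 40, 43, 84)
n <- max(x)
N <- sum(obs)

# Estimated shape parameters copied from coef(Para_BetaBin)
a <- 0.8575339
b <- 0.7007620

# Standard Beta-Binomial pmf: choose(n, x) * B(x + a, n - x + b) / B(a, b)
bb_pmf <- choose(n, x) * beta(x + a, n - x + b) / beta(a, b)
expected <- N * bb_pmf

# Assumed definition of over dispersion (intra-class correlation)
rho <- 1 / (a + b + 1)

print(round(expected, 2))
print(rho)
```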

Kumaraswamy Binomial Distribution

# Estimating and fitting Kumaraswamy Binomial Distribution
Para_KumBin <- EstMLEKumBin(x=BinRanVar, freq=ActFreq,
                          a=12.1,b=0.9,it=10000)

# printing the coefficients and using them
coef(Para_KumBin)
##            a            b           it 
## 8.641598e-01 7.173736e-01 1.000001e+04
KumBin_a <- coef(Para_KumBin)[1]
KumBin_b <- coef(Para_KumBin)[2]
KumBin_it <- coef(Para_KumBin)[3]

# Fitting Kumaraswamy Binomial Distribution
KumBinFreq <- fitKumBin(BinRanVar, ActFreq,KumBin_a,KumBin_b,KumBin_it*10)

# printing the results of fitting Kumaraswamy Binomial Distribution
print(KumBinFreq)
## Call: 
## fitKumBin(x = BinRanVar, obs.freq = ActFreq, a = KumBin_a, b = KumBin_b, 
##     it = KumBin_it * 10)
## 
## Chi-squared test for Kumaraswamy Binomial Distribution 
##  
##       Observed Frequency :  42 47 54 40 49 40 43 84 
##  
##       expected Frequency :  47.34 42.94 42.23 42.92 44.78 48.25 55.07 75.44 
##  
##       estimated a parameter : 0.8641598   ,estimated b parameter : 0.7173736 ,
##  
##       estimated it value : 100000.1 
##  
##       X-squared : 9.8904   ,df : 5   ,p-value : 0.0784 
##  
##       over dispersion : 0.3864332

GHGBB Distribution

# Estimating and fitting GHGBB Distribution
Para_GHGBB <- EstMLEGHGBB(x=BinRanVar, freq=ActFreq,
                        a=0.0021,b=0.19,c=0.3)

# printing the coefficients and using them
coef(Para_GHGBB)
##         a         b         c 
## 1.6310680 0.3913700 0.6782968
GHGBB_a <- coef(Para_GHGBB)[1]
GHGBB_b <- coef(Para_GHGBB)[2]
GHGBB_c <- coef(Para_GHGBB)[3]

# Fitting GHGBB Distribution
GHGBBFreq <- fitGHGBB(BinRanVar, ActFreq,GHGBB_a,GHGBB_b,GHGBB_c)

# printing the results of fitting GHGBB Distribution
print(GHGBBFreq)
## Call: 
## fitGHGBB(x = BinRanVar, obs.freq = ActFreq, a = GHGBB_a, b = GHGBB_b, 
##     c = GHGBB_c)
## 
## Chi-squared test for Gaussian Hypergeometric Generalized Beta-Binomial Distribution 
##  
##       Observed Frequency :  42 47 54 40 49 40 43 84 
##  
##       expected Frequency :  41.18 49.9 49.55 46.32 42.91 41.12 44.31 83.71 
##  
##       estimated a parameter : 1.631068   ,estimated b parameter : 0.39137 ,
##  
##       estimated c parameter : 0.6782968 
##  
##       X-squared : 2.3814   ,df : 4   ,p-value : 0.666 
##  
##       over dispersion : 0.3885284

McGBB Distribution

# Estimating and fitting McGBB Distribution
Para_McGBB <- EstMLEMcGBB(x=BinRanVar, freq=ActFreq,
                        a=21,b=0.19,c=0.1)

# printing the coefficients and using them
coef(Para_McGBB)
##           a           b           c 
## 36.38181977  0.71809165  0.02127586
McGBB_a <- coef(Para_McGBB)[1]
McGBB_b <- coef(Para_McGBB)[2]
McGBB_c <- coef(Para_McGBB)[3]

# Fitting McGBB Distribution
McGBBFreq <- fitMcGBB(BinRanVar, ActFreq,McGBB_a,McGBB_b,McGBB_c)

# printing the results of fitting McGBB Distribution
print(McGBBFreq)
## Call: 
## fitMcGBB(x = BinRanVar, obs.freq = ActFreq, a = McGBB_a, b = McGBB_b, 
##     c = McGBB_c)
## 
## Chi-squared test for Mc-Donald Generalized Beta-Binomial Distribution 
##  
##       Observed Frequency :  42 47 54 40 49 40 43 84 
##  
##       expected Frequency :  47.83 42.57 41.85 42.64 44.64 48.26 55.27 75.94 
##  
##       estimated a parameter : 36.38182   ,estimated b parameter : 0.7180916 ,
##  
##       estimated c parameter : 0.02127586 
##  
##       X-squared : 10.2815   ,df : 4   ,p-value : 0.0359 
##  
##       over dispersion : 0.3885724

Gamma Binomial Distribution

# Estimating and fitting Gamma Binomial Distribution
Para_GammaBin <- EstMLEGammaBin(x=BinRanVar, freq=ActFreq,
                              c=10,l=10)

# printing the coefficients and using them
coef(Para_GammaBin)
##         c         l 
## 0.7701314 0.7177471
GammaBin_c <- coef(Para_GammaBin)[1]
GammaBin_l <- coef(Para_GammaBin)[2]

# Fitting Gamma Binomial Distribution
GammaBinFreq <- fitGammaBin(BinRanVar, ActFreq,GammaBin_c,GammaBin_l)

# printing the results of fitting Gamma Binomial Distribution
print(GammaBinFreq)
## Call: 
## fitGammaBin(x = BinRanVar, obs.freq = ActFreq, c = GammaBin_c, 
##     l = GammaBin_l)
## 
## Chi-squared test for Gamma Binomial Distribution 
##  
##       Observed Frequency :  42 47 54 40 49 40 43 84 
##  
##       expected Frequency :  47.89 42.59 41.85 42.63 44.62 48.24 55.25 75.94 
##  
##       estimated c parameter : 0.7701314   ,estimated l parameter : 0.7177471  
##  
##       X-squared : 10.2797   ,df : 5   ,p-value : 0.0677 
##  
##       over dispersion : 0.3887571

Grassia II Binomial Distribution

# Estimating and fitting Grassia II Binomial Distribution
Para_GrassiaIIBin <- EstMLEGrassiaIIBin(x=BinRanVar, freq=ActFreq,
                                      a=10,b=10)
# printing the coefficients and using them
coef(Para_GrassiaIIBin)
##         a         b 
## 0.8562267 1.5388073
GrassiaIIBin_a <- coef(Para_GrassiaIIBin)[1]
GrassiaIIBin_b <- coef(Para_GrassiaIIBin)[2]

# Fitting Grassia II Binomial Distribution
GrassiaIIBinFreq <- fitGrassiaIIBin(BinRanVar, ActFreq,GrassiaIIBin_a,GrassiaIIBin_b)

# printing the results of fitting Grassia II Binomial Distribution
print(GrassiaIIBinFreq)
## Call: 
## fitGrassiaIIBin(x = BinRanVar, obs.freq = ActFreq, a = GrassiaIIBin_a, 
##     b = GrassiaIIBin_b)
## 
## Chi-squared test for Grassia II Binomial Distribution 
##  
##       Observed Frequency :  42 47 54 40 49 40 43 84 
##  
##       expected Frequency :  48.32 43.1 41.99 42.4 44.06 47.42 54.42 77.29 
##  
##       estimated a parameter : 0.8562267   ,estimated b parameter : 1.538807  
##  
##       X-squared : 9.4444   ,df : 5   ,p-value : 0.0926 
##  
##       over dispersion : 0.2901217

Conclusion

Estimated frequencies from the Triangular Binomial distribution are far better than those from the Binomial distribution. The Beta-Binomial, Kumaraswamy Binomial, Gamma Binomial and Grassia II Binomial distributions behave almost identically for the Alcohol data of week 2, and the McDonald Generalized Beta-Binomial distribution behaves similarly as well.

Finally, the Gaussian Hypergeometric Generalized Beta-Binomial distribution is best suited and generates the most accurate frequencies (X-squared 2.3814, p-value 0.666). Therefore, the first choice is the GHGBB distribution and the second choice would be the McGBB distribution.
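
The similarity between the mixture distributions described in this conclusion can be checked numerically. The sketch below copies the rounded expected frequencies from the printed outputs above and measures, for each pair of distributions, the largest absolute difference across the eight classes. The row labels and the choice of the maximum distance are illustrative assumptions.

```r
# Rounded expected frequencies copied from the printed outputs above
expected <- rbind(
  BetaBin      = c(47.91, 42.92, 41.95, 42.50, 44.30, 47.81, 54.89, 76.73),
  KumBin       = c(47.34, 42.94, 42.23, 42.92, 44.78, 48.25, 55.07, 75.44),
  McGBB        = c(47.83, 42.57, 41.85, 42.64, 44.64, 48.26, 55.27, 75.94),
  GammaBin     = c(47.89, 42.59, 41.85, 42.63, 44.62, 48.24, 55.25, 75.94),
  GrassiaIIBin = c(48.32, 43.10, 41.99, 42.40, 44.06, 47.42, 54.42, 77.29),
  GHGBB        = c(41.18, 49.90, 49.55, 46.32, 42.91, 41.12, 44.31, 83.71)
)

# Largest absolute difference in any class, for every pair of rows
d <- as.matrix(dist(expected, method = "maximum"))

similar <- c("BetaBin", "KumBin", "McGBB", "GammaBin", "GrassiaIIBin")
max_within <- max(d[similar, similar])      # spread inside the similar group
min_to_ghgbb <- min(d["GHGBB", similar])    # gap between GHGBB and the group

print(round(d, 2))
print(c(max_within = max_within, min_to_ghgbb = min_to_ghgbb))
```

The five similar distributions differ from one another by less than two respondents in any class, while GHGBB differs from each of them by several respondents in at least one class.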