vignettes/BMDs_and_ABDs_fitxxxBin.Rmd
It would clearly be beneficial to use the Rmd files in the GitHub directory for further explanation or understanding of the R code behind the results shown in the vignettes.
Fitting an Alternate Binomial or Binomial Mixture distribution is the most crucial part of handling Binomial Outcome Data (BOD). Finding the distribution that most closely resembles the given data is essential. Once the distributions have been fitted, several measures can be used to compare them, including the chi-squared statistic, its p-value and the over-dispersion value reported in the outputs below.
The functions given below fit the respective distributions when the BOD and the estimated parameters are given.
fitBin - fitting the Binomial distribution.
fitTriBin - fitting the Triangular Binomial distribution.
fitBetaBin - fitting the Beta-Binomial distribution.
fitKumBin - fitting the Kumaraswamy Binomial distribution.
fitGHGBB - fitting the Gaussian Hyper-geometric Generalized Beta-Binomial distribution.
fitMcGBB - fitting the McDonald Generalized Beta-Binomial distribution.
fitGammaBin - fitting the Gamma Binomial distribution.
fitGrassiaIIBin - fitting the Grassia II Binomial distribution.
fitAddBin - fitting the Additive Binomial distribution.
fitBetaCorrBin - fitting the Beta Correlated Binomial distribution.
fitCOMPBin - fitting the COM Poisson Binomial distribution.
fitCorrBin - fitting the Correlated Binomial distribution.
fitMultiBin - fitting the Multiplicative Binomial distribution.
fitLMBin - fitting the Lovinson Multiplicative Binomial distribution.

All six Alternate Binomial distributions will be fitted to the Alcohol data week 2, and their expected frequencies will be plotted against the actual frequency values. This plot can be used to identify which distribution best suits the Alcohol data week 2.
## Days week1 week2
## 1 0 47 42
## 2 1 54 47
## 3 2 43 54
## 4 3 40 40
## 5 4 40 49
## 6 5 41 40
## 7 6 39 43
## 8 7 95 84
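For readers who want to follow along without the package data set, the table above can be rebuilt by hand. The following is a minimal base R sketch; the object name `alcohol` is a hypothetical stand-in for `Alcohol_data`, and the values are copied from the printed output.

```r
# Rebuild the printed Alcohol data as a plain data frame.
# `alcohol` is a hypothetical name; the values are copied from the table above.
alcohol <- data.frame(
  Days  = 0:7,
  week1 = c(47, 54, 43, 40, 40, 41, 39, 95),
  week2 = c(42, 47, 54, 40, 49, 40, 43, 84)
)

# Total number of respondents recorded in each week
n_week1 <- sum(alcohol$week1)
n_week2 <- sum(alcohol$week2)

# Relative frequency of each "number of days" outcome in week 2
rel_week2 <- alcohol$week2 / n_week2

print(alcohol)
```

The `Days` column plays the role of the binomial random variable and the `week2` column the observed frequencies used in the fits that follow.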
BinRanVar <- Alcohol_data$Days
ActFreq <- Alcohol_data$week2
# Fitting Binomial Distribution
BinFreq <- fitBin(BinRanVar,ActFreq)
## Chi-squared approximation may be doubtful because expected frequency is less than 5
# printing the results of fitting Binomial distribution
print(BinFreq)
## Call:
## fitBin(x = BinRanVar, obs.freq = ActFreq)
##
## Chi-squared test for Binomial Distribution
##
## Observed Frequency : 42 47 54 40 49 40 43 84
##
## expected Frequency : 1.66 13.79 49.19 97.48 115.89 82.67 32.76 5.56
##
## estimated probability value : 0.5431436
##
## X-squared : 2265.111 ,df : 6 ,p-value : 0
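To make the Binomial fit above less of a black box, here is a minimal base R sketch of the same kind of calculation. It assumes the probability is estimated as the overall proportion of "successes" and that the degrees of freedom are the number of categories minus one, minus one estimated parameter; it is an illustration, not the internals of `fitBin`.

```r
# Week 2 Alcohol data, copied from the table above
x   <- 0:7
obs <- c(42, 47, 54, 40, 49, 40, 43, 84)

n <- max(x)    # number of trials (7 days)
N <- sum(obs)  # number of respondents

# Assumed estimator: overall proportion of successes
p_hat <- sum(x * obs) / (n * N)

# Expected frequencies under Binomial(n, p_hat)
expected <- N * dbinom(x, size = n, prob = p_hat)

# Pearson chi-squared statistic and its p-value
x_squared <- sum((obs - expected)^2 / expected)
df        <- length(x) - 1 - 1  # categories - 1 - estimated parameters
p_value   <- pchisq(x_squared, df = df, lower.tail = FALSE)

round(expected, 2)
```

Under these assumptions `p_hat` agrees with the estimated probability value printed above, and the very small expected counts in the first category explain the warning about the chi-squared approximation.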
# Estimating and fitting Additive Binomial distribution
Para_AddBin <- EstMLEAddBin(BinRanVar,ActFreq)
# printing the coefficients and using them
coef(Para_AddBin)
## p alpha
## 0.5466 0.216612
AddBin_p <- Para_AddBin$p
AddBin_alpha <- Para_AddBin$alpha
# Fitting Additive Binomial Distribution
AddBinFreq <- fitAddBin(BinRanVar, ActFreq, AddBin_p, AddBin_alpha)
# printing the results of fitting Additive Binomial Distribution
print(AddBinFreq)
## Call:
## fitAddBin(x = BinRanVar, obs.freq = ActFreq, p = AddBin_p, alpha = AddBin_alpha)
##
## Chi-squared test for Additive Binomial Distribution
##
## Observed Frequency : 42 47 54 40 49 40 43 84
##
## expected Frequency : 10.19 47.98 77.94 48.82 30.46 74.95 80.9 27.76
##
## estimated p value : 0.5466 ,estimated alpha parameter : 0.216612
##
## X-squared : 267.5441 ,df : 5 ,p-value : 0
# Estimating and fitting COM Poisson Binomial Distribution
Para_COMPBin <- EstMLECOMPBin(x=BinRanVar, freq=ActFreq,
v=12.1,p=0.9)
# printing the coefficients and using them
coef(Para_COMPBin)
## p v
## 0.51295510 -0.09267367
COMPBin_p <- coef(Para_COMPBin)[1]
COMPBin_v <- coef(Para_COMPBin)[2]
# Fitting COM-Poisson Binomial Distribution
COMPBinFreq <- fitCOMPBin(BinRanVar, ActFreq,COMPBin_p,COMPBin_v)
# printing the results of fitting COM-Poisson Binomial Distribution
print(COMPBinFreq)
## Call:
## fitCOMPBin(x = BinRanVar, obs.freq = ActFreq, p = COMPBin_p,
## v = COMPBin_v)
##
## Chi-squared test for COM Poisson Binomial Distribution
##
## Observed Frequency : 42 47 54 40 49 40 43 84
##
## expected Frequency : 49.9 43.89 41.75 41.93 44.16 48.77 56.87 71.73
##
## estimated p value : 0.5129551 ,estimated v parameter : -0.09267367
##
## X-squared : 12.7434 ,df : 5 ,p-value : 0.0259
# Estimating and fitting Multiplicative Binomial Distribution
Para_MultiBin <- EstMLEMultiBin(x=BinRanVar, freq=ActFreq,
theta=21,p=0.19)
# printing the coefficients and using them
coef(Para_MultiBin)
## p theta
## 0.5129813 0.7220349
MultiBin_p <- coef(Para_MultiBin)[1]
MultiBin_theta <- coef(Para_MultiBin)[2]
# Fitting Multiplicative Binomial Distribution
MultiBinFreq <- fitMultiBin(BinRanVar, ActFreq,MultiBin_p,MultiBin_theta)
# printing the results of fitting Multiplicative Binomial Distribution
print(MultiBinFreq)
## Call:
## fitMultiBin(x = BinRanVar, obs.freq = ActFreq, p = MultiBin_p,
## theta = MultiBin_theta)
##
## Chi-squared test for Multiplicative Binomial Distribution
##
## Observed Frequency : 42 47 54 40 49 40 43 84
##
## expected Frequency : 47.11 49.22 42.27 38.69 40.75 49.4 63.81 67.76
##
## estimated p value : 0.5129813 ,estimated theta parameter : 0.7220349
##
## X-squared : 18.0917 ,df : 5 ,p-value : 0.0028
# Estimating and fitting Lovinson Multiplicative Binomial Distribution
Para_LMBin <- EstMLELMBin(x=BinRanVar, freq=ActFreq,
phi=21,p=0.19)
# printing the coefficients and using them
coef(Para_LMBin)
## p phi
## 0.5129813 0.7220349
LMBin_p <- coef(Para_LMBin)[1]
LMBin_phi <- coef(Para_LMBin)[2]
# Fitting Lovinson Multiplicative Binomial Distribution
LMBinFreq <- fitLMBin(BinRanVar, ActFreq,LMBin_p,LMBin_phi)
# printing the results of fitting Lovinson Multiplicative Binomial Distribution
print(LMBinFreq)
## Call:
## fitLMBin(x = BinRanVar, obs.freq = ActFreq, p = LMBin_p, phi = LMBin_phi)
##
## Chi-squared test for Lovinson Multiplicative Binomial Distribution
##
## Observed Frequency : 42 47 54 40 49 40 43 84
##
## expected Frequency : 47.11 49.22 42.27 38.69 40.75 49.4 63.81 67.76
##
## estimated p value : 0.5129813 ,estimated phi parameter : 0.7220349
##
## X-squared : 18.0917 ,df : 5 ,p-value : 0.0028
The Additive and Correlated Binomial distributions behave very similarly; both generate the same expected frequencies. They are closer to the actual frequencies than the Binomial fit, but still far from them (for the Additive Binomial distribution, X-squared = 267.5441, against 2265.111 for the Binomial distribution). Comparing the Binomial expected frequencies with the actual frequencies shows a very large difference, so the Binomial distribution does not suit the Alcohol data of week 2. The Multiplicative and COM Poisson Binomial distributions give values close to the actual frequencies, and the Multiplicative and Lovinson Multiplicative Binomial distributions behave identically as a pair.
Finally, the only distribution left is the Beta Correlated Binomial distribution, which is closest to the actual frequencies. Therefore the most suitable distribution for the Alcohol data of week 2 is the Beta Correlated Binomial distribution, the second choice is the Multiplicative and COM Poisson Binomial distributions, and the final choice is the Correlated and Additive Binomial distributions.
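The p-values in the printed outputs follow from the X-squared statistic and its degrees of freedom. As a minimal sketch, the values reported above can be collected in a data frame (a hypothetical `fits` object; numbers copied from the outputs) and the p-values recomputed with `pchisq`. Only the five fits printed in this section are included.

```r
# X-squared and df copied from the printed outputs above
fits <- data.frame(
  distribution = c("Binomial", "Additive Binomial", "COM Poisson Binomial",
                   "Multiplicative Binomial", "Lovinson Multiplicative Binomial"),
  x_squared    = c(2265.111, 267.5441, 12.7434, 18.0917, 18.0917),
  df           = c(6, 5, 5, 5, 5),
  stringsAsFactors = FALSE
)

# Upper-tail chi-squared probability, rounded as in the printed output
fits$p_value <- round(pchisq(fits$x_squared, df = fits$df, lower.tail = FALSE), 4)

# Smallest X-squared first
fits[order(fits$x_squared), ]
```

Sorting by X-squared is reasonable here only because the Alternate Binomial fits share the same degrees of freedom; for models with different df, the p-value is the more comparable quantity.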
Of the eight BMD distributions, all except the Uniform Binomial distribution can be fitted to the Alcohol data week 2. As above, a plot was generated to compare the estimated frequencies with the actual frequencies.
# Estimating and fitting Triangular Binomial distribution
Para_TriBin <- EstMLETriBin(BinRanVar,ActFreq)
# printing the coefficients and using them
coef(Para_TriBin)
## mode
## 0.944444
TriBin_c <- Para_TriBin$mode
# Fitting Triangular Binomial Distribution
TriBinFreq <- fitTriBin(BinRanVar, ActFreq, TriBin_c)
# printing the results of fitting Triangular Binomial Distribution
print(TriBinFreq)
## Call:
## fitTriBin(x = BinRanVar, obs.freq = ActFreq, mode = TriBin_c)
##
## Chi-squared test for Triangular Binomial Distribution
##
## Observed Frequency : 42 47 54 40 49 40 43 84
##
## expected Frequency : 11.74 23.47 35.21 46.94 58.66 70.2 79.57 73.21
##
## estimated Mode value: 0.944444
##
## X-squared : 145.6196 ,df : 6 ,p-value : 0
##
## over dispersion : 0.2308269
# Estimating and fitting Beta-Binomial Distribution
Para_BetaBin <- EstMLEBetaBin(x=BinRanVar, freq=ActFreq,
a=10,b=10)
# printing the coefficients and using them
coef(Para_BetaBin)
## a b
## 0.8575339 0.7007620
BetaBin_a <- coef(Para_BetaBin)[1]
BetaBin_b <- coef(Para_BetaBin)[2]
# Fitting Beta-Binomial Distribution
BetaBinFreq <- fitBetaBin(BinRanVar, ActFreq,BetaBin_a,BetaBin_b)
# printing the results of fitting Beta-Binomial Distribution
print(BetaBinFreq)
## Call:
## fitBetaBin(x = BinRanVar, obs.freq = ActFreq, a = BetaBin_a,
## b = BetaBin_b)
##
## Chi-squared test for Beta-Binomial Distribution
##
## Observed Frequency : 42 47 54 40 49 40 43 84
##
## expected Frequency : 47.91 42.92 41.95 42.5 44.3 47.81 54.89 76.73
##
## estimated a parameter : 0.8575339 ,estimated b parameter : 0.700762
##
## X-squared : 9.7641 ,df : 5 ,p-value : 0.0822
##
## over dispersion : 0.3908852
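The Beta-Binomial output above can be checked by hand. The sketch below plugs the estimated `a` and `b` into the standard Beta-Binomial probability mass function using base R only. The over-dispersion formula `1 / (a + b + 1)` is an assumption here; it is used because it agrees numerically with the printed value, not because the source states it.

```r
# Week 2 Alcohol data and the estimates printed above
x   <- 0:7
obs <- c(42, 47, 54, 40, 49, 40, 43, 84)
n   <- max(x)
N   <- sum(obs)
a   <- 0.8575339
b   <- 0.7007620

# Standard Beta-Binomial pmf: choose(n, x) * B(x + a, n - x + b) / B(a, b)
pmf      <- choose(n, x) * beta(x + a, n - x + b) / beta(a, b)
expected <- N * pmf

# Pearson chi-squared statistic on the unmerged categories
x_squared <- sum((obs - expected)^2 / expected)

# Assumed over-dispersion formula (matches the printed value numerically)
over_dispersion <- 1 / (a + b + 1)

round(expected, 2)
```

The same pattern (a pmf evaluated at the estimates, scaled by the number of respondents) applies to the other mixture distributions, although their mass functions are more involved.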
# Estimating and fitting Kumaraswamy Binomial Distribution
Para_KumBin <- EstMLEKumBin(x=BinRanVar, freq=ActFreq,
a=12.1,b=0.9,it=10000)
# printing the coefficients and using them
coef(Para_KumBin)
## a b it
## 8.641598e-01 7.173736e-01 1.000001e+04
KumBin_a <- coef(Para_KumBin)[1]
KumBin_b <- coef(Para_KumBin)[2]
KumBin_it <- coef(Para_KumBin)[3]
# Fitting Kumaraswamy Binomial Distribution
KumBinFreq <- fitKumBin(BinRanVar, ActFreq,KumBin_a,KumBin_b,KumBin_it*10)
# printing the results of fitting Kumaraswamy Binomial Distribution
print(KumBinFreq)
## Call:
## fitKumBin(x = BinRanVar, obs.freq = ActFreq, a = KumBin_a, b = KumBin_b,
## it = KumBin_it * 10)
##
## Chi-squared test for Kumaraswamy Binomial Distribution
##
## Observed Frequency : 42 47 54 40 49 40 43 84
##
## expected Frequency : 47.34 42.94 42.23 42.92 44.78 48.25 55.07 75.44
##
## estimated a parameter : 0.8641598 ,estimated b parameter : 0.7173736 ,
##
## estimated it value : 100000.1
##
## X-squared : 9.8904 ,df : 5 ,p-value : 0.0784
##
## over dispersion : 0.3864332
# Estimating and fitting GHGBB Distribution
Para_GHGBB <- EstMLEGHGBB(x=BinRanVar, freq=ActFreq,
a=0.0021,b=0.19,c=0.3)
# printing the coefficients and using them
coef(Para_GHGBB)
## a b c
## 1.6310680 0.3913700 0.6782968
GHGBB_a <- coef(Para_GHGBB)[1]
GHGBB_b <- coef(Para_GHGBB)[2]
GHGBB_c <- coef(Para_GHGBB)[3]
# Fitting GHGBB Distribution
GHGBBFreq <- fitGHGBB(BinRanVar, ActFreq,GHGBB_a,GHGBB_b,GHGBB_c)
# printing the results of fitting GHGBB Distribution
print(GHGBBFreq)
## Call:
## fitGHGBB(x = BinRanVar, obs.freq = ActFreq, a = GHGBB_a, b = GHGBB_b,
## c = GHGBB_c)
##
## Chi-squared test for Gaussian Hypergeometric Generalized Beta-Binomial Distribution
##
## Observed Frequency : 42 47 54 40 49 40 43 84
##
## expected Frequency : 41.18 49.9 49.55 46.32 42.91 41.12 44.31 83.71
##
## estimated a parameter : 1.631068 ,estimated b parameter : 0.39137 ,
##
## estimated c parameter : 0.6782968
##
## X-squared : 2.3814 ,df : 4 ,p-value : 0.666
##
## over dispersion : 0.3885284
# Estimating and fitting McGBB Distribution
Para_McGBB <- EstMLEMcGBB(x=BinRanVar, freq=ActFreq,
a=21,b=0.19,c=0.1)
# printing the coefficients and using them
coef(Para_McGBB)
## a b c
## 36.38181977 0.71809165 0.02127586
McGBB_a <- coef(Para_McGBB)[1]
McGBB_b <- coef(Para_McGBB)[2]
McGBB_c <- coef(Para_McGBB)[3]
# Fitting McGBB Distribution
McGBBFreq <- fitMcGBB(BinRanVar, ActFreq,McGBB_a,McGBB_b,McGBB_c)
# printing the results of fitting McGBB Distribution
print(McGBBFreq)
## Call:
## fitMcGBB(x = BinRanVar, obs.freq = ActFreq, a = McGBB_a, b = McGBB_b,
## c = McGBB_c)
##
## Chi-squared test for Mc-Donald Generalized Beta-Binomial Distribution
##
## Observed Frequency : 42 47 54 40 49 40 43 84
##
## expected Frequency : 47.83 42.57 41.85 42.64 44.64 48.26 55.27 75.94
##
## estimated a parameter : 36.38182 ,estimated b parameter : 0.7180916 ,
##
## estimated c parameter : 0.02127586
##
## X-squared : 10.2815 ,df : 4 ,p-value : 0.0359
##
## over dispersion : 0.3885724
# Estimating and fitting Gamma Binomial Distribution
Para_GammaBin <- EstMLEGammaBin(x=BinRanVar, freq=ActFreq,
c=10,l=10)
# printing the coefficients and using them
coef(Para_GammaBin)
## c l
## 0.7701314 0.7177471
GammaBin_c <- coef(Para_GammaBin)[1]
GammaBin_l <- coef(Para_GammaBin)[2]
# Fitting Gamma Binomial Distribution
GammaBinFreq <- fitGammaBin(BinRanVar, ActFreq,GammaBin_c,GammaBin_l)
# printing the results of fitting Gamma Binomial Distribution
print(GammaBinFreq)
## Call:
## fitGammaBin(x = BinRanVar, obs.freq = ActFreq, c = GammaBin_c,
## l = GammaBin_l)
##
## Chi-squared test for Gamma Binomial Distribution
##
## Observed Frequency : 42 47 54 40 49 40 43 84
##
## expected Frequency : 47.89 42.59 41.85 42.63 44.62 48.24 55.25 75.94
##
## estimated c parameter : 0.7701314 ,estimated l parameter : 0.7177471
##
## X-squared : 10.2797 ,df : 5 ,p-value : 0.0677
##
## over dispersion : 0.3887571
# Estimating and fitting Grassia II Binomial Distribution
Para_GrassiaIIBin <- EstMLEGrassiaIIBin(x=BinRanVar, freq=ActFreq,
a=10,b=10)
# printing the coefficients and using them
coef(Para_GrassiaIIBin)
## a b
## 0.8562267 1.5388073
GrassiaIIBin_a <- coef(Para_GrassiaIIBin)[1]
GrassiaIIBin_b <- coef(Para_GrassiaIIBin)[2]
# Fitting Grassia II Binomial Distribution
GrassiaIIBinFreq <- fitGrassiaIIBin(BinRanVar, ActFreq,GrassiaIIBin_a,GrassiaIIBin_b)
# printing the results of fitting Grassia II Binomial Distribution
print(GrassiaIIBinFreq)
## Call:
## fitGrassiaIIBin(x = BinRanVar, obs.freq = ActFreq, a = GrassiaIIBin_a,
## b = GrassiaIIBin_b)
##
## Chi-squared test for Grassia II Binomial Distribution
##
## Observed Frequency : 42 47 54 40 49 40 43 84
##
## expected Frequency : 48.32 43.1 41.99 42.4 44.06 47.42 54.42 77.29
##
## estimated a parameter : 0.8562267 ,estimated b parameter : 1.538807
##
## X-squared : 9.4444 ,df : 5 ,p-value : 0.0926
##
## over dispersion : 0.2901217
The estimated frequencies from the Triangular Binomial distribution are far better than those from the Binomial distribution (X-squared = 145.6196 against 2265.111), although the fit is still poor. The Beta-Binomial, Kumaraswamy Binomial, Gamma Binomial and Grassia II Binomial distributions behave almost identically for the Alcohol data of week 2, with X-squared values between 9.4444 and 10.2797 on 5 degrees of freedom. The McDonald Generalized Beta-Binomial distribution gives very similar expected frequencies as well (X-squared = 10.2815).
Finally, the Gaussian Hyper-geometric Generalized Beta-Binomial distribution is the best suited and generates the most accurate frequencies (X-squared = 2.3814, df = 4, p-value = 0.666). Therefore, the first choice is the GHGBB distribution and the second choice would be the McGBB distribution.
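The comparison of the Binomial Mixture fits can also be summarised in a single table. This is a minimal sketch that gathers the values printed in the outputs above into a hypothetical `bmd_fits` data frame and orders it by p-value; it does not refit anything.

```r
# Values copied from the printed outputs above
bmd_fits <- data.frame(
  distribution    = c("Triangular Binomial", "Beta-Binomial", "Kumaraswamy Binomial",
                      "GHGBB", "McGBB", "Gamma Binomial", "Grassia II Binomial"),
  x_squared       = c(145.6196, 9.7641, 9.8904, 2.3814, 10.2815, 10.2797, 9.4444),
  df              = c(6, 5, 5, 4, 4, 5, 5),
  p_value         = c(0, 0.0822, 0.0784, 0.666, 0.0359, 0.0677, 0.0926),
  over_dispersion = c(0.2308269, 0.3908852, 0.3864332, 0.3885284,
                      0.3885724, 0.3887571, 0.2901217),
  stringsAsFactors = FALSE
)

# Largest p-value first
ranked <- bmd_fits[order(-bmd_fits$p_value), ]
best   <- ranked$distribution[1]

ranked
```

The p-value is only one of several possible criteria, so the ordering below first place depends on which measure is preferred.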