Generates overdispersed binomial outcome data using a three-step algorithm, given the number of frequencies, the binomial random variable, the probability of success, and the overdispersion parameter.

Usage

GenerateBOD(N,n,pi,rho)

Arguments

N

single value for number of total frequencies

n

single value for the binomial random variable (the number of trials \(n\) summed over in Details)

pi

single value for probability of success

rho

single value for overdispersion parameter

Value

The output of GenerateBOD is a vector of overdispersed binomial random variables.

Details

The generated binomial random variables are overdispersed based on \(\rho\) for the probability of success \(\pi\).

Step 1: For given \(n, \pi, \rho\), solve the following equation for \(\delta\): $$\phi(z(\pi),z(\pi),\delta)=\pi(1-\pi)\rho + \pi^2,$$

where \(\phi(z(\pi),z(\pi),\delta)\) is the cumulative distribution function of the standard bivariate normal random variable with correlation coefficient \(\delta\), and \(z(\pi)\) denotes the \(\pi^{th}\) quantile of the standard normal distribution.
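As a rough illustration of Step 1, the sketch below solves the equation numerically in base R. It is not the package's implementation. It assumes \(0 < \rho < 1\), so that the solution is non-negative, and it evaluates the bivariate normal CDF through a one-factor integral representation rather than a dedicated routine. The helper names `bvn_cdf_equal` and `solve_delta` are hypothetical, and `p` stands in for \(\pi\) to avoid masking R's built-in constant `pi`.

```r
# Hypothetical helper: P(Z1 < z, Z2 < z) for standard normals with correlation
# delta in [0, 1), using Z_k = sqrt(delta) * W + sqrt(1 - delta) * e_k and
# integrating over the shared factor W.
bvn_cdf_equal <- function(z, delta) {
  integrand <- function(w) {
    dnorm(w) * pnorm((z - sqrt(delta) * w) / sqrt(1 - delta))^2
  }
  integrate(integrand, lower = -Inf, upper = Inf,
            rel.tol = 1e-8, subdivisions = 500L)$value
}

# Hypothetical helper: find delta with
# bvn_cdf_equal(z(p), delta) = p * (1 - p) * rho + p^2.
# The search interval is an assumption; uniroot() fails if the root lies above 0.999.
solve_delta <- function(p, rho) {
  z <- qnorm(p)
  target <- p * (1 - p) * rho + p^2
  uniroot(function(d) bvn_cdf_equal(z, d) - target,
          interval = c(0, 0.999), tol = 1e-8)$root
}

delta_hat <- solve_delta(p = 0.5, rho = 0.1)
delta_hat

# Closed-form check available only for p = 0.5, where the CDF at (0, 0)
# equals 1/4 + asin(delta) / (2 * pi).
sin(pi * 0.1 / 2)
```

For \(\pi = 0.5\) the two printed values should agree closely (both roughly 0.156), which gives a quick check on the numerical solver.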

Step 2: Generate \(n\)-dimensional multivariate normal random variables, \(Z_i=(Z_{i1},Z_{i2},\ldots,Z_{in})^T\), with mean \(0\) and constant correlation matrix \(\Sigma_i\) for \(i=1,2,\ldots,N,\) where the elements \((\Sigma_i)_{lm}\) are \(\delta\) for \(l \ne m\).
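A minimal base-R sketch of Step 2 follows, assuming the correlation has already been obtained from Step 1 and lies in \([0, 1)\) so that the correlation matrix is positive definite. The value assigned to `delta` and the use of a Cholesky factor are illustrative choices, not necessarily what GenerateBOD does internally.

```r
set.seed(1)      # fixed seed so the illustration is reproducible

N <- 5000        # number of vectors Z_i to generate
n <- 10          # dimension of each Z_i
delta <- 0.15    # assumed correlation; in practice the Step 1 solution

# Constant correlation matrix: 1 on the diagonal, delta elsewhere.
Sigma <- matrix(delta, nrow = n, ncol = n)
diag(Sigma) <- 1

# chol() returns an upper-triangular R with t(R) %*% R == Sigma, so each row
# of E %*% R (E holding independent N(0, 1) entries) has covariance Sigma.
E <- matrix(rnorm(N * n), nrow = N, ncol = n)
Z <- E %*% chol(Sigma)

# Empirical check: the average off-diagonal sample correlation
# should be near delta.
C <- cor(Z)
mean(C[upper.tri(C)])
```

Each row of `Z` plays the role of one \(Z_i\).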

Step 3: For each \(j=1,2,\ldots,n\), define \(X_{ij} = 1\) if \(Z_{ij} < z(\pi)\), and \(X_{ij} = 0\) otherwise. It can then be shown that the random variable \(Y_i=\sum_{j=1}^{n} X_{ij}\) is overdispersed relative to the Binomial distribution.
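The overdispersion claim can be sketched as follows, assuming \(\delta\) solves the Step 1 equation. Each \(X_{ij}\) is a Bernoulli variable with success probability \(P(Z_{ij} < z(\pi)) = \pi\), and for \(j \ne k\), $$\operatorname{Cov}(X_{ij},X_{ik}) = P(Z_{ij} < z(\pi),\, Z_{ik} < z(\pi)) - \pi^2 = \phi(z(\pi),z(\pi),\delta) - \pi^2 = \pi(1-\pi)\rho,$$ so \(\rho\) is the pairwise correlation between the binary outcomes. Summing over the \(n\) variances and the \(n(n-1)\) covariances gives $$E(Y_i) = n\pi, \qquad \operatorname{Var}(Y_i) = n\pi(1-\pi) + n(n-1)\pi(1-\pi)\rho = n\pi(1-\pi)\{1+(n-1)\rho\},$$ which exceeds the Binomial variance \(n\pi(1-\pi)\) whenever \(\rho > 0\) and \(n > 1\).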

NOTE: If the input parameters do not satisfy the required domain conditions, error messages will be provided.

References

Manoj C, Wijekoon P, Yapa RD (2013). “The McDonald generalized beta-binomial distribution: A new binomial mixture distribution and simulation based comparison with its nested distributions in handling overdispersion.” International Journal of Statistics and Probability, 2(2), 24.

Examples

N <- 500    # Number of observations
n <- 10     # Dimension of multivariate normal random variables
pi <- 0.5   # Probability threshold
rho <- 0.1  # Dispersion parameter

# Generate overdispersed binomial variables
New_overdispersed_data <- GenerateBOD(N, n, pi, rho)
table(New_overdispersed_data)
#> New_overdispersed_data
#>  0  1  2  3  4  5  6  7  8  9 10 
#>  6 18 47 62 68 95 77 67 32 21  7
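
As a rough sanity check on the output above, the printed frequencies can be compared with the moments implied by the construction in Details, assuming mean \(n\pi\) and variance \(n\pi(1-\pi)\{1+(n-1)\rho\}\). This variance expression follows from the Step 1 equation and is not quoted from the package. The object names below are illustrative, and `p` is used instead of `pi` to avoid masking R's built-in constant.

```r
# Frequencies copied from the table printed above.
values <- 0:10
counts <- c(6, 18, 47, 62, 68, 95, 77, 67, 32, 21, 7)
y <- rep(values, times = counts)   # reconstruct the 500 observations

n <- 10
p <- 0.5
rho <- 0.1

sample_mean <- mean(y)
sample_var  <- var(y)

theory_mean  <- n * p
binomial_var <- n * p * (1 - p)                     # variance without overdispersion
theory_var   <- binomial_var * (1 + (n - 1) * rho)  # variance implied by rho

c(sample_mean = sample_mean, theory_mean = theory_mean)
c(sample_var = sample_var, binomial_var = binomial_var, theory_var = theory_var)
```

For this table the sample variance (about 4.65) is well above the Binomial value of 2.5 and close to the 4.75 implied by \(\rho = 0.1\).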