Function to simulate big data under linear, logistic and Poisson regression for sampling. For linear regression, the covariate data X is generated from a Normal, Multivariate Normal or Uniform distribution. For logistic regression, X is generated from an Exponential, Normal, Multivariate Normal or Uniform distribution. For Poisson regression, X is generated from a Normal or Uniform distribution.
Arguments
- Dist
a character value for the distribution: "Normal", "MVNormal", "Uniform" or "Exponential"
- Dist_Par
a list of parameters for the distribution that would generate data for covariate X
- No_Of_Var
number of variables
- Beta
a vector for the model parameters, including the intercept
- N
the big data size
- family
a character value for the regression type from Generalised Linear Models: "linear", "logistic" or "poisson"
Details
Big data for Generalised Linear Models is generated for the "linear", "logistic" and "poisson" regression types.
Covariate data generation is limited to the normal, multivariate normal and uniform distributions for linear regression; the exponential, normal, multivariate normal and uniform distributions for logistic regression; and the normal and uniform distributions for Poisson regression.
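The generation scheme described above can be illustrated with a minimal base R sketch. It is not the implementation behind GenGLMdata: the helper name sim_glm_sketch, its error_variance argument and the layout of the returned matrix are all assumptions made for illustration. It only shows the usual recipe of drawing covariates, forming the linear predictor with an intercept, and sampling the response from the distribution that matches the regression type.

```r
# Hypothetical helper (not part of the package): simulate a response for a
# given covariate matrix X under one of three GLM families, using base R only.
sim_glm_sketch <- function(X, Beta, family, error_variance = 1) {
  # Prepend a column of ones so that Beta[1] acts as the intercept
  design <- cbind(1, X)
  stopifnot(ncol(design) == length(Beta))

  # Linear predictor, one value per observation
  eta <- as.vector(design %*% Beta)
  n <- length(eta)

  # Sample the response from the distribution matching the family
  Y <- switch(family,
    linear   = eta + rnorm(n, mean = 0, sd = sqrt(error_variance)),
    logistic = rbinom(n, size = 1, prob = plogis(eta)),
    poisson  = rpois(n, lambda = exp(eta)),
    stop("family must be 'linear', 'logistic' or 'poisson'")
  )

  # Response in the first column, followed by the design matrix
  cbind(Y = Y, design)
}

set.seed(1)
N <- 1000; No_Of_Var <- 2; Beta <- c(-1, 2, 1)

# Uniform(0, 1) covariates, one column per variable
X <- replicate(No_Of_Var, runif(N, min = 0, max = 1))

sim_data <- sim_glm_sketch(X, Beta, family = "poisson")
dim(sim_data)  # 1000 rows, 4 columns (Y, intercept, two covariates)
```

Note that Beta must have one more element than the number of covariates, because it includes the intercept; the sketch checks this before simulating.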
References
Lee Y, Nelder JA (1996). “Hierarchical generalized linear models.” Journal of the Royal Statistical Society Series B: Statistical Methodology, 58(4), 619-656.
Examples
# Shared settings: two covariates, so Beta holds the intercept plus two slopes
No_Of_Var <- 2; Beta <- c(-1, 2, 1); N <- 5000

# Linear regression (for Normal and MVNormal, Dist_Par here also carries Error_Variance)
# Dist <- "Normal"; Dist_Par <- list(Mean = 0, Variance = 1, Error_Variance = 0.5)
Dist <- "MVNormal"
Dist_Par <- list(Mean = rep(0, No_Of_Var), Variance = diag(rep(2, No_Of_Var)), Error_Variance = 0.5)
# Dist <- "Uniform"; Dist_Par <- list(Min = 0, Max = 1)
Family <- "linear"
Results <- GenGLMdata(Dist, Dist_Par, No_Of_Var, Beta, N, Family)

# Logistic regression
Dist <- "Normal"; Dist_Par <- list(Mean = 0, Variance = 1)
# Dist <- "MVNormal"; Dist_Par <- list(Mean = rep(0, No_Of_Var), Variance = diag(rep(2, No_Of_Var)))
# Dist <- "Exponential"; Dist_Par <- list(Rate = 3)
# Dist <- "Uniform"; Dist_Par <- list(Min = 0, Max = 1)
Family <- "logistic"
Results <- GenGLMdata(Dist, Dist_Par, No_Of_Var, Beta, N, Family)

# Poisson regression (Dist_Par is passed as NULL in this example)
# Dist <- "Normal"
Dist <- "Uniform"; Family <- "poisson"
Results <- GenGLMdata(Dist, NULL, No_Of_Var, Beta, N, Family)