
Simulates big data under linear, logistic and Poisson regression for the model robust scenario, in which a set of candidate models is considered. The covariate data X are generated from a Normal or Uniform distribution for linear regression; from an Exponential, Normal or Uniform distribution for logistic regression; and from a Normal or Uniform distribution for Poisson regression.

Usage

GenModelRobustGLMdata(Dist,Dist_Par,No_Of_Var,Beta,N,All_Models,family)

Arguments

Dist

a character value naming the covariate distribution, "Normal" or "Uniform" (an Exponential distribution is additionally supported for logistic regression)

Dist_Par

a list of parameters for the distribution that would generate data for covariate X

No_Of_Var

number of variables

Beta

a vector for the model parameters, including the intercept

N

the big data size (number of observations to simulate)

All_Models

a list that contains the possible models, each given as a character vector of terms such as "X0", "X1" or "X1^2"

family

a character value, "linear", "logistic" or "poisson", selecting the regression type from Generalised Linear Models

Value

The output of GenModelRobustGLMdata is a list containing:

Basic: a list of outputs based on the inputs, together with the Beta estimates for all models

Complete_Data: a matrix containing Y, X and X^2

Details

Big data are generated for one of three Generalised Linear Model regression types: "linear", "logistic" or "poisson".
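
As a rough illustration of what the three regression types mean for the response, the base R sketch below simulates Y from a given linear predictor. The helper `simulate_response` and its `error_variance` argument are hypothetical names introduced here; they are not part of the package, and the package's internal code may differ.

```r
# Hypothetical helper (not from the package): simulate a response from a
# linear predictor `eta` under each of the three regression types.
simulate_response <- function(eta, family, error_variance = 0.5) {
  n <- length(eta)
  switch(family,
    # linear: identity link with additive Normal noise
    linear   = eta + rnorm(n, mean = 0, sd = sqrt(error_variance)),
    # logistic: Bernoulli response with logit link
    logistic = rbinom(n, size = 1, prob = plogis(eta)),
    # poisson: count response with log link
    poisson  = rpois(n, lambda = exp(eta)),
    stop("family must be \"linear\", \"logistic\" or \"poisson\"")
  )
}

set.seed(1)
eta <- c(-1, 0, 0.5, 1)
y_lin <- simulate_response(eta, "linear")
y_log <- simulate_response(eta, "logistic")
y_poi <- simulate_response(eta, "poisson")
```

The default of 0.5 for the error variance simply mirrors the `Error_Variance=0.5` entry used in the linear regression example.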

Covariate data generation is limited to the Normal and Uniform distributions for linear regression; the Exponential, Normal and Uniform distributions for logistic regression; and the Normal and Uniform distributions for Poisson regression.
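
A minimal base R sketch of drawing covariates from these distributions is given below. The helper `gen_covariates` is hypothetical, and the parameter names `Min`, `Max` and `Rate` are assumptions made for illustration; only `Mean` and `Variance` appear in the examples on this page.

```r
# Hypothetical helper (not from the package): draw an n x p covariate matrix.
# Only the branch matching `dist` is evaluated, so unused parameters may be absent.
gen_covariates <- function(dist, n, p, par = list()) {
  draws <- switch(dist,
    Normal      = rnorm(n * p, mean = par$Mean, sd = sqrt(par$Variance)),
    Uniform     = runif(n * p, min = par$Min, max = par$Max),   # assumed names
    Exponential = rexp(n * p, rate = par$Rate),                 # assumed name
    stop("unsupported distribution: ", dist)
  )
  X <- matrix(draws, nrow = n, ncol = p)
  colnames(X) <- paste0("X", seq_len(p))
  X
}

set.seed(42)
X_norm <- gen_covariates("Normal", n = 1000, p = 2,
                         par = list(Mean = 0, Variance = 1))
X_unif <- gen_covariates("Uniform", n = 10, p = 2,
                         par = list(Min = 0, Max = 1))
```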

Data are generated from a given real model, and each model in All_Models is then fitted to these data.
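
The idea can be sketched in base R for the linear case as follows. This is an illustration under stated assumptions, not the package's implementation: the variable names are invented here, the model list copies the one from the examples, and ordinary least squares via `lm.fit` stands in for whatever fitting routine the function uses internally.

```r
set.seed(123)
n  <- 2000
x1 <- rnorm(n)
x2 <- rnorm(n)

# Design matrix holding the intercept, main effects and squared terms
X_full <- cbind(X0 = 1, X1 = x1, X2 = x2, "X1^2" = x1^2, "X2^2" = x2^2)

all_models <- list(
  Real_Model      = c("X0", "X1", "X2", "X1^2"),
  Assumed_Model_1 = c("X0", "X1", "X2"),
  Assumed_Model_2 = c("X0", "X1", "X2", "X2^2"),
  Assumed_Model_3 = c("X0", "X1", "X2", "X1^2", "X2^2")
)
beta <- c(-1, 2, 1, 2)  # one coefficient per term of the real model

# The response comes from the real model only (error variance 0.5 assumed)
eta <- drop(X_full[, all_models$Real_Model] %*% beta)
y   <- eta + rnorm(n, sd = sqrt(0.5))

# Every candidate model is fitted to the same data, using its own columns
fits <- lapply(all_models, function(terms) {
  lm.fit(x = X_full[, terms, drop = FALSE], y = y)$coefficients
})
```

Comparing `fits$Real_Model` with `fits$Assumed_Model_1` shows the effect of misspecification: with the `X1^2` term omitted, its contribution is largely absorbed into the intercept estimate.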

Examples

# Linear regression: Normal covariates; Dist_Par also carries the error variance
Dist<-"Normal"; Dist_Par<-list(Mean=0,Variance=1,Error_Variance=0.5)
No_Of_Var<-2; Beta<-c(-1,2,1,2); N<-10000
All_Models<-list(Real_Model=c("X0","X1","X2","X1^2"),
                 Assumed_Model_1=c("X0","X1","X2"),
                 Assumed_Model_2=c("X0","X1","X2","X2^2"),
                 Assumed_Model_3=c("X0","X1","X2","X1^2","X2^2"))
family<-"linear"

Results<-GenModelRobustGLMdata(Dist,Dist_Par,No_Of_Var,Beta,N,All_Models,family)

# Logistic regression: Normal covariates
Dist<-"Normal"; Dist_Par<-list(Mean=0,Variance=1)
No_Of_Var<-2; Beta<-c(-1,2,1,2); N<-10000
All_Models<-list(Real_Model=c("X0","X1","X2","X1^2"),
                 Assumed_Model_1=c("X0","X1","X2"),
                 Assumed_Model_2=c("X0","X1","X2","X2^2"),
                 Assumed_Model_3=c("X0","X1","X2","X1^2","X2^2"))
family<-"logistic"

Results<-GenModelRobustGLMdata(Dist,Dist_Par,No_Of_Var,Beta,N,All_Models,family)
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred

# Poisson regression: Normal covariates, with Dist_Par passed as NULL
Dist<-"Normal"
No_Of_Var<-2; Beta<-c(-1,2,1,2); N<-10000
All_Models<-list(Real_Model=c("X0","X1","X2","X1^2"),
                 Assumed_Model_1=c("X0","X1","X2"),
                 Assumed_Model_2=c("X0","X1","X2","X2^2"),
                 Assumed_Model_3=c("X0","X1","X2","X1^2","X2^2"))
family<-"poisson"

Results<-GenModelRobustGLMdata(Dist,Dist_Par=NULL,No_Of_Var,Beta,N,All_Models,family)
#> Warning: glm.fit: algorithm did not converge
#> Warning: glm.fit: fitted rates numerically 0 occurred
#> Warning: glm.fit: fitted rates numerically 0 occurred
#> Warning: glm.fit: algorithm did not converge