Big Data

Real world big data

Bike_sharing
Bike sharing data
Electric_consumption
Electric consumption data
Skin_segmentation
Skin segmentation data

Generate Big Data

Simulating big data under different conditions

GenGLMdata()
Generate data for Generalised Linear Models
GenModelRobustGLMdata()
Generate data for Generalised Linear Models under a model robust scenario
GenModelMissGLMdata()
Generate data for Generalised Linear Models under a model misspecification scenario

Sampling

Model based Sampling Algorithms

AoptimalGauLMSub()
A-optimality criterion-based subsampling under Gaussian Linear Models
LCCsampling()
Local case control sampling for logistic regression
ALoptimalGLMSub()
A- and L-optimality criteria based subsampling under Generalised Linear Models
LeverageSampling()
Basic and shrinkage leverage sampling for Generalised Linear Models
AoptimalMCGLMSub()
A-optimality criterion-based sampling under measurement constraints for Generalised Linear Models

Model Robust Subsampling Algorithm

Subsampling algorithms when a set of models is considered to describe the big data

modelRobustLinSub()
Model robust optimal subsampling for A- and L-optimality criteria under linear regression
modelRobustLogSub()
Model robust optimal subsampling for A- and L-optimality criteria under logistic regression
modelRobustPoiSub()
Model robust optimal subsampling for A- and L-optimality criteria under Poisson regression

Model Misspecified Sampling

Sampling algorithms under a potentially misspecified model

modelMissLinSub()
Sampling under linear regression for a potentially misspecified model
modelMissLogSub()
Sampling under logistic regression for a potentially misspecified model
modelMissPoiSub()
Sampling under Poisson regression for a potentially misspecified model

Supporting Functions

Functions for plotting the outputs of the sampling functions

plot_Beta()
Plotting model parameter outputs after sampling
plot_AMSE()
Plotting AMSE outputs for the samples under model misspecification