SMMAL is an R package for estimating the Average Treatment Effect (ATE) using semi-supervised learning (SSL). It is tailored to settings where treatment and outcome labels are limited but covariates and surrogate variables are plentiful. By leveraging unlabeled data, it aims to improve efficiency and robustness over purely supervised methods, and it supports high-dimensional models via cross-fitting, flexible model fitting, and adaptive LASSO.

The package can be installed from GitHub (https://github.com/ShuhengKong/SMMAL):

```r
# install.packages("devtools")
devtools::install_github("ShuhengKong/SMMAL")
```

The following basic example estimates the ATE on the sample dataset included with the package:

```r
library(SMMAL)

# Load the example dataset included with the package
file_path <- system.file("extdata", "sample_data.rds", package = "SMMAL")
dat <- readRDS(file_path)

temp <- data.frame(dat$X)
temp[,] <- NA

# Estimate ATE using the SMMAL pipeline
output <- SMMAL(
  Y = dat$Y,
  A = dat$A,
  S = data.frame(dat$S),
  X = data.frame(dat$X),
  nfold = 5,
  cf_model = "bspline"
)

# View the results
print(output)
#> $est
#> [1] 0.1021349
#>
#> $se
#> [1] 0.03006258
```

`SMMAL()` takes the following arguments:

Argument | Description |
---|---|
Y | Observed outcomes. Can be continuous or binary. |
A | Treatment indicator. Must be binary. |
S | Surrogates. |
X | Covariates. |
nfold | Number of cross-validation folds. Default is 5. |
cf_model | The modeling method to use in cross-fitting. Default is `"bspline"`. Other values are `"xgboost"` and `"randomforest"`. |
custom_model_fun | Optional user-supplied function for feature selection or prediction. Overrides the built-in model fitting. Must return fold-level predictions. |
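
To make the role of `nfold` and "fold-level predictions" more concrete, here is a minimal base R sketch of the general cross-fitting idea: each observation is predicted by a model that was fitted without it. This is an illustration only and not SMMAL's internal code. The simulated data, the names `sim`, `fold_id`, and `pred`, and the use of `lm()` as a stand-in learner are all assumptions made for the example.

```r
# Illustrative cross-fitting in base R; NOT the SMMAL implementation.
set.seed(1)
n <- 200
nfold <- 5

# Simulated data (hypothetical, for illustration only)
sim <- data.frame(x = rnorm(n))
sim$y <- 1 + 2 * sim$x + rnorm(n)

# Randomly assign each observation to one of nfold equally sized folds
fold_id <- sample(rep(seq_len(nfold), length.out = n))

# For each fold: fit on the remaining folds, predict the held-out rows
pred <- rep(NA_real_, n)
for (k in seq_len(nfold)) {
  held_out <- fold_id == k
  fit <- lm(y ~ x, data = sim[!held_out, ])
  pred[held_out] <- predict(fit, newdata = sim[held_out, ])
}

# Every observation now has exactly one out-of-fold prediction
head(pred)
```

In a sketch like this, a larger `nfold` gives each fit more training data at the cost of more model fits.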

It returns a list with the following elements:

Element | Description |
---|---|
est | Estimated ATE. |
se | Standard error of the estimated ATE. |
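
One common use of `est` and `se` is a Wald-type confidence interval. The sketch below plugs in the values printed in the example output and assumes the estimator is approximately normally distributed, which this README does not state explicitly. The interval is computed with base R only and is not a feature of the package.

```r
# Values copied from the printed example output
est <- 0.1021349
se  <- 0.03006258

# Two-sided 95% Wald interval, ASSUMING approximate normality of the estimator
z  <- qnorm(0.975)
ci <- c(lower = est - z * se, upper = est + z * se)

print(round(ci, 3))
```

With your own fit, replace the hard-coded numbers with `output$est` and `output$se`.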