This is the generic function and methods associated with the adjusted MLE for high dimensional generalized linear models (glm).
Details
The function adjust_glm is built on the glm function: the input should be from the
class glm, such as the output from glm.
Currently, it works with binary regressions only, for which it calls the function adjust_binary
to estimate coefficients and their standard deviations. In high dimensions,
the magnitude of MLE is biased upward, and the variance estimates from classical theory based on
inverse Fisher information underestimates the true variability. adjust_binary adjusts the MLE and variance estimates
applying the asymptotic theory assuming \(p, n\to\infty\) (\(p<n\)), and is exact when
covariates \(X\) is multivariate Gaussian.
adjust_glm creates an object from class glmadj. summary.glmadj creates a data table of adjusted coefficients,
estimates of the standard error and p-values using a two-sided t-test. If
the covariates are multivariate Gaussian, the p-values are asymptotically
from a uniform distribution.
predict.glmadj computes the prediction \(X^\top \hat{\beta}^{\mathrm{Adj}}\) for
new input data. The new data should be of the same format which you
used to fit the glm. If there's no new data, it outputs the estimated
\(X^\top \hat{\beta}_{\mathrm{Adj}}\) for the original dataset.
Examples
# Problem size
n <- 1000L
p <- 300L
# Generate data matrix
X <- matrix(rnorm(n*p, 0, 1), n, p)
# Generate parameters
beta <- rep(c(0, 1), p / 2) / sqrt(p) * 2
#' ## No intercept model
# Sample response vector
Y <- rbinom(n, 1, 1 / (1 + exp(- X %*% beta)))
# Fit a glm, by default computes a logistic regression
fit <- glm(Y ~ X + 0, family = binomial, x = TRUE, y = TRUE)
adjusted_fit <- adjust_glm(fit)
# Print summary
print(summary(adjusted_fit), max = 40)
#> Call:
#> adjust_glm(glm_output = fit)
#>
#> adjusted_mle std z.value p.value
#> X1 -0.00847596 0.09080651 -0.0933 0.9256328
#> X2 0.08228799 0.09158411 0.8985 0.3689210
#> X3 0.00225270 0.09113093 0.0247 0.9802788
#> X4 0.12776741 0.08180704 1.5618 0.1183317
#> X5 0.12208436 0.09093356 1.3426 0.1794124
#> X6 0.14182345 0.09224081 1.5375 0.1241624
#> X7 0.16738694 0.09266929 1.8063 0.0708742 .
#> X8 0.26618628 0.08890818 2.9939 0.0027539 **
#> [ reached getOption("max.print") -- omitted 292 rows ]
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# Predict on new data
head(predict(adjusted_fit))
#> [1] 2.36189042 1.55461876 -1.53630085 -0.03311306 -1.12050853 2.36940721
# Extract adjusted coefficients
head(adjusted_fit$coef_adj)
#> X1 X2 X3 X4 X5 X6
#> -0.008475955 0.082287991 0.002252704 0.127767408 0.122084356 0.141823446
# Extract adjusted standard error
head(adjusted_fit$std_adj)
#> X1 X2 X3 X4 X5 X6
#> 0.1575873 0.1589367 0.1581503 0.1419694 0.1578077 0.1600764