mdmb_regression.Rd
Several regression functions that allow for sampling weights and prior distributions.
The function yjt_regression performs a linear regression in which the response variable is transformed according to the Yeo-Johnson transformation (Yeo & Johnson, 2000; see yjt_dist) and the residuals follow a scaled \(t\) distribution. The degrees of freedom of the \(t\) distribution can be fixed or estimated (est_df=TRUE).
The function bct_regression offers the same functionality as the Yeo-Johnson variant but employs a Box-Cox transformation of the outcome variable.
The Yeo-Johnson transformation can be extended by a probit transformation (probit=TRUE) to cover the case of bounded variables on \([0,1]\).
The function logistic_regression performs logistic regression for dichotomous data. The function oprobit_regression performs ordinal probit regression for ordinal polytomous data.
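As a minimal sketch of the probit extension described above (the simulation settings here are illustrative assumptions, not taken from the package examples), an outcome bounded on \((0,1)\), such as a proportion, could be fitted as follows:

```r
# Sketch: Yeo-Johnson t regression with the probit extension (probit=TRUE)
# for a bounded outcome on (0,1); simulation settings are assumptions.
set.seed(321)
N <- 400
x <- stats::rnorm(N)
# simulate a bounded outcome via the normal CDF (inverse probit)
y <- stats::pnorm( -0.3 + 0.8 * x + stats::rnorm(N, sd=0.9) )
dat <- data.frame( y=y, x=x )

# probit=TRUE maps the bounded outcome to the real line before the
# Yeo-Johnson transformation is applied
mod <- mdmb::yjt_regression( y ~ x, data=dat, probit=TRUE )
summary(mod)
```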
#---- linear regression with Yeo-Johnson transformed scaled t distribution
yjt_regression(formula, data, weights=NULL, beta_init=NULL, beta_prior=NULL,
    df=Inf, lambda_fixed=NULL, probit=FALSE, est_df=FALSE, df_min=0.5,
    df_max=100, use_grad=2, h=1e-5, optimizer="optim", maxiter=300,
    control=NULL)

# S3 method for yjt_regression
coef(object, ...)
# S3 method for yjt_regression
logLik(object, ...)
# S3 method for yjt_regression
predict(object, newdata=NULL, trafo=TRUE, ...)
# S3 method for yjt_regression
summary(object, digits=4, file=NULL, ...)
# S3 method for yjt_regression
vcov(object, ...)

#---- linear regression with Box-Cox transformed scaled t distribution
bct_regression(formula, data, weights=NULL, beta_init=NULL, beta_prior=NULL,
    df=Inf, lambda_fixed=NULL, est_df=FALSE, use_grad=2, h=1e-5,
    optimizer="optim", maxiter=300, control=NULL)

# S3 method for bct_regression
coef(object, ...)
# S3 method for bct_regression
logLik(object, ...)
# S3 method for bct_regression
predict(object, newdata=NULL, trafo=TRUE, ...)
# S3 method for bct_regression
summary(object, digits=4, file=NULL, ...)
# S3 method for bct_regression
vcov(object, ...)

#---- logistic regression
logistic_regression(formula, data, weights=NULL, beta_init=NULL,
    beta_prior=NULL, use_grad=2, h=1e-5, optimizer="optim", maxiter=300,
    control=NULL)

# S3 method for logistic_regression
coef(object, ...)
# S3 method for logistic_regression
logLik(object, ...)
# S3 method for logistic_regression
predict(object, newdata=NULL, ...)
# S3 method for logistic_regression
summary(object, digits=4, file=NULL, ...)
# S3 method for logistic_regression
vcov(object, ...)

#---- ordinal probit regression
oprobit_regression(formula, data, weights=NULL, beta_init=NULL, use_grad=2,
    h=1e-5, optimizer="optim", maxiter=300, control=NULL,
    control_optim_fct=NULL)

# S3 method for oprobit_regression
coef(object, ...)
# S3 method for oprobit_regression
logLik(object, ...)
# S3 method for oprobit_regression
predict(object, newdata=NULL, ...)
# S3 method for oprobit_regression
summary(object, digits=4, file=NULL, ...)
# S3 method for oprobit_regression
vcov(object, ...)
formula | Formula |
---|---|
data | Data frame. For logistic_regression, the dependent variable must be coded as 0 and 1. |
weights | Optional vector of sampling weights |
beta_init | Optional vector of initial regression coefficients |
beta_prior | Optional list containing priors for all parameters (see Examples for the definition of this list) |
df | Fixed degrees of freedom for the scaled \(t\) distribution |
lambda_fixed | Optional fixed value of the transformation parameter \(\lambda\) |
probit | Logical indicating whether a probit transformation should be employed for a bounded outcome on \([0,1]\) |
est_df | Logical indicating whether the degrees of freedom of the \(t\) distribution should be estimated |
df_min | Minimum value for estimated degrees of freedom |
df_max | Maximum value for estimated degrees of freedom |
use_grad | Computation method for gradients in the optimization routine |
h | Numerical differentiation parameter |
optimizer | Type of optimizer to be chosen. Options are "optim" (stats::optim) and "nlminb" (stats::nlminb) |
maxiter | Maximum number of iterations |
control | Optional arguments to be passed to the optimization function (stats::optim or stats::nlminb) |
control_optim_fct | Optional control argument for the gradient in the optimization |
object | Fitted regression object (e.g., of class yjt_regression, bct_regression, logistic_regression, or oprobit_regression) |
newdata | Design matrix for the predict method |
trafo | Logical indicating whether fitted values should be on the transformed metric (trafo=TRUE) or the original metric (trafo=FALSE) |
digits | Number of digits used for rounding |
file | File name if the summary output should be written to a file |
... | Further arguments to be passed. |
List containing values (depending on the fitted model):
- Estimated regression coefficients
- Estimated covariance matrix
- Parameter table
- Vector of values of the dependent variable
- Design matrix
- Sampling weights
- Fitted values in the metric of probabilities
- Fitted values in the metric of logits
- Log-likelihood value
- Log prior value
- Log posterior value
- Deviance
- Case-wise likelihood
- Information criteria
- Pseudo R-square value according to McKelvey and Zavoina
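The McKelvey-Zavoina pseudo R-square can be reproduced by hand from the linear predictor: it is the variance of the linear predictor divided by that variance plus the residual variance (1 for probit models, \(\pi^2/3\) for logit models). A base-R sketch using stats::glm (the simulated data here mirror Example 1 below and are illustrative):

```r
# Sketch of the McKelvey-Zavoina pseudo R^2 for a logistic regression:
# R2_MZ = Var(eta) / ( Var(eta) + pi^2/3 ), where eta is the linear predictor.
set.seed(986)
N <- 500
x <- stats::rnorm(N)
y <- 1*( stats::runif(N) < stats::plogis( -0.8 + 1.2 * x ) )
mod <- stats::glm( y ~ x, family="binomial" )   # logistic regression
eta <- stats::predict( mod, type="link" )       # linear predictor (logits)
R2_MZ <- stats::var(eta) / ( stats::var(eta) + pi^2 / 3 )
R2_MZ
```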
McKelvey, R., & Zavoina, W. (1975). A statistical model for the analysis of ordinal level dependent variables. Journal of Mathematical Sociology, 4(1), 103-120. doi: 10.1080/0022250X.1975.9989847
Yeo, I.-K., & Johnson, R. (2000). A new family of power transformations to improve normality or symmetry. Biometrika, 87(4), 954-959. doi: 10.1093/biomet/87.4.954
See yjt_dist or car::yjPower for functions for the Yeo-Johnson transformation.
See stats::lm and stats::glm for linear and logistic regression models.
#############################################################################
# EXAMPLE 1: Simulated example logistic regression
#############################################################################

#--- simulate dataset
set.seed(986)
N <- 500
x <- stats::rnorm(N)
y <- 1*( stats::runif(N) < stats::plogis( -0.8 + 1.2 * x ) )
data <- data.frame( x=x, y=y )

#--- estimate logistic regression with mdmb::logistic_regression
mod1 <- mdmb::logistic_regression( y ~ x, data=data )
summary(mod1)

if (FALSE) {
#--- estimate logistic regression with stats::glm
mod1b <- stats::glm( y ~ x, data=data, family="binomial")
summary(mod1b)

#--- estimate logistic regression with prior distributions
b0 <- list( "dnorm", list(mean=0, sd=100) )          # first parameter
b1 <- list( "dcauchy", list(location=0, scale=2.5) ) # second parameter
beta_priors <- list( b0, b1 )  # order in list defines priors for parameters

#* estimation
mod2 <- mdmb::logistic_regression( y ~ x, data=data, beta_prior=beta_priors )
summary(mod2)

#############################################################################
# EXAMPLE 2: Yeo-Johnson transformed scaled t regression
#############################################################################

#*** create simulated data
set.seed(9865)
n <- 1000
x <- stats::rnorm(n)
y <- .5 + 1*x + .7*stats::rt(n, df=8 )
y <- mdmb::yj_antitrafo( y, lambda=.5 )
dat <- data.frame( y=y, x=x )
# display data
graphics::hist(y)

#--- Model 1: fit regression model with transformed normal distribution (df=Inf)
mod1 <- mdmb::yjt_regression( y ~ x, data=dat )
summary(mod1)

#--- Model 2: fit regression model with transformed scaled t distribution (df=10)
mod2 <- mdmb::yjt_regression( y ~ x, data=dat, df=10)
summary(mod2)

#--- Model 3: fit regression model with transformed normal distribution (df=Inf)
#    and fixed transformation parameter lambda of .5
mod3 <- mdmb::yjt_regression( y ~ x, data=dat, lambda_fixed=.5)
summary(mod3)

#--- Model 4: fit regression model with transformed normal distribution (df=Inf)
#    and fixed transformation parameter lambda of 1
#    -> This model corresponds to least squares regression
mod4 <- mdmb::yjt_regression( y ~ x, data=dat, lambda_fixed=1)
summary(mod4)
# fit with lm function
mod4b <- stats::lm( y ~ x, data=dat )
summary(mod4b)

#--- Model 5: fit regression model with estimated degrees of freedom
mod5 <- mdmb::yjt_regression( y ~ x, data=dat, est_df=TRUE)
summary(mod5)

#** compare log-likelihood values
logLik(mod1)
logLik(mod2)
logLik(mod3)
logLik(mod4)
logLik(mod4b)
logLik(mod5)

#############################################################################
# EXAMPLE 3: Regression with Box-Cox and Yeo-Johnson transformations
#############################################################################

#*** simulate data
set.seed(985)
n <- 1000
x <- stats::rnorm(n)
y <- .5 + 1*x + stats::rnorm(n, sd=.7 )
y <- mdmb::bc_antitrafo( y, lambda=.5 )
dat <- data.frame( y=y, x=x )

#--- Model 1: fit regression model with Box-Cox transformation
mod1 <- mdmb::bct_regression( y ~ x, data=dat )
summary(mod1)

#--- Model 2: fit regression model with Yeo-Johnson transformation
mod2 <- mdmb::yjt_regression( y ~ x, data=dat )
summary(mod2)

#--- compare fit
logLik(mod1)
logLik(mod2)

#############################################################################
# EXAMPLE 4: Ordinal probit regression
#############################################################################

#--- simulate data
set.seed(987)
N <- 1500
x <- stats::rnorm(N)
z <- stats::rnorm(N)
# regression coefficients
b0 <- -.5 ; b1 <- .6 ; b2 <- .1
# vector of thresholds
thresh <- c(-1, -.3, 1)
yast <- b0 + b1 * x + b2*z + stats::rnorm(N)
y <- as.numeric( cut( yast, c(-Inf,thresh,Inf) ) ) - 1
dat <- data.frame( x=x, y=y, z=z )

#--- probit regression
mod <- mdmb::oprobit_regression( formula=y ~ x + z + I(x*z), data=dat)
summary(mod)
}