Bayesian Quantile Regression


Bayesian Quantile Regression

As discussed in an earlier post quantile regression is a powerful statistical technique that allows for the estimation of conditional quantiles of the response variable. Unlike traditional regression models that focus on the mean, quantile regression provides insights into the distribution of the dependant variable at different quantiles. This enables a more comprehensive understanding of the relationship between predictors and the response across the entire distribution. In recent years, the Bayesian approach to quantile regression using the probabilistic programming language Stan has gained popularity due to its flexibility and ability to incorporate prior knowledge. The Bayesian framework provides a natural way to account for uncertainty in quantile regression. By specifying prior distributions for the model parameters, we can incorporate existing knowledge or beliefs about the relationships between predictors and quantiles. This is particularly useful in situations where limited data are available or when prior information from previous studies is available.

In several posts we presented Stan as a powerful probabilistic programming language that allows for flexible and efficient Bayesian modeling. It provides a user-friendly interface for specifying complex statistical models and performs efficient inference using Hamiltonian Monte Carlo (HMC) sampling. Bayesian quantile Regression uses the capabilities of Stan to estimate the conditional quantiles of a response variable. The key idea behind Bayesian quantile Regression with Stan is to model the conditional quantiles as a function of the predictor variables using a hierarchical Bayesian framework. The model assumes that the response variable follows a distribution that depends on the predictor variables and a set of latent variables. The latent variables capture the uncertainty in the estimation of the quantiles and are assigned prior distributions. In the Bayesian quantile regression framework, the model is specified by defining prior distributions for the regression coefficients and other relevant parameters. Using Markov chain Monte Carlo (MCMC) Stan provides a joint posterior distribution that characterizes the uncertainty in the estimates, allowing for probabilistic inference and hypothesis testing.

One advantage of the Bayesian approach to quantile regression is its ability to handle complex and flexible models. This is particularly important in healthcare research, where the relationships between predictors and health outcomes are often multifaceted and may vary across different quantiles. The Bayesian framework allows for modeling these complexities, enabling a more nuanced understanding of the factors influencing health outcomes. Another advantage of Bayesian quantile regression with Stan is its ability to incorporate informative priors. Prior knowledge or expert opinions can be explicitly incorporated into the model by assigning prior distributions to the parameters. This is especially valuable in healthcare research, where domain expertise can guide the specification of prior distributions, leading to more accurate and interpretable results. Informative priors help to regularize the estimation process, particularly when the data are limited or when there is a need to borrow strength from related studies.

Another advantage of Bayesian quantile Regression is its ability to provide uncertainty estimates for the estimated quantiles. The posterior distribution of the parameters obtained from the MCMC sampling provides a measure of uncertainty for the estimated quantiles. This allows for a more comprehensive understanding of the relationship between variables and provides a basis for decision-making under uncertainty. In addition to estimating the conditional quantiles, Bayesian quantile Regression with Stan also allows for hypothesis testing and model comparison. Hypothesis tests can be performed by comparing the posterior distributions of the parameters to a null hypothesis. Model comparison can be done using techniques such as the deviance information criterion (DIC) or the widely applicable information criterion (WAIC).

Bayesian quantile Regression with Stan is a powerful and flexible approach for estimating the conditional quantiles of a response variable, by incorporating prior information and providing uncertainty estimates, it allows for better decision-making under uncertainty. With its ability to incorporate prior knowledge and handle complex models, Bayesian quantile regression with Stan facilitates more accurate and nuanced analyses, enhancing our understanding of the factors influencing health outcomes. Likewise, by explicitly modeling the conditional quantiles of the response variable, the approach provides a comprehensive understanding of the relationships between predictors and health outcomes with a great promise for advancing healthcare research and informing evidence-based decision-making in clinical practice and policy settings.

To illustrate the use of Bayesian quantile Regression with Stan, let’s consider an example with synthetic heteroskedastic data.

Model Specification

Using bayesQR package

Simulate data from heteroskedastic regression

    set.seed(66)
    n <- 200
    X <- runif(n=n,min=0,max=10)
    X <- X
    y <- 1 + 2*X + rnorm(n=n, mean=0, sd=.6*X)

Estimate series of quantile regressions with adaptive lasso to limit execution time of the example, ndraw is set to a very low value. Set value to 5000 for a better approximation of the posterior distirubtion.

    out <- bayesQR(y~X, quantile=c(.05,.25,.5,.75,.95), alasso=TRUE, ndraw=500)

Initiate plot

    plot(X, y, main="", cex=.6, xlab="X")
    ## Add quantile regression lines to the plot (exclude first 500 burn-in draws)
    sum <- summary(out, burnin=50)
    for (i in 1:length(sum)){
      abline(a=sum[[i]]$betadraw[1,1],b=sum[[i]]$betadraw[2,1],lty=i,col=i)
    }
outOLS = lm(y~X)
    abline(outOLS,lty=1,lwd=2,col=6)
    # Add legend to plot
    legend(x=0,y=max(y),legend=c(.05,.25,.50,.75,.95,"OLS"),lty=c(1,2,3,4,5,1),
           lwd=c(1,1,1,1,1,2),col=c(1:6),title="Quantile")

Using brms package

n <- 200
x <- runif(n = n, min = 0, max = 10)
y <- 1 + 2 * x + rnorm(n = n, mean = 0, sd = 0.6*x)
dat <- data.frame(x, y)
# fit the 20%-quantile
fit <- brm(bf(y ~ x, quantile = 0.2), data = dat, family = asym_laplace())
summary(fit)

Backend stan model

functions {
  /* helper function for asym_laplace_lpdf
   * Args:
   *   y: the response value
   *   quantile: quantile parameter in (0, 1)
   */
   real rho_quantile(real y, real quantile) {
     if (y < 0) {
       return y * (quantile - 1);
     } else {
       return y * quantile;
     }
   }
  /* asymmetric laplace log-PDF for a single response
   * Args:
   *   y: the response value
   *   mu: location parameter
   *   sigma: positive scale parameter
   *   quantile: quantile parameter in (0, 1)
   * Returns:
   *   a scalar to be added to the log posterior
   */
   real asym_laplace_lpdf(real y, real mu, real sigma, real quantile) {
     return log(quantile * (1 - quantile)) -
            log(sigma) -
            rho_quantile((y - mu) / sigma, quantile);
   }
  /* asymmetric laplace log-CDF for a single quantile
   * Args:
   *   y: a quantile
   *   mu: location parameter
   *   sigma: positive scale parameter
   *   quantile: quantile parameter in (0, 1)
   * Returns:
   *   a scalar to be added to the log posterior
   */
   real asym_laplace_lcdf(real y, real mu, real sigma, real quantile) {
     if (y < mu) {
       return log(quantile) + (1 - quantile) * (y - mu) / sigma;
     } else {
       return log1m((1 - quantile) * exp(-quantile * (y - mu) / sigma));
     }
   }
  /* asymmetric laplace log-CCDF for a single quantile
   * Args:
   *   y: a quantile
   *   mu: location parameter
   *   sigma: positive scale parameter
   *   quantile: quantile parameter in (0, 1)
   * Returns:
   *   a scalar to be added to the log posterior
   */
   real asym_laplace_lccdf(real y, real mu, real sigma, real quantile) {
     if (y < mu) {
       return log1m(quantile * exp((1 - quantile) * (y - mu) / sigma));
     } else {
       return log1m(quantile) - quantile * (y - mu) / sigma;
     }
   }
}
data {
  int<lower=1> N;  // total number of observations
  vector[N] Y;  // response variable
  int<lower=1> K;  // number of population-level effects
  matrix[N, K] X;  // population-level design matrix
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  int Kc = K - 1;
  matrix[N, Kc] Xc;  // centered version of X without an intercept
  vector[Kc] means_X;  // column means of X before centering
  for (i in 2:K) {
    means_X[i - 1] = mean(X[, i]);
    Xc[, i - 1] = X[, i] - means_X[i - 1];
  }
}
parameters {
  vector[Kc] b;  // population-level effects
  real Intercept;  // temporary intercept for centered predictors
  real<lower=0> sigma;  // dispersion parameter
}
transformed parameters {
  real quantile = 0.2;  // quantile parameter
  real lprior = 0;  // prior contributions to the log posterior
  lprior += student_t_lpdf(Intercept | 3, 11, 7.8);
  lprior += student_t_lpdf(sigma | 3, 0, 7.8)
    - 1 * student_t_lccdf(0 | 3, 0, 7.8);
}
model {
  // likelihood including constants
  if (!prior_only) {
    // initialize linear predictor term
    vector[N] mu = rep_vector(0.0, N);
    mu += Intercept + Xc * b;
    for (n in 1:N) {
      target += asym_laplace_lpdf(Y[n] | mu[n], sigma, quantile);
    }
  }
  // priors including constants
  target += lprior;
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = Intercept - dot_product(means_X, b);
}

This Stan code specifies a Bayesian model for asymmetric Laplace regression, where the main goal is to estimate the population-level effects and dispersion parameter of the model from the provided data. The asymmetric Laplace distribution is used as the likelihood function for the response variable.

  1. The functions block contains three helper functions: rho_quantile, asym_laplace_lpdf, and asym_laplace_lccdf. These functions are used to calculate the asymmetric Laplace log-PDF, log-CDF, and log-CCDF for a single response variable.

  2. The data block defines the input data for the model, including the total number of observations N, the response variable Y, the number of population-level effects K, the population-level design matrix X, and a binary variable prior_only that indicates whether to ignore the likelihood (for prior-only sampling).

  3. The transformed data block preprocesses the data. It calculates the centered version of the design matrix Xc, removes the intercept from the design matrix X, and stores the column means of X before centering in the vector means_X.

  4. The parameters block defines the parameters to be estimated in the model. It includes the population-level effects b, the temporary intercept for centered predictors Intercept, and the dispersion parameter sigma.

  5. The transformed parameters block calculates the quantile parameter quantile (set to 0.2 in this case) and the prior contributions to the log posterior (lprior). The lprior term includes the priors for the Intercept and sigma parameters, which are specified as Student’s t-distributions.

  6. The model block defines the likelihood and priors for the model. The likelihood accounts for the asymmetric Laplace distribution for the response variable Y, given the linear predictor mu (calculated using the population-level effects b and Intercept) and the dispersion parameter sigma. If prior_only is true, the likelihood is ignored, and the model only considers the priors.

  7. The generated quantities block computes the actual population-level intercept b_Intercept by removing the effect of the centered predictors from the temporary intercept Intercept.

References

  • Yu, K., & Moyeed, R. A. (2001). Bayesian quantile regression. Statistics & Probability Letters, 54(4), 437-447.
  • Kottas, A., & Gelfand, A. E. (2001). Bayesian semiparametric median regression modeling. Journal of the American Statistical Association, 97(457), 109-121.
  • Koenker, R., & Xiao, Z. (2006). Quantile autoregression. Journal of the American Statistical Association, 101(475), 980-990.
  • Yu, K., & Moyeed, R. A. (2000). Bayesian quantile regression. Journal of the Royal Statistical Society: Series D (The Statistician), 49(3), 385-392.
  • Koenker, R., & Xiao, Z. (2004). Inference on the quantile regression process. Econometrica, 72(1), 71-104.
  • https://cran.r-project.org/web/packages/bayesQR/bayesQR.pdf