Chapter 09: Models for the Binomial family

library(glmbayes)

1. Introductory Discussion

Binomial generalized linear models (GLMs) are used when the response represents binary outcomes (success/failure) or proportions (successes out of trials). They are among the most widely used GLMs in applied statistics.

Binomial regression is a standard generalized linear model (Nelder and Wedderburn 1972; McCullagh and Nelder 1989; Agresti 2015).

In classical statistics, these models are fit using:

glm(..., family = binomial(link = ...))

In glmbayes, the Bayesian analogue is:

glmb(..., family = binomial(link = ...), pfamily = dNormal(mu, Sigma))

This chapter introduces:

  1. the structure of binomial GLMs
  2. the available link functions (logit, probit, cloglog)
  3. how to specify these models in glmbayes
  4. worked examples for each link function

We build on the foundations from Chapters 07 and 08, especially the role of link functions, log‑concavity, and prior specification.

2. Binomial Likelihood and Weighted Formulation

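Binomial data arise in several equivalent representations, for example as binary (0/1) outcomes or as proportions of successes out of a known number of trials.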

In all cases, the underlying sampling model is

\[ Y_i \sim \text{Binomial}(n_i, \mu_i), \qquad 0 < \mu_i < 1, \]

where \(n_i\) is the number of trials and \(\mu_i\) is the success probability on each trial (so \(\mu_i = \Pr(Y_i = 1)\) when \(n_i = 1\)).
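The equivalence of these representations can be checked with base R's glm(). The following is a minimal sketch on simulated data (all variable names are illustrative): the same grouped data are supplied as proportions with weights, as a two-column matrix of successes and failures, and as one row per Bernoulli trial.

```r
# Simulated grouped data (illustrative only): 6 dose groups, 20 trials each.
set.seed(1)
dose   <- c(0, 1, 2, 3, 4, 5)
trials <- rep(20, 6)
succ   <- rbinom(6, size = trials, prob = plogis(-2 + 0.8 * dose))

# (a) Proportion of successes, with the number of trials as weights
fit_prop <- glm(succ / trials ~ dose, family = binomial(), weights = trials)

# (b) Two-column response: successes and failures
fit_cbind <- glm(cbind(succ, trials - succ) ~ dose, family = binomial())

# (c) One row per Bernoulli trial (0/1 response)
long <- data.frame(
  dose = rep(rep(dose, 2), times = c(succ, trials - succ)),
  y    = rep(c(1, 0), times = c(sum(succ), sum(trials - succ)))
)
fit_binary <- glm(y ~ dose, family = binomial(), data = long)

# The three fits give the same coefficient estimates
rbind(prop = coef(fit_prop), cbind = coef(fit_cbind), binary = coef(fit_binary))
```

The estimates agree up to the convergence tolerance of the fitting algorithm; the residual deviance of the 0/1 fit differs from the grouped fits, because the saturated model is different.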

2.1 Linear predictor and mean structure

A binomial GLM links the mean \(\mu_i\) to a linear predictor through

\[ \eta_i = x_i^\top \beta, \qquad \mu_i = g^{-1}(\eta_i), \]

where \(g(\cdot)\) is the chosen link function (logit, probit, cloglog, etc.).
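To see how the choice of link changes the mapping from \(\eta_i\) to \(\mu_i\), the inverse links can be evaluated directly. This sketch uses stats::make.link() from base R; the grid of \(\eta\) values is arbitrary.

```r
# Evaluate the inverse link g^{-1}(eta) for three binomial links
eta   <- c(-2, 0, 2)
links <- c("logit", "probit", "cloglog")

mu <- sapply(links, function(l) make.link(l)$linkinv(eta))
rownames(mu) <- paste0("eta=", eta)
round(mu, 3)
```

The logit and probit links are symmetric about \(\eta = 0\), where both give \(\mu = 0.5\). The cloglog link is asymmetric and gives \(\mu = 1 - e^{-1}\) at \(\eta = 0\).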

2.2 Weighted binomial log‑likelihood

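Writing \(y_i\) for the observed proportion of successes in observation \(i\) and using weights \(w_i = n_i\), the log‑likelihood (up to an additive constant that does not depend on \(\beta\)) becomes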

\[ \ell(\beta) = \sum_{i=1}^n w_i\Big[ y_i \log(\mu_i) + (1-y_i)\log(1-\mu_i) \Big]. \]

This form is used by both glm() and the Bayesian functions glmb() and rglmb().
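As a check on the weighted form, the sketch below evaluates it at the glm() estimates and compares it with logLik(). The helper weighted_loglik() is hypothetical (written for this illustration, not part of glmbayes), and the data are simulated. The two values differ only by the binomial coefficients, which do not depend on \(\beta\).

```r
# Simulated grouped data (illustrative only)
set.seed(2)
x <- seq(-1, 1, length.out = 8)
n <- rep(15, 8)                       # trials per observation
k <- rbinom(8, size = n, prob = plogis(0.3 + 1.2 * x))
y <- k / n                            # observed proportions

fit <- glm(y ~ x, family = binomial(), weights = n)

# Weighted binomial log-likelihood kernel (hypothetical helper)
weighted_loglik <- function(beta, X, y, w, linkinv = plogis) {
  mu <- linkinv(drop(X %*% beta))
  sum(w * (y * log(mu) + (1 - y) * log(1 - mu)))
}

X      <- model.matrix(fit)
kernel <- weighted_loglik(coef(fit), X, y, n)
const  <- sum(lchoose(n, k))          # log binomial coefficients

# Kernel plus constant reproduces the full log-likelihood
all.equal(kernel + const, as.numeric(logLik(fit)))
```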

2.3 Exponential‑family representation

The binomial likelihood belongs to the exponential family (McCullagh and Nelder 1989; Agresti 2015).
For a model with linear predictor

\[ \eta_i = x_i^\top \beta, \]

and mean

\[ \mu_i = g^{-1}(\eta_i), \]

the contribution of observation \(i\) to the log‑likelihood can be written as

\[ \ell_i(\beta) = w_i\Big[ y_i \log(\mu_i) + (1-y_i)\log(1-\mu_i) \Big], \]

where \(w_i\) is the number of trials (or a user‑supplied weight).

This representation does not require the link to be canonical.
The variance function is always \[ V(\mu_i) = \mu_i(1-\mu_i), \] so an observed proportion \(y_i\) based on \(w_i\) trials has \(\mathrm{Var}(y_i) = \mu_i(1-\mu_i)/w_i\), regardless of the link function.
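The exponential-family structure can be made explicit by rearranging the bracketed term. This is a short derivation; the natural parameter \(\theta_i\) and cumulant function \(b(\cdot)\) are notation introduced here for illustration:

\[ y_i \log(\mu_i) + (1-y_i)\log(1-\mu_i) = y_i \log\frac{\mu_i}{1-\mu_i} + \log(1-\mu_i) = y_i\theta_i - b(\theta_i), \]

\[ \theta_i = \log\frac{\mu_i}{1-\mu_i}, \qquad b(\theta) = \log\big(1 + e^{\theta}\big), \]

so that \(\ell_i(\beta) = w_i\big[y_i\theta_i - b(\theta_i)\big]\), with \(b'(\theta_i) = \mu_i\) and \(b''(\theta_i) = \mu_i(1-\mu_i)\). Under the logit link, \(\theta_i = \eta_i\) (the canonical case). Under the probit or cloglog link, \(\theta_i\) is a nonlinear function of \(\eta_i\), but the same expression applies.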

3. Specifying Binomial Models in glmbayes

The general Bayesian call is:

glmb(
  formula,
  family   = binomial(link = "logit"),   # or "probit", "cloglog"
  pfamily  = dNormal(mu = mu, Sigma = V),
  data     = ...
)

3.1 Prior Specification

As in earlier chapters, the recommended workflow is:

ps <- Prior_Setup(formula, family = binomial(link = "logit"), data = ...)
mu <- ps$mu
V  <- ps$Sigma

This produces a default prior mean vector (mu) and prior covariance matrix (V) for the regression coefficients.

You may override these defaults for more informative priors (see Chapter 14).
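To illustrate what a Normal prior on the coefficients does, the sketch below writes out the unnormalized log-posterior in base R and finds its mode with optim(). It does not use glmbayes and does not reflect how glmb() samples. The data are simulated, and mu0 and V0 are hypothetical prior settings, assuming a multivariate normal prior with mean mu0 and covariance V0.

```r
# Simulated binary data (illustrative only)
set.seed(3)
x <- rnorm(50)
y <- rbinom(50, size = 1, prob = plogis(-0.5 + x))
X <- cbind("(Intercept)" = 1, x = x)

mu0 <- c(0, 0)          # hypothetical prior mean
V0  <- diag(c(4, 4))    # hypothetical prior covariance

# Unnormalized log-posterior: binomial log-likelihood + normal log-prior
log_post <- function(beta) {
  eta      <- drop(X %*% beta)
  loglik   <- sum(dbinom(y, size = 1, prob = plogis(eta), log = TRUE))
  d        <- beta - mu0
  logprior <- -0.5 * drop(t(d) %*% solve(V0, d))
  loglik + logprior
}

# Posterior mode versus the maximum-likelihood estimate
mode_fit <- optim(c(0, 0), function(b) -log_post(b), method = "BFGS")
mle      <- coef(glm(y ~ x, family = binomial()))
rbind(posterior_mode = mode_fit$par, mle = unname(mle))
```

With a prior centered at zero and an isotropic covariance, the posterior mode is pulled toward zero relative to the maximum-likelihood estimate. A smaller prior variance strengthens that pull.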


8. Concluding Discussion

Binomial GLMs are a core component of the glmbayes package. Their log‑concave likelihoods make them ideal for the envelope‑based accept‑reject sampler, and the familiar link functions allow analysts to choose models that match the scientific context (McCullagh and Nelder 1989; Gelman et al. 2013).

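This chapter covered the structure of binomial GLMs, the available link functions (logit, probit, cloglog), and how to specify these models in glmbayes.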

In the next chapter, we extend these ideas to Poisson models, which share many structural similarities but introduce new considerations for count data.


Appendix A. Bayes Rules! companion — Perth rain (weather_perth)

Book: (Johnson et al. 2022), Chapter 13 — same model and informative priors as Chapter 08, Appendix A.
Model: raintomorrow ~ humidity9am (logit link)

Requires bayesrules.

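library(bayesrules)
weather <- bayesrules::weather_perth
# Recode the response to 0/1 (1 = "Yes", rain tomorrow)
weather$raintomorrow <- as.integer(weather$raintomorrow == "Yes")
# Informative prior: means, and standard deviations 0.7 and 0.035, for the two coefficients
mu_w <- matrix(c(-1.4, 0.07), nrow = 1)
colnames(mu_w) <- c("(Intercept)", "humidity9am")
Sigma_w <- diag(c(0.7^2, 0.035^2))
dimnames(Sigma_w) <- list(colnames(mu_w), colnames(mu_w))
# Book reference values: 80% interval endpoints and their midpoints
book_br09 <- data.frame(
  parameter = c("(Intercept)", "humidity9am"),
  book_lo   = c(-5.08785, 0.04147),
  book_hi   = c(-4.13450, 0.05487),
  book_mid  = c(-4.611175, 0.04817),
  check.names = FALSE
)
# Fix the seed so the posterior draws are reproducible
set.seed(2026)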
glmb_rain <- glmb(
  raintomorrow ~ humidity9am,
  family  = binomial(),
  pfamily = dNormal(mu = mu_w, Sigma = Sigma_w),
  data    = weather,
  n       = 2000
)
#> [glmb_Standardize_Model][NOTE] Posterior Hessian is moderately ill-conditioned.
#>   kappa(H) = 67316.9
print(glmb_rain)
#> 
#> Call:  glmb(formula = raintomorrow ~ humidity9am, family = binomial(), 
#>     pfamily = dNormal(mu = mu_w, Sigma = Sigma_w), n = 2000, 
#>     data = weather)
#> 
#> Posterior Mean Coefficients:
#> (Intercept)  humidity9am  
#>    -3.94628      0.03891  
#> 
#> Effective Number of Parameters: 1.76987 
#> Expected Residual Deviance: 874.5654 
#> DIC: 876.3353
br09_compare <- data.frame(
  parameter = book_br09$parameter,
  `Book 80% lo` = book_br09$book_lo,
  `Book 80% hi` = book_br09$book_hi,
  `Book midpoint` = book_br09$book_mid,
  `glmb Post.Mean` = as.numeric(glmb_rain$coef.means[book_br09$parameter]),
  `glmb Post.Sd`   = sapply(book_br09$parameter, function(p)
    sd(glmb_rain$coefficients[, p, drop = TRUE])),
  check.names = FALSE
)
knitr::kable(br09_compare, digits = 4,
  caption = "Bayes Rules! Ch. 13 vs. glmb() (informative priors)")
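Table: Bayes Rules! Ch. 13 vs. glmb() (informative priors)

|            |parameter   | Book 80% lo| Book 80% hi| Book midpoint| glmb Post.Mean| glmb Post.Sd|
|:-----------|:-----------|-----------:|-----------:|-------------:|--------------:|------------:|
|(Intercept) |(Intercept) |     -5.0879|     -4.1345|       -4.6112|        -3.9463|       0.3172|
|humidity9am |humidity9am |      0.0415|      0.0549|        0.0482|         0.0389|       0.0046|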
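The table sets the book's 80% intervals against the glmb() posterior mean and standard deviation. For a like-for-like comparison, an 80% interval can be computed from the posterior draws. The helper central_interval() below is a hypothetical sketch (not part of glmbayes) that returns equal-tailed intervals from a draws-by-parameter matrix. It is demonstrated on simulated draws, and it assumes glmb_rain$coefficients has one column per parameter, as the sd() call above suggests.

```r
# Equal-tailed posterior interval for each column of a draws matrix
# (hypothetical helper, written for this illustration)
central_interval <- function(draws, level = 0.80) {
  alpha <- (1 - level) / 2
  out <- t(apply(draws, 2, quantile, probs = c(alpha, 1 - alpha)))
  colnames(out) <- c("lo", "hi")
  out
}

# Simulated stand-in for posterior draws (illustrative only)
set.seed(4)
fake_draws <- cbind(
  "(Intercept)" = rnorm(2000, mean = 0, sd = 1),
  "slope"       = rnorm(2000, mean = 1, sd = 0.1)
)
central_interval(fake_draws)

# With the fit above, the analogous call would be:
# central_interval(glmb_rain$coefficients)
```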

Multivariable extensions (humidity3pm, raintoday) use Prior_Setup() defaults on extra coefficients unless you add further informative book priors (Chapter 14).

References

Agresti, Alan. 2015. Foundations of Linear and Generalized Linear Models. Cambridge University Press.
Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2013. Bayesian Data Analysis. 3rd ed. CRC Press.
Griffin, Jim E., and Philip J. Brown. 2010. “Inference with Normal-Gamma Prior Distributions in Regression Problems.” Bayesian Analysis 5 (1): 171–88. https://doi.org/10.1214/10-BA507.
Johnson, Alicia A., Miles Q. Ott, and Mine Dogucu. 2022. Bayes Rules! An Introduction to Applied Bayesian Modeling. CRC Press. https://www.bayesrulesbook.com.
McCullagh, P., and J. A. Nelder. 1989. Generalized Linear Models. Chapman and Hall.
Nelder, J. A., and R. W. M. Wedderburn. 1972. “Generalized Linear Models.” Journal of the Royal Statistical Society. Series A (General) 135 (3): 370–84. https://doi.org/10.2307/2344614.
Spiegelhalter, David J., Nicky G. Best, Bradley P. Carlin, and Angelika van der Linde. 2002. “Bayesian Measures of Model Complexity and Fit.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64 (4): 583–639. https://doi.org/10.1111/1467-9868.00353.