## 25.1 Link functions

Logistic and Poisson regression extend regular linear regression, allowing us to
*constrain* the model to predict within the range of possible outcomes.
To achieve this, logistic regression, Poisson regression and other members of
the family of ‘generalised linear models’ use different ‘link functions’.

Link functions are used to connect the outcome variable to the linear model (that is, the linear combination of the parameters estimated for each of the predictors in the model). This means we can use linear models which still predict between -∞ and +∞, but without making inappropriate predictions.

For linear regression the link is simply the identity function: that is, the linear model directly predicts the outcome. However, for other types of model different functions are used.
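As a quick illustration, the family objects in base R's `stats` package record the default link used by each type of model. This is only a sketch for exploring the idea at the console, not part of any model-fitting code.

```r
# Default link for each family (base R, stats package)
gaussian()$link  # "identity": ordinary linear regression
binomial()$link  # "logit": logistic regression
poisson()$link   # "log": Poisson regression

# The identity link leaves values unchanged
gaussian()$linkfun(c(-2, 0, 3.5))
```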

A good way to think about link functions is as a *transformation* of the model’s
predictions. For example, in logistic regression predictions from the linear
model are transformed in such a way that they are constrained to fall between 0
and 1. Thus, although the underlying linear model allows values between -∞
and +∞, the link function ensures predictions fall between 0 and 1.
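A small sketch of this constraint, assuming some made-up values from a linear model: base R's `plogis()` applies the transformation used in logistic regression, and however extreme the input, the result stays between 0 and 1.

```r
# Hypothetical values from a linear model, on the -Inf to +Inf scale
linear_predictions <- c(-10, -2, 0, 2, 10)

# plogis() is the inverse of the logit link (base R, stats package)
predicted_probabilities <- plogis(linear_predictions)
round(predicted_probabilities, 3)

# Every transformed value lies strictly between 0 and 1
all(predicted_probabilities > 0 & predicted_probabilities < 1)
```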

### Logistic regression

When we have binary data, we want to be able to run something like regression, but
where we predict a *probability* of the outcome.

Because probabilities are limited to between 0 and 1, to link the data with the linear model we need to transform them so they range from -∞ (infinity) to +∞.

You can think of the solution as coming in two steps:

#### Step 1

We can transform a probability on the 0—1 scale to a \(0 \rightarrow ∞\) scale
by converting it to *odds*, which are expressed as a ratio:

\[odds = \dfrac{p}{1-p}\]

Probabilities and odds are two *equivalent* ways of expressing the same
idea.

So a probability of .5 equates to odds of 1 (i.e. 1 to 1); *p* = .6
equates to odds of 1.5 (that is, 1.5 to 1, or 3 to 2), and *p* = .95 equates to
odds of 19 (19 to 1).

Odds convert or *map* probabilities from 0 to 1 onto the
real numbers from 0 to ∞.

We can reverse the transformation too (which is important later) because:

\[\textrm{probability} = \dfrac{\textrm{odds}}{1+\textrm{odds}}\]
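The two formulas can be sketched in R as a pair of small functions. The names `prob_to_odds` and `odds_to_prob` are made up for this illustration and are not part of base R.

```r
# Hypothetical helper functions, named here for illustration only
prob_to_odds <- function(p) p / (1 - p)
odds_to_prob <- function(odds) odds / (1 + odds)

p <- c(0.5, 0.6, 0.95)
odds <- prob_to_odds(p)
odds  # approximately 1, 1.5 and 19

# Converting back recovers the original probabilities
all.equal(odds_to_prob(odds), p)
```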

If a bookie gives odds of 66:1 on a horse, what probability does he think it has of winning?

Why do bookies use odds and not probabilities?

Should researchers use odds or probabilities when discussing with members of the public?

#### Step 2

When we convert a probability to odds, the odds will always be > zero.

This is still a problem for our linear model. We’d like our ‘regression’ coefficients to be able to vary between -∞ and ∞.

To avoid this restriction, we can take the *logarithm* of the odds — sometimes
called the *logit*.

The figure below shows the transformation of probabilities between 0 and 1 to the log-odds scale. The logit has two nice properties:

- It converts odds of less than one to negative numbers, because the *log* of a number between 0 and 1 is always negative[^1].
- It flattens the rather square curve for the odds in the figure above.
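To see the logit numerically, here is a minimal sketch that takes a few arbitrary probabilities through both steps. It also checks the result against `qlogis()`, the base R function that computes the logit directly.

```r
p <- c(0.05, 0.25, 0.5, 0.75, 0.95)

odds <- p / (1 - p)    # step 1: probability to odds
log_odds <- log(odds)  # step 2: odds to log-odds (the logit)

# Odds below 1 (p < 0.5) give negative log-odds; odds above 1 give positive
round(data.frame(p, odds, log_odds), 3)

# qlogis() computes the same logit directly from p
all.equal(log_odds, qlogis(p))
```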

#### Reversing the process to interpret the model

As we’ve seen here, the logit or logistic link function transforms probabilities between 0 and 1 to the range from negative to positive infinity.

This means logistic regression coefficients are in log-odds units, so we must interpret logistic regression coefficients differently from regular regression with continuous outcomes.

In linear regression, the coefficient is the change in the outcome for a unit change in the predictor.

For logistic regression, the coefficient is the *change in the log odds of the outcome being 1, for a unit change in the predictor*.

If we want to interpret logistic regression in terms of probabilities, we need to undo the transformation described in steps 1 and 2. To do this:

- We take the exponent of the logit to ‘undo’ the log transformation. This gives us the *predicted odds*.
- We convert the odds back to probability.

#### A hypothetical example

Imagine we have a model to predict whether a person has any children. The outcome is binary, so it equals 1 if the person has any children, and 0 otherwise.

The model has an intercept and one predictor, \(age\) in years. We estimate two parameters: \(\beta_0 = 0.5\) and \(\beta_{1} = 0.02\).

The outcome (\(y\)) of the linear model is the log-odds.

The model prediction is: \(\hat{y} = \beta_0 + \beta_1\textrm{age}\)

So, for someone aged 30:

- the predicted log-odds = \(0.5 + 0.02 * 30 = 1.1\)
- the predicted odds = \(exp(1.1) = 3.004\)
- the predicted probability = \(3.004 / (1 + 3.004) = .75\)

For someone aged 40:

- the predicted log-odds = \(0.5 + 0.02 * 40 = 1.3\)
- the predicted odds = \(exp(1.3) = 3.669\)
- the predicted probability = \(3.669 / (1 + 3.669) = .79\)
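The same calculation can be reproduced in R. This sketch simply plugs in the two parameter values from the hypothetical example. No model is actually fitted, and the variable names are chosen for illustration.

```r
# Parameter values taken from the hypothetical example
b0 <- 0.5
b1 <- 0.02
age <- c(30, 40)

log_odds <- b0 + b1 * age  # prediction from the linear model
odds <- exp(log_odds)      # undo the log to get the predicted odds
prob <- odds / (1 + odds)  # convert the odds to a probability

round(data.frame(age, log_odds, odds, prob), 3)
```

With a fitted `glm()` model you would normally get these values from `predict()` instead of computing them by hand.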

#### Regression coefficients are odds ratios

One final twist: In the section above we said that in logistic regression the
coefficients are the *change in the log odds of the outcome being 1, for a unit
change in the predictor*.

Without going into too much detail, one nice fact about logs is that the difference between the logs of two numbers is equal to the log of the first number divided by the second:

```
A <- log(1) - log(5)
B <- log(1 / 5)
# Floating-point arithmetic can leave tiny differences between A and B,
# so we compare them with a tolerance rather than with ==
isTRUE(all.equal(A, B))
[1] TRUE
```

This means that a *change in the log-odds* is the same as the log of the *ratio of the odds*.

So, once we undo the log transformation by taking the exponent of the
coefficient, we are left with the *odds ratio*.
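As a rough check of this claim, the sketch below reuses the made-up parameters from the children example (\(\beta_0 = 0.5\), \(\beta_1 = 0.02\)). The helper `odds_at` is a hypothetical name introduced only for this illustration.

```r
# Values from the hypothetical children example
b0 <- 0.5
b1 <- 0.02

# Predicted odds of having children at a given age
odds_at <- function(age) exp(b0 + b1 * age)

# Ratio of the odds for two people one year apart in age
odds_ratio <- odds_at(31) / odds_at(30)

# This matches the exponentiated coefficient
all.equal(odds_ratio, exp(b1))
round(exp(b1), 3)
```

The ratio is the same whichever pair of adjacent ages we pick, which is why a single odds ratio can summarise the effect of the predictor.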

You can now jump back to running logistic regression.