25 Non-scale outcomes

Standard regression models are linear combinations of parameters estimated from the data. Multiplying these parameters by different values of the predictor variables gives estimates of the outcome.

However, because there’s no hard limit on the range of predictor variables (at least, no limit coded into the model itself), the predictions of a linear model can in theory range between -∞ (negative infinity) and +∞. Although values approaching infinity might be very unlikely, nothing in the model constrains either the parameters we fit (the regression coefficients) or the predictor values themselves.
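To make this concrete, here is a minimal sketch of a one-predictor linear model. The coefficients (`intercept`, `slope`) and the age-to-height framing are made up for illustration and are not estimated from any real data. The point is only that nothing in the formula stops the prediction growing or shrinking without bound as the predictor changes.

```python
# Hypothetical coefficients, chosen for illustration only.
intercept = 100.0  # predicted height (cm) when age is 0
slope = 6.0        # change in predicted height (cm) per year of age


def predict(age):
    """Linear prediction: intercept plus slope times the predictor."""
    return intercept + slope * age


# Plausible predictor values give plausible predictions; extreme or
# nonsensical ones give predictions that are improbable or impossible.
for age in (5, 10, 50, -50, 1000):
    print(f"age = {age:>5}: predicted height = {predict(age):.1f} cm")
```

The model happily returns a negative height for a negative age and a height of several metres for an age of 50, because the formula itself contains no bounds.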

Where outcome data are continuous or somewhat like a continuous variable this isn’t usually a problem. Although our models might predict some improbable values (for example, that someone is 8 feet tall), they will not often be strictly impossible.

It might occur to you at this point that a predicted height of -8 feet, or a temperature below absolute zero, would be impossible. This is true, and it is a theoretical violation of the linear model's assumption that the outcome can range between -∞ and +∞. However, researchers use linear regression to predict many outcomes which have this type of range restriction and, although such models can make strange predictions in edge cases, they are useful and make good predictions most of the time.

However, for other types of outcome (including binary or count data, or other quantities like the duration-to-failure) this often won’t be the case, and standard linear regression may fail to make sensible predictions even in cases that are not unusual.

For binary data we want to predict the probability of a positive response, and this can only range between 0 and 1. For count data, predicted outcomes must always be non-negative (i.e. zero or greater). For these data, the lack of constraint on predictions from linear regression is a problem.
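The problem with binary outcomes can be seen in a small sketch. The data below are invented for illustration (a hypothetical predictor `x` and a 0/1 outcome `y`), and the straight line is fitted with the ordinary least-squares formulas for a single predictor. Even within the observed range of `x`, the fitted line produces "probabilities" below 0 and above 1.

```python
# Invented data: a predictor and a binary (0/1) outcome.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 0, 1, 0, 1, 1, 1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Ordinary least-squares estimates for a single-predictor linear model.
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x


def predict(value):
    """Predicted 'probability' of a positive response from the linear fit."""
    return intercept + slope * value


# Flag any prediction that falls outside the valid range for a probability.
for value in x:
    p = predict(value)
    flag = "" if 0 <= p <= 1 else "  <- not a valid probability"
    print(f"x = {value}: predicted = {p:.3f}{flag}")
```

Here the predictions at the smallest and largest observed values of `x` fall outside the 0 to 1 range, so the model fails on ordinary cases and not only on extreme ones.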