Modifying and creating new columns

We often want to compute new columns from data we already have.

Imagine we had heights in centimeters and weights in kilograms for 100 participants in a study on weight loss:

library(tidyverse)  # provides tibble(), mutate() and %>%

set.seed(1234)  # make the simulated data reproducible

weightloss <- tibble(
    height_cm = rnorm(100, 150, 20),
    weight_kg = rnorm(100, 65, 10)
)
weightloss %>% head
# A tibble: 6 x 2
  height_cm weight_kg
      <dbl>     <dbl>
1      126.      69.1
2      156.      60.3
3      172.      65.7
4      103.      60.0
5      159.      56.7
6      160.      66.7

If we want to compute each participant’s body mass index (BMI), we first need to convert their height into meters. We do this with mutate, which adds a new column computed from existing ones:

weightloss %>%
  mutate(height_meters = height_cm / 100) %>%
  head
# A tibble: 6 x 3
  height_cm weight_kg height_meters
      <dbl>     <dbl>         <dbl>
1      126.      69.1          1.26
2      156.      60.3          1.56
3      172.      65.7          1.72
4      103.      60.0          1.03
5      159.      56.7          1.59
6      160.      66.7          1.60

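We can then calculate BMI as weight in kilograms divided by the square of height in meters. A column created earlier in a mutate call can be used by later expressions in the same call, so both steps fit together: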

weightloss %>%
  mutate(height_meters = height_cm / 100,
         bmi = weight_kg / height_meters ^ 2) %>%
  head
# A tibble: 6 x 4
  height_cm weight_kg height_meters   bmi
      <dbl>     <dbl>         <dbl> <dbl>
1      126.      69.1          1.26  43.7
2      156.      60.3          1.56  24.9
3      172.      65.7          1.72  22.3
4      103.      60.0          1.03  56.4
5      159.      56.7          1.59  22.6
6      160.      66.7          1.60  26.0
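
Note that none of the calls above change weightloss itself: mutate returns a new tibble and leaves its input untouched. The following is a minimal sketch of keeping the new columns by assigning the result to a name. The name weightloss_bmi is hypothetical and chosen only for this illustration, and the data is re-simulated so the block runs on its own.

```r
library(dplyr)  # provides tibble(), mutate() and %>%

set.seed(1234)
weightloss <- tibble(
  height_cm = rnorm(100, 150, 20),
  weight_kg = rnorm(100, 65, 10)
)

# mutate() returns a new tibble; weightloss is unchanged
# unless we assign the result to a name.
# weightloss_bmi is a hypothetical name used for this example.
weightloss_bmi <- weightloss %>%
  mutate(height_meters = height_cm / 100,
         bmi = weight_kg / height_meters ^ 2)

names(weightloss)      # the original two columns only
names(weightloss_bmi)  # the original columns plus height_meters and bmi
```

Assigning back to the same name (weightloss <- weightloss %>% mutate(...)) would instead replace the original data with the extended version.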

You could skip the intermediate step of converting to meters and write: bmi = weight_kg / (height_cm / 100) ^ 2. But it’s often clearer to be explicit and keep each operation simple.
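
To check that the two approaches agree, here is a small sketch that computes BMI both ways and compares the results. The names two_step and one_step are hypothetical, introduced only for this comparison. In R, ^ binds more tightly than /, so the parentheses around height_cm / 100 are needed in the one-step version.

```r
library(dplyr)  # provides tibble(), mutate() and %>%

set.seed(1234)
weightloss <- tibble(
  height_cm = rnorm(100, 150, 20),
  weight_kg = rnorm(100, 65, 10)
)

# Explicit version: convert to meters first, then compute BMI
two_step <- weightloss %>%
  mutate(height_meters = height_cm / 100,
         bmi = weight_kg / height_meters ^ 2)

# Compact version: convert and square inside a single expression
one_step <- weightloss %>%
  mutate(bmi = weight_kg / (height_cm / 100) ^ 2)

# Compare the two bmi columns
all.equal(two_step$bmi, one_step$bmi)
```

The compact version saves a column but hides the unit conversion inside the formula, which is the trade-off described above.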