Working with character strings
18.0.1 Searching and replacing
If you want to search inside a string there are lots of useful functions in the
stringr:: library. These replicate some functionality in base R, but like
other packages in the ‘tidyverse’ they tend to be more consistent and easier to
use. For example:
cheese <- c("Stilton", "Brie", "Cheddar")
stringr::str_detect(cheese, "Br")
[1] FALSE TRUE FALSE
stringr::str_locate(cheese, "i")
start end
[1,] 3 3
[2,] 3 3
[3,] NA NA
stringr::str_replace(cheese, "Stil", "Mil")
[1] "Milton" "Brie" "Cheddar"
Using paste to make labels
Paste can combine character strings with other types of variable to produce a new vector:
paste(mtcars$cyl, "cylinders")[1:10]
[1] "6 cylinders" "6 cylinders" "4 cylinders" "6 cylinders" "8 cylinders"
[6] "6 cylinders" "8 cylinders" "4 cylinders" "4 cylinders" "6 cylinders"
Which can be a useful way to label graphs:
mtcars %>%
ggplot(aes(paste(mtcars$cyl, "cylinders"), mpg)) +
geom_boxplot() + xlab("")

Fixing up variable after melting
In this example melt() creates a new column called variable.
sleep.wide %>%
melt(id.var="Subject") %>%
arrange(Subject, variable) %>%
head
Subject variable value
1 1 Day.0 249.5600
2 1 Day.1 258.7047
3 1 Day.2 250.8006
4 1 Day.3 321.4398
5 1 Day.4 356.8519
6 1 Day.5 414.6901
However the contents of variable are now a character string (i.e. a list of
letters and numbers) rather than numeric values (see
column types) but in this instance we know that the
values Day.1, Day.2… are not really separate categories but actually form
a linear sequence, from 1 to 9.
We can use the extract or separate functions to split up variable and
create a numeric column for Day:
sleep.long %>%
separate(variable, c("variable", "Day")) %>%
mutate(Day=as.numeric(Day)) %>%
arrange(Subject) %>%
head %>% pander
| Subject | variable | Day | value |
|---|---|---|---|
| 1 | Day | 0 | 249.6 |
| 1 | Day | 1 | 258.7 |
| 1 | Day | 2 | 250.8 |
| 1 | Day | 3 | 321.4 |
| 1 | Day | 4 | 356.9 |
| 1 | Day | 5 | 414.7 |
See the user guide for separate and extract for more details.
If you are familiar with
regular expressions
you will be happy to know that you can use regex to separate variables using
extract and separate.
See this guide for more details on how separate and extract work