Selecting columns
To pick out one or more columns, use the select() function.
The select() function expects a dataframe as its first input (an 'argument', in
R terminology), followed by the names of the columns you want to extract, with a
comma between each name.
It returns a new dataframe with just those columns, in the order you specified:
# select() comes from the dplyr package, so load it first
library(dplyr)
head(select(mtcars, cyl, hp))
cyl hp
Mazda RX4 6 110
Mazda RX4 Wag 6 110
Datsun 710 4 93
Hornet 4 Drive 6 110
Hornet Sportabout 8 175
Valiant 6 105
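As a small sketch of the "in the order you specified" point, the example below asks for the same two columns in the opposite order. The variable name reordered is only an illustration and does not appear elsewhere in this guide.

```r
library(dplyr)

# Ask for hp first, then cyl: the result follows the order given to select(),
# not the order in which the columns appear in mtcars.
reordered <- select(mtcars, hp, cyl)
names(reordered)  # "hp" "cyl"
```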
Saving a subset of the data
Because dplyr functions return a new dataframe, we can assign the results to
a variable:
justcylandweight <- select(mtcars, cyl, wt)
summary(justcylandweight)
cyl wt
Min. :4.000 Min. :1.513
1st Qu.:4.000 1st Qu.:2.581
Median :6.000 Median :3.325
Mean :6.188 Mean :3.217
3rd Qu.:8.000 3rd Qu.:3.610
Max. :8.000 Max. :5.424
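To check that select() really does return a new dataframe rather than altering the one you passed in, you can compare dimensions before and after. This is a minimal sketch; cyl_and_weight is a hypothetical variable name chosen for the example.

```r
library(dplyr)

# Store the two-column subset under a new name (hypothetical, for illustration)
cyl_and_weight <- select(mtcars, cyl, wt)

dim(cyl_and_weight)  # 32 rows, 2 columns
dim(mtcars)          # the original still has 32 rows and 11 columns
```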
Excluding columns
If you want to keep most of the columns (perhaps you just want to get rid of
one and keep the rest), put a minus sign (-) in front of the name of the
column to drop. This selects everything except the column you named:
# Note we are just dropping the Ozone column
head(select(airquality, -Ozone))
Solar.R Wind Temp Month Day
1 190 7.4 67 5 1
2 118 8.0 72 5 2
3 149 12.6 74 5 3
4 313 11.5 62 5 4
5 NA 14.3 56 5 5
6 NA 14.9 66 5 6
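The same idea extends to dropping more than one column. The sketch below shows two ways of writing it that should give the same result; the variable names weather_only and same_result are made up for this example.

```r
library(dplyr)

# Drop two columns by negating each name separately...
weather_only <- select(airquality, -Ozone, -Solar.R)
names(weather_only)  # "Wind" "Temp" "Month" "Day"

# ...or by negating a group of names wrapped in c()
same_result <- select(airquality, -c(Ozone, Solar.R))
identical(weather_only, same_result)
```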
Matching specific columns
You can use a pattern to match a subset of the columns you want. For example,
here we select all the columns whose name contains the letter d:
head(select(mtcars, contains("d")))
disp drat
Mazda RX4 160 3.90
Mazda RX4 Wag 160 3.90
Datsun 710 108 3.85
Hornet 4 Drive 258 3.08
Hornet Sportabout 360 3.15
Valiant 225 2.76
And you can combine these techniques to make more complex selections:
head(select(mtcars, contains("d"), -drat))
disp
Mazda RX4 160
Mazda RX4 Wag 160
Datsun 710 108
Hornet 4 Drive 258
Hornet Sportabout 360
Valiant 225
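One detail worth knowing when matching on names: contains() ignores upper and lower case by default, which is controlled by its ignore.case argument. The short sketch below illustrates the difference on mtcars, whose column names are all lower case.

```r
library(dplyr)

# By default the match is case-insensitive, so "D" still finds disp and drat
names(select(mtcars, contains("D")))  # "disp" "drat"

# With ignore.case = FALSE nothing matches, and no columns are returned
names(select(mtcars, contains("D", ignore.case = FALSE)))  # character(0)
```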
Other methods of selection
As a quick reference, you can use the following selection helpers inside select() to choose columns in different ways:
starts_with()
ends_with()
contains()
everything()
See the help files for more information (type ??dplyr::select into the
console).
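As a rough illustration of the helpers listed above, here is one short call for each of the three not yet shown, again using mtcars. The prefixes and suffixes are arbitrary choices made for this example.

```r
library(dplyr)

# Columns whose names begin with "c"
names(select(mtcars, starts_with("c")))  # "cyl" "carb"

# Columns whose names end with "p"
names(select(mtcars, ends_with("p")))    # "disp" "hp"

# everything() fills in all remaining columns, which is handy for
# moving one column to the front without listing the others
names(select(mtcars, hp, everything()))[1:3]  # "hp" "mpg" "cyl"
```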