Importing data
If you have data outside of R, the simplest way to import it is to first save
it as a comma- or tab-separated text file, normally with the file extension
.csv or .txt [6].
Let’s say we have a file called angry_moods.csv in a data folder alongside our
.Rmd file. We can read this data using the read_csv() function from the readr
package [7]:
angry.moods <- readr::read_csv('data/angry_moods.csv')
head(angry.moods)
# A tibble: 6 x 7
  Gender Sports Anger.Out Anger.In Control.Out Control.In Anger.Expression
   <dbl>  <dbl>     <dbl>    <dbl>       <dbl>      <dbl>            <dbl>
1      2      1        18       13          23         20               36
2      2      1        14       17          25         24               30
3      2      1        13       14          28         28               19
4      2      1        17       24          23         23               43
5      1      1        16       17          26         28               27
6      1      1        16       22          25         23               38
As you can see, when loading the .csv file, read_csv() makes some assumptions
about the type of data the file contains. In this case, every column contains
whole numbers, so each has been read as a numeric (<dbl>) column. It’s worth
checking the column specification message that read_csv() prints when it loads
a file, to make sure that stray cells in the file don’t cause problems. Excel
won’t complain about this sort of thing, but R is more strict and won’t mix
text and numbers in the same column.
A common error is for stray notes or text values in a spreadsheet to cause a
column which should be numeric to be converted to the character type.
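To make this concrete, here is a minimal sketch of the problem. The file contents, the column names (id, score) and the stray value "absent" are all made up for illustration. The example writes a small temporary file, so it doesn't need any data of your own.

```r
library(readr)

# Made-up file contents: someone has typed a note into the numeric 'score' column
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "id,score",
  "1,18",
  "2,14",
  "3,absent"
), tmp)

# With guessed types, the stray text turns the whole column into character
guessed <- read_csv(tmp)
class(guessed$score)   # "character"

# Declaring the type makes readr warn about the bad cell and store it as NA
declared <- read_csv(tmp, col_types = cols(score = col_double()))
class(declared$score)  # "numeric"
problems(declared)     # lists the row and column that failed to parse
```

Declaring col_types is one possible way to catch this early. Fixing the stray cell in the original spreadsheet is often simpler.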
Once it’s loaded, you can use this new dataset like any other:
pairs(angry.moods)
Importing data over the web
One neat feature of the readr package is that you can import data from the
web, using a URL rather than a filename on your local computer. This can be
really helpful when sharing data and code with colleagues. For example, we can
load the angry_moods.csv file from a URL:
angry.moods.from.url <- readr::read_csv(
"https://raw.githubusercontent.com/benwhalley/just-enough-r/master/angry_moods.csv")
head(angry.moods.from.url)
Importing from SPSS and other packages
This is often more trouble than it’s worth. If you are using Excel, for example, it’s best just to save your data as a csv file first and import that.
But if you really must use other formats, see https://www.datacamp.com/community/tutorials/r-data-import-tutorial.
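If you do need to read an SPSS file directly, one option is the haven package. It isn't covered above, so treat this as a sketch that assumes you can install it. The data here are made up and written to a temporary .sav file first, so the example runs without an SPSS file of your own.

```r
library(haven)

# Made-up data, written to a temporary .sav file so the example is self-contained
example.data <- data.frame(id = 1:3, score = c(18, 14, 13))
tmp <- tempfile(fileext = ".sav")
write_sav(example.data, tmp)

# read_sav() returns a tibble, much like read_csv() does
from.spss <- read_sav(tmp)
head(from.spss)
```

With a real file you would replace tmp with the path to your own .sav file.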
[6] This is easy to achieve in Excel and most other stats packages using the Save As... menu item.
[7] There are also standard functions built into R, such as read.csv() or read.table(), for importing data. These are fine if you can’t install the readr package for some reason, but they are quite old and the default behaviour is sometimes counterintuitive. I recommend using the readr equivalents: read_csv() or read_tsv().
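As one small illustration of how the defaults differ, the sketch below reads the same made-up file with both functions. The file contents and column names are invented for the example.

```r
library(readr)

# Made-up file with a space in one column name
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "Anger Out,Gender",
  "18,female",
  "14,male"
), tmp)

base.df <- read.csv(tmp)   # base R
tidy.df <- read_csv(tmp)   # readr

names(base.df)  # base R rewrites the first name as Anger.Out
names(tidy.df)  # readr keeps the name as written: Anger Out

class(base.df)  # a plain data.frame
class(tidy.df)  # a tibble, which prints only the first few rows
```

In versions of R before 4.0, read.csv() also converted text columns to factors by default, which is another behaviour that often catches people out.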