Just enough R
Introduction
I Getting started
1 Working with R
Installation
Workflow
RMarkdown
RStudio
Creating code chunks
First commands
Naming things
Vectors and lists
Vectors
Accessing elements
Selecting more than one element
Making and slicing with sequences
Conditional slicing
Working with vectors
Making new vectors
Making up data (new vectors)
Functions to learn now
Packages
II Data
2 The dataframe
Working with dataframes
Introducing the tidyverse
Selecting columns
Selecting rows
‘Operators’
Sorting
Pipes
Modifying and creating new columns
3 ‘Real’ data
Importing data
Importing data over the web
Importing from SPSS and other packages
Saving and exporting
Use CSV files
Archiving, publication and sharing
Dealing with multiple files
Joining different datasets
Types of variable
Differences in quantity: numeric variables
Differences in quality or kind
Factors for categorical data
Dates
Missing values
Tidying data
Reshaping
Which package should you use to reshape data?
Aggregating and reshaping at the same time
4 Summaries
A generalised approach
Fancy reshaping
5 Graphics
Benefits of visualising data
Which tool to use?
Layered graphics with ggplot
A thought on ‘chart chooser’ guides
Thinking like ggplot
‘Relationships’
‘Distributions’
‘Comparisons’
‘Composition’
‘Quick and dirty’ (utility) plots
5.0.1 Distributions
5.0.2 Relationships
5.0.3 Quantities
Tricks with ggplot
More ways to facet a plot
facet_wrap
facet_grid
Combining separate plots in a grid
Exporting for print
III Models
6 Commonly used statistics
6.1 Non-parametric statistics
Crosstabulations and \(\chi^2\)
Three-way tables
Correlations
Creating a correlation matrix
Working with correlation matrices
Tables for publication
Other methods for correlation
t-tests
Visualising your data first
Running a t-test
7 Regression
Describing statistical models using formulae
Running a linear model
More on formulas
Factors and variable codings
Model specification
Effect/dummy coding and contrasts
Centering (is often helpful)
Scaling inputs
Alternatives to rescaling
What next
8 Anova
Rules for using Anova in R
Recommendations for doing Anova
Anova ‘Cookbook’
Between-subjects Anova
Repeated measures or ‘split plot’ designs
Traditional repeated measures Anova
Comparison with a multilevel model
Checking assumptions
Followup tests
9 Generalized linear models
Logistic regression
10 Multilevel models
Fitting multilevel models in R
Use lmer and glmer
p values in multilevel models
Extending traditional RM Anova
Fit a simple slope for Days
Allow the effect of sleep deprivation to vary for different participants
Fitting a curve for the effect of Days
Variance partition coefficients and intraclass correlations
3 level models with ‘partially crossed’ random effects
Contrasts and followup tests using lmer
Troubleshooting
Convergence problems and simplifying the random effects structure
Bayesian multilevel models
11 Mediation and covariance modelling
Mediation
Mediation with multiple regression
Mediation Steps
Mediation example after Baron and Kenny
Testing the indirect effect
11.1 Mediation using Path models
Covariance modelling
Path models
Defining a model
Confirmatory factor analysis (CFA)
Latent variables
Defining a CFA model
CFA model fit
Modification indices
Model modification and improvement
Structural equation modelling (SEM)
‘Identification’ in CFA and SEM
Missing data
Goodness of fit statistics in CFA
12 Bayesian model fitting
Bayesian fitting of linear models via MCMC methods
Posterior probabilities for parameters
Credible intervals
Bayesian ‘p values’ for parameters
13 Power analysis
For most inferential statistics
For multilevel or generalised linear models
IV Patterns
14 Learning key patterns
15 Unpicking interactions
What is an interaction?
Visualising interactions from raw data
A painful example
Continuous predictors
16 Making predictions
Predictions vs margins
Predicted means
Effects (margins)
Continuous predictors
Predicted means and margins using lm()
Running the model
Making predictions for means
Making predictions for margins (effects of predictors)
Marginal effects
Predictions with continuous covariates
Visualising interactions
17 Models are data
Storing models in variables
Extracting results from models
‘Poking around’ with $ and @
Save time: use a broom
‘Processing’ results
Printing tables
APA formatting for free
Chi²
T-test
Anova
Multilevel models
Simplifying and re-using
Writing helper functions
Re-using code with ggplot
“Table 1”
18 Dealing with quirks of R
Rownames are evil
Working with character strings
18.0.1 Searching and replacing
Using paste to make labels
Fixing up variable after melting
Colours
Picking colours for plots
Named colours in R
ColourBrewer with ggplot
19 Getting help
Finding the backtick on your keyboard
V Explanations
20 Confidence and Intervals
The problem with confidence intervals
Forgetting that the CI depends on sample size.
21 Multiple comparisons
p values and ‘false discoveries’
Multiple tests on the same data
What to do about it?
Practical examples
22 Non-independence
23 Fixed and random effects
24 Scaling predictor variables
Standardising
Dichotomising continuous predictors (or outcomes)
25 Non-scale outcomes
25.1 Link functions
Logistic regression
26 Building and choosing models
Like maps, models are imperfect but useful
Overfitting/underfitting
Choosing the ‘right variables’
References
11 Mediation and covariance modelling