Teaching R

A brief introduction

Kumar Ramanathan

@kumarhk

May 20, 2021

Warm up

Let’s do some polling! Go to pollev.com/kumarr436

Learning objectives for today

  1. Construct a lesson plan for an R workshop/session
  2. Understand the examples-and-exercises approach to teaching R
  3. Articulate questions about how to teach R workshops/sessions
  4. Identify resources to address those questions

Outline

  • Preparing to teach
  • Determining the scope of your lesson
  • Building your lesson (nuts-and-bolts)
  • Structuring your lesson (what goes after what?)
  • Examples and Exercises

Preparing to teach

Why should you learn to teach R?

  • Teaching is the best way to learn 🤓
  • Lots of opportunities to teach R workshops and similar sessions at Northwestern 👨‍🏫
  • Build your teaching portfolio 📂
  • Training for certain non-academic career paths 💸

Questions to ask yourself

  • What are your learning objectives for the course/session/workshop?
  • What are the opportunities and constraints you’ll have in your teaching environment?
  • How much background do your students have? How much do they need?

Learning objectives

  • Much of this will be determined by context: the students may need specific skills (e.g. regression, data viz) for a class, you may be teaching general-purpose skills for data analysis, etc.
  • I suggest that every R workshop should share two objectives: Articulate questions about the [technique/method taught] and Identify resources to address those questions.
  • In simpler language: you want students to walk away knowing how to ask for help when they run into problems.

Teaching environment

  • One-off workshop vs. series
  • Virtual vs. in-person
  • Solo teaching vs. group teaching

Determining the scope of your lesson

Background knowledge

  • What do the students already know? You may be able to answer this from program structure, or through a survey.
  • Sometimes students will be coming in with different levels of background knowledge. You can try to address this before the session and/or adjust your teaching accordingly.

Do you need to cover the basics? What are the basics?

  • Installing R and RStudio
  • Navigating RStudio
  • “code”, “comments”, “objects”
  • Syntax and data types
  • Data structures
  • Reading and writing files
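As a rough illustration, the items above might look like this in a first script. This is only a sketch: the object names and the tiny data frame are made up for the example, and it uses base R functions only.

```r
# A comment: R ignores everything after the # symbol on a line.

# Objects: store a value under a name with the assignment operator <-
workshop_length <- 90            # numeric
workshop_title  <- "Intro to R"  # character
is_virtual      <- TRUE          # logical

# Data structures: a vector and a data frame
scores <- c(3, 5, 4)
students <- data.frame(
  name  = c("A", "B", "C"),
  score = scores
)

# Writing and reading files (a temporary file is used so nothing is left behind)
path <- tempfile(fileext = ".csv")
write.csv(students, path, row.names = FALSE)
students_again <- read.csv(path)

# Inspect the structure of what was read back in
str(students_again)
```

If most students can read a script like this without help, the session can probably start beyond the basics.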

Before the lesson

  • Registration: ask students to list their skills, how comfortable they feel, their goals, etc.
  • Assigning materials for background knowledge before the session: pre-session videos, DataQuest, learnr tutorials
  • Communicate info with students: versions of R and RStudio needed, what packages are needed, location of shared materials, etc.
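One way to communicate package requirements is to send students a short check script before the session. The sketch below is an assumption about how that could look: the package list is a placeholder based on the lesson snippet later in this deck, and the install line is left commented out so that students decide when to run it.

```r
# Placeholder list of packages for the session; replace with your own.
required_packages <- c("ggplot2", "dplyr", "gapminder")

# Compare against the packages already installed on this machine.
installed <- rownames(installed.packages())
missing_packages <- setdiff(required_packages, installed)

if (length(missing_packages) > 0) {
  message("Please install: ", paste(missing_packages, collapse = ", "))
  # install.packages(missing_packages)
} else {
  message("All required packages are installed.")
}

# Report the R version so students can compare it with the one you announced.
message(R.version.string)
```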

Building your lesson

  • Befriend R Markdown (and RStudio Projects)
  • Consider how to store and share materials: GitHub, Box folder, something else, none of the above
  • Other types of tools: RStudio Cloud
  • Connect to other course materials, pre-workshop assignments, etc.

The GitHub option

Why GitHub?

For you:

  • Version control!
  • Good for collaboration
  • Good integration with RStudio, easy-to-use desktop application
  • Useful skill for data science & programming outside social science
  • Training via NUIT RCS; training within department coming soon

For students:

  • Easy download options
  • They can see how your code evolved
  • Helpful exposure to a common tool in data science & programming

Structuring your lesson

Where to start

Motivation

  • Use the end point as motivation: show them what they will learn!
  • Pick some real data that the students are likely to be interested in

Help students feel comfortable

  • Remind them that they are learning a skill, which only comes with practice
  • Encourage your students to learn from each other

Core content

I usually outline the core content by:

  1. Drafting the end products and breaking down each step.
  2. Looking up existing teaching materials on the same/related topic.

I strongly suggest an examples and exercises approach to teaching skills in R. We’ll practice this in a moment.

Building flexibility into your lesson

  • Plan for a little bit more material than you can teach
  • Provide data files that students can play around with
  • For complex skills, give yourself wiggle room to skip over exercises based on timing/interest

Ending with encouragement

I always like to end with:

  1. Main takeaways
  2. Resources
  3. Reminder that they’ve just learned to create things!

Examples and Exercises

Learning by doing

  • I usually have three components: slides, lecture notes, and exercises
  • Students follow along from the exercises file, with the lecture notes as reference
  • Commenting the exercises file is key: it helps students locate themselves, and makes it easier for them to come back to the materials and learn
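A minimal sketch of what a commented exercises file could look like. The layout, section names, and use of the built-in mtcars data are illustrative assumptions, not the files used in this workshop. In RStudio, comment lines that end in four or more dashes appear as navigable code sections, which helps students find their place.

```r
# ---- Setup ----------------------------------------------------
# Run this section first. mtcars ships with R, so nothing needs installing.
cars <- mtcars

# ---- Example 1: summarizing one variable ----------------------
# We will walk through this one together.
example_mean <- mean(cars$mpg)
example_mean

# ---- Exercise 1 -----------------------------------------------
# Task: compute the median of the hp (horsepower) column.
# Hint: median() is called the same way as mean().
# Write your code below this line:


# ---- Example 2: counting groups -------------------------------
# table() counts how many cars have each number of cylinders.
cyl_counts <- table(cars$cyl)
cyl_counts
```

A matching file in the answers subdirectory can keep the same section headers with the blanks filled in.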

Lesson snippets

Open the RStudio Project (.Rproj) file and look in the working directory: you will see an exercises subdirectory and an answers subdirectory.

The following lesson snippets all use .R code files for the exercises. You can also ask students to use .Rmd, especially if this is part of a course where you will need to collect assignment submissions.

As we go through, ask any questions you have about how to design and use examples/exercises.

Lesson snippet 1: ggplot and the grammar of graphics

Components of a basic plot

  • data: a data frame, provided to the ggplot() function
  • geometric objects: the objects/shapes that you want to plot, indicated through one of the many available geom functions, such as geom_point() or geom_histogram()
  • aesthetic mapping: the mapping from the data to the geometric objects, provided in an aes() function nested within ggplot() or a geom function
  • layers: the components above are connected with the + operator

ggplot(data = <DATA FRAME>) +
  <GEOM_FUNCTION>(mapping = aes(<VARIABLES>))

The argument names can be omitted:

ggplot(<DATA FRAME>) +
  <GEOM_FUNCTION>(aes(<VARIABLES>))

Prepare data

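# Load packages: glimpse() and filter() come from dplyr; the plots use ggplot2
library(dplyr)
library(ggplot2)

# Load data
gapminder <- gapminder::gapminder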

# Look at the structure of the data. You can use glimpse(), summary(), or head().
glimpse(gapminder)
## Rows: 1,704
## Columns: 6
## $ country   <fct> "Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan", …
## $ continent <fct> Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asia, …
## $ year      <int> 1952, 1957, 1962, 1967, 1972, 1977, 1982, 1987, 1992, 1997, …
## $ lifeExp   <dbl> 28.801, 30.332, 31.997, 34.020, 36.088, 38.438, 39.854, 40.8…
## $ pop       <int> 8425333, 9240934, 10267083, 11537966, 13079460, 14880372, 12…
## $ gdpPercap <dbl> 779.4453, 820.8530, 853.1007, 836.1971, 739.9811, 786.1134, …

# Create a new data frame with only the data for 2007
gapminder07 <- filter(gapminder, year == 2007)

A basic scatterplot

ggplot(gapminder) + 
    geom_point(aes(x=year, y=pop))

A basic scatterplot

These will produce the same output:

ggplot(gapminder) +
    geom_point(aes(x=year, y=pop))

ggplot(gapminder, aes(x=year, y=pop)) +
    geom_point()
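
One way to check that the two forms are equivalent is to compare the data ggplot2 computes for each plot's point layer. This is a sketch rather than part of the original exercises; it assumes the ggplot2 and gapminder packages are installed and uses ggplot2's layer_data() helper.

```r
library(ggplot2)

gapminder <- gapminder::gapminder

# Mapping supplied inside the geom function
p1 <- ggplot(gapminder) +
  geom_point(aes(x = year, y = pop))

# Mapping supplied inside ggplot() and inherited by geom_point()
p2 <- ggplot(gapminder, aes(x = year, y = pop)) +
  geom_point()

# layer_data() returns the data computed for one layer (the first by default)
same_layer <- isTRUE(all.equal(layer_data(p1), layer_data(p2)))
same_layer
```

Putting aes() inside ggplot() is convenient when several geoms share the same mapping, since each layer inherits it.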

Add labels to the plot

ggplot(gapminder) + 
    geom_point(aes(x=year, y=pop)) + 
    labs(title="Population over time", x="Year", y="Population")

Your turn!

Plot life expectancy as a function of GDP per capita for the year 2007, and add labels.

  • Step 1: Supply the data gapminder07 to ggplot()
  • Step 2: Choose geom_point() and supply x=gdpPercap and y=lifeExp to aes()
  • Step 3: Add title, x, and y in labs()

Your turn! (one possible answer)

ggplot(gapminder07) + 
    geom_point(aes(x=gdpPercap, y=lifeExp)) + 
    labs(title="Do people in richer countries live longer?", x="GDP per capita", y="Life expectancy")

Choosing geoms

There are many geom functions we can choose to generate geometric objects: