Class Meeting 10 Tibble Joins

Today’s topic is operations on two or more tibbles.

10.1 Worksheet

You can find a worksheet template for today here.

10.2 Resources

For an overview of operations involving multiple tibbles, check out Jenny’s Chapter 14 in stat545.com.

For more activities, check out Rashedul’s guest lecture material from 2018.

10.3 Join Functions (25 min)

Often, we need to work with data that lives in more than one table. There are four main types of operations that can be done with two tables (the first three are elaborated in the r4ds Chapter 13 Introduction):

  • Mutating joins add new columns to the “original” tibble.
  • Filtering joins filter the “original” tibble’s rows.
  • Set operations work as if each row is an element in a set.
  • Binding stacks tables on top of or beside each other, with bind_rows() and bind_cols().
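To make the four types concrete, here is a minimal sketch using dplyr. The tibbles bands and instruments are hypothetical toy data made up for illustration, not part of the worksheet or the singer package.

```r
library(dplyr)
library(tibble)

# Hypothetical toy tibbles, invented for this example
bands <- tribble(
  ~name,  ~band,
  "Mick", "Stones",
  "John", "Beatles",
  "Paul", "Beatles"
)
instruments <- tribble(
  ~name,   ~plays,
  "John",  "guitar",
  "Paul",  "bass",
  "Keith", "guitar"
)

# Mutating join: keeps every row of bands and adds the plays column
joined <- left_join(bands, instruments, by = "name")

# Filtering joins: keep or drop rows of bands without adding columns
with_match    <- semi_join(bands, instruments, by = "name")
without_match <- anti_join(bands, instruments, by = "name")

# Set operation: compares whole rows, so both inputs need the same columns
common_names <- intersect(bands["name"], instruments["name"])

# Binding: stacks rows; a column missing from one table is filled with NA
stacked <- bind_rows(bands, instruments)
```

Printing each result and comparing its rows and columns with bands is a quick way to see which operations add columns and which only change the rows.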

Let’s navigate to the links for the first three of these, which lead to the relevant r4ds chapters, and go through the concepts there. These have excellent visuals to explain what’s going on.

Then, let’s go through Jenny’s join cheatsheet for examples.

10.4 Activity (25 min)

Let’s complete today’s worksheet.

If you can’t install the singer package, load the data directly by running these lines (read_csv() comes from the readr package):

library(readr)
songs <- read_csv("https://raw.githubusercontent.com/STAT545-UBC/Classroom/master/data/singer/songs.csv")
locations <- read_csv("https://raw.githubusercontent.com/STAT545-UBC/Classroom/master/data/singer/loc.csv")
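Before joining two tables, it can help to check which columns they share and which rows will fail to match. The sketch below shows one way to do that. The tibbles x and y and their column names are hypothetical stand-ins, and the real songs and locations tables may be laid out differently.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-ins for two related tables; column names are assumptions
x <- tibble(
  title  = c("A", "B", "C"),
  artist = c("p", "q", "r"),
  year   = c(2001, 2002, 2003)
)
y <- tibble(
  title  = c("A", "B", "D"),
  artist = c("p", "q", "s"),
  city   = c("X", "Y", "Z")
)

# Columns present in both tables: the keys a join uses when by is not given
shared <- intersect(names(x), names(y))

# Naming the keys explicitly avoids the message about the default choice
matched <- inner_join(x, y, by = shared)

# Rows of x with no partner in y, worth inspecting before trusting a join
unmatched <- anti_join(x, y, by = shared)
```

The same three steps should carry over to any pair of tibbles by swapping in their names for x and y.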

10.5 Time remaining?

Let’s return to the exercises from either:

  • tidyr last class
  • ggplot2 the class before