# R code Chapter 1

This document contains abridged sections from Discovering Statistics Using R and RStudio by Andy Field so there are some copyright considerations. You can use this material for teaching and non-profit activities but please do not meddle with it or claim it as your own work. See the full license terms at the bottom of the page.

# Make sure to load these packages
library(tidyverse)


## Functions and objects

metallica <- c("Lars","James","Jason", "Kirk")
metallica <- metallica[metallica != "Jason"]
metallica <- c(metallica, "Rob")

print(metallica)

## [1] "Lars"  "James" "Kirk"  "Rob"

metallica %>% print(.)

## [1] "Lars"  "James" "Kirk"  "Rob"


## R markdown

### Basic text formatting

• Italic (*): Make text *italic* by placing it between asterisks (with no spaces) will knit as: Make text italic by placing it between asterisks (with no spaces)
• Bold (**): Make text **bold** by placing it between asterisks (with no spaces) will knit as: Make text bold by placing it between asterisks (with no spaces)
• Superscript (^^): Make text^superscript^ by placing it between carets (with no spaces) will knit as: Make text^superscript^ by placing it between carets (with no spaces)
• Subscript (~~): Make text~subscript~ by placing it between tildes (with no spaces) will knit as: Make text~subscript~ by placing it between tildes (with no spaces)
• Footnote [^1]: A footnote[^1] and a second footnote[^2] will knit as: A footnote1 and a second footnote2

You can use hashes to make headings of different levels. For example:

# Level 1 heading


will knit as:

### Bullet lists

Single bullet lists, this:

* This is the first bullet
* This is the second
* This is the third


will knit as:

• This is the first bullet
• This is the second
• This is the third

Numbered bullet lists, this:

1. This is the first entry
2. This is the second
3. This is the third


will knit as:

1. This is the first entry
2. This is the second
3. This is the third

Complex lists. This:

* This is the first bullet point
    + this is a sub-bullet
    + so is this
* This is the second bullet
    + This is a sub-bullet
        - I've gone crazy and done a third level of bullets
        - It had to be done
* and this is the third


will knit as:

• This is the first bullet point
    • this is a sub-bullet
    • so is this
• This is the second bullet
    • This is a sub-bullet
        • I’ve gone crazy and done a third level of bullets
        • It had to be done
• and this is the third

You use [text to display](web address) to insert hyperlinks. For example:

My favourite band is [Iron Maiden](https://ironmaiden.com/) will knit as:

My favourite band is Iron Maiden

### Images

You use ![Image caption (optional)](path to image) to insert images. For example:

![Figure 1: I love my spaniel](andy_milton.png) knits as:

### Tables

Insert tables using raw text. | denotes a column and the colon position denotes the alignment of the column, for example |---:| is right justified, |:---| is left justified, and |:---:| is centred.

: My top 3 Iron Maiden albums

| Name                   | Year   | Cover rating | Favourite track |
|:-----------------------|:----:|------:|:--------------------------------:|
|Piece of mind           | 1983 | ****  | The Flight of Icarus             |
|The Number of the beast | 1982 | ****  | Children of the damned           |
|Powerslave              | 1984 | ***** | The rime of the ancient mariner  |

knits as:

My top 3 Iron Maiden albums

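| Name                    | Year | Cover rating | Favourite track                 |
|:------------------------|:----:|-------------:|:-------------------------------:|
| Piece of mind           | 1983 | ****         | The Flight of Icarus            |
| The Number of the beast | 1982 | ****         | Children of the damned          |
| Powerslave              | 1984 | *****        | The rime of the ancient mariner |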

Alternatively, put your data in a tibble (more on this later) and use knitr::kable(). Column names that contain spaces must be wrapped in backticks:

tibble::tribble(
  ~Name, ~Year, ~`Cover rating`, ~`Favourite track`,
  "Piece of mind", "1983",  "****", "The Flight of Icarus",
  "The Number of the beast", "1982", "****", "Children of the damned",
  "Powerslave", "1984", "*****", "The rime of the ancient mariner"
) %>%
  knitr::kable(caption = "My top 3 Iron Maiden albums")


will knit as:

tibble::tribble(
  ~Name, ~Year, ~`Cover rating`, ~`Favourite track`,
  "Piece of mind", "1983",  "****", "The Flight of Icarus",
  "The Number of the beast", "1982", "****", "Children of the damned",
  "Powerslave", "1984", "*****", "The rime of the ancient mariner"
) %>%
  knitr::kable(caption = "My top 3 Iron Maiden albums")


Table: Table 1: My top 3 Iron Maiden albums

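| Name                    | Year | Cover rating | Favourite track                 |
|:------------------------|:-----|:-------------|:--------------------------------|
| Piece of mind           | 1983 | ****         | The Flight of Icarus            |
| The Number of the beast | 1982 | ****         | Children of the damned          |
| Powerslave              | 1984 | *****        | The rime of the ancient mariner |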
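The colon-based alignment used in raw-text tables has a counterpart in knitr::kable(): its align argument takes one letter per column (l, c or r). The sketch below is illustrative only. The data frame name albums, the object name album_table and the choice of format = "pipe" (so that the raw pipe table is printed) are assumptions, not part of the original example.

```r
# Hypothetical data frame holding the same album information
albums <- data.frame(
  Name = c("Piece of mind", "The Number of the beast", "Powerslave"),
  Year = c("1983", "1982", "1984"),
  `Cover rating` = c("****", "****", "*****"),
  `Favourite track` = c(
    "The Flight of Icarus",
    "Children of the damned",
    "The rime of the ancient mariner"
  ),
  check.names = FALSE # keep the spaces in the column names
)

# One letter per column: left, centre, right, centre
album_table <- knitr::kable(
  albums,
  format = "pipe",
  align = "lcrc",
  caption = "My top 3 Iron Maiden albums"
)

album_table
```

The printed separator row should carry the same colon markers described above, which makes it easy to check that the alignment is what you intended.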

### Equations

You can include equations in R markdown using LaTeX commands. You include an equation by enclosing it within \$\$ (or a single \$ if you want the equation within the line of text you’re writing). To give you a flavour:

We can include the linear model in a sentence like this: \$Y_i = b_0 + b_1X_i + \epsilon_i\$

which will knit as:

We can include the linear model in a sentence like this: $Y_i = b_0 + b_1X_i + \epsilon_i$

Or, if we want it within its own paragraph we’d write it as this:

\$\$
Y_i = b_0 + b_1X_i + \epsilon_i
\$\$

which knits as:

$$
Y_i = b_0 + b_1X_i + \epsilon_i
$$

## Tidyverse and the pipe operator (%>%)

# library(tidyverse) or library(magrittr) to access the pipe

core_members <- metallica %>%
subset(., metallica != "Rob") %>%
sort(.)

core_members

## [1] "James" "Kirk"  "Lars"
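The pipe passes the object on its left to the function on its right, so the chain above can also be written as nested function calls. The following sketch compares the two forms; the object name nested_members is made up for illustration, and the metallica vector is re-created so that the example runs on its own.

```r
library(magrittr) # provides the %>% pipe

metallica <- c("Lars", "James", "Kirk", "Rob")

# Piped version: read from left to right
core_members <- metallica %>%
  subset(., metallica != "Rob") %>%
  sort(.)

# Equivalent nested version: read from the inside out
nested_members <- sort(subset(metallica, metallica != "Rob"))

# Both approaches give the same result
identical(core_members, nested_members)
```

The piped form tends to be easier to read once more than two steps are involved, because the operations appear in the order in which they are carried out.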


## Getting data into R

name <- c("Lars Ulrich","James Hetfield", "Kirk Hammett", "Rob Trujillo", "Jason Newsted", "Cliff Burton", "Dave Mustaine")

# Numeric variables stored as double
songs_written <-  c(111, 112, 56, 16, 3, 11, 6)
net_worth <- c(300000000, 300000000, 200000000, 20000000, 40000000, 1000000, 20000000)

# Numeric variables stored as integer
songs_written_int <-  c(111L, 112L, 56L, 16L, 3L, 11L, 6L)
net_worth_int <- c(300000000L, 300000000L, 200000000L, 20000000L, 40000000L, 1000000L, 20000000L)
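To see the difference between the two storage types, you can ask R how each vector is stored. This is a small sketch using typeof(); it re-creates two of the vectors above so that it runs on its own.

```r
songs_written <- c(111, 112, 56, 16, 3, 11, 6)
songs_written_int <- c(111L, 112L, 56L, 16L, 3L, 11L, 6L)

typeof(songs_written)     # "double"
typeof(songs_written_int) # "integer"

# The values compare as equal even though the storage types differ
all(songs_written == songs_written_int)
```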

# Date variables
birth_date <- c("1963-12-26", "1963-08-03", "1962-11-18", "1964-10-23", "1963-03-04", "1962-02-10", "1961-09-13") %>% lubridate::ymd()

death_date <- c(NA, NA, NA, NA, NA, "1986-09-27", NA) %>%
lubridate::ymd()

# Logical variables
current_member <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)

# Factor variables
instrument <- c(2, 0, 0, 1, 1, 1, 0) %>% factor(levels = 0:2, labels = c("Guitar", "Bass", "Drums"))

instrument <- c("Drums", "Guitar", "Guitar", "Bass", "Bass", "Bass", "Guitar") %>%
forcats::as_factor() %>%
forcats::fct_relevel("Guitar", "Bass", "Drums")

levels(instrument)

## [1] "Guitar" "Bass"   "Drums"

levels(instrument) <- c("Proper guitar", "Bass guitar", "Drums")


## Tibbles and data frames

### Creating data frames

metalli_dat <- data.frame(name, birth_date, death_date, instrument, current_member, songs_written, net_worth)


### Creating tibbles

metalli_tib <- tibble::tibble(name, birth_date, death_date, instrument, current_member, songs_written, net_worth)


### Viewing dataframes and tibbles

metalli_tib

## # A tibble: 7 x 7
##   name   birth_date death_date instrument current_member songs_written net_worth
##   <chr>  <date>     <date>     <fct>      <lgl>                  <dbl>     <dbl>
## 1 Lars … 1963-12-26 NA         Drums      TRUE                     111 300000000
## 2 James… 1963-08-03 NA         Proper gu… TRUE                     112 300000000
## 3 Kirk … 1962-11-18 NA         Proper gu… TRUE                      56 200000000
## 4 Rob T… 1964-10-23 NA         Bass guit… TRUE                      16  20000000
## 5 Jason… 1963-03-04 NA         Bass guit… FALSE                      3  40000000
## 6 Cliff… 1962-02-10 1986-09-27 Bass guit… FALSE                     11   1000000
## 7 Dave … 1961-09-13 NA         Proper gu… FALSE                      6  20000000

# View(metalli_tib)


### Creating an empty tibble

# To create an empty tibble called empty_tib that has 10 rows, execute:

empty_tib <- tibble::tibble(.rows = 10)
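A quick way to confirm what was created is to check the dimensions of the object. In the sketch below the variable id is hypothetical and is added only to show that a new variable needs one value per row.

```r
empty_tib <- tibble::tibble(.rows = 10)

# 10 rows and no columns yet
dim(empty_tib)

# Hypothetical variable: one value for each of the 10 rows
empty_tib$id <- 1:10

dim(empty_tib)
```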


Using base R:

metalli_tib$albums <- c(10, 10, 10, 2, 4, 3, 0)


Using dplyr::mutate():

metalli_tib <- metalli_tib %>%
  dplyr::mutate(
    albums = c(10, 10, 10, 2, 4, 3, 0)
  )

metalli_tib

## # A tibble: 7 x 8
##   name  birth_date death_date instrument current_member songs_written net_worth
##   <chr> <date>     <date>     <fct>      <lgl>                  <dbl>     <dbl>
## 1 Lars… 1963-12-26 NA         Drums      TRUE                     111 300000000
## 2 Jame… 1963-08-03 NA         Proper gu… TRUE                     112 300000000
## 3 Kirk… 1962-11-18 NA         Proper gu… TRUE                      56 200000000
## 4 Rob … 1964-10-23 NA         Bass guit… TRUE                      16  20000000
## 5 Jaso… 1963-03-04 NA         Bass guit… FALSE                      3  40000000
## 6 Clif… 1962-02-10 1986-09-27 Bass guit… FALSE                     11   1000000
## 7 Dave… 1961-09-13 NA         Proper gu… FALSE                      6  20000000
## # … with 1 more variable: albums <dbl>


You can create your data set by initializing a tibble and then defining each variable. Note that in this context we use = rather than <- to assign values to each variable, and that each variable definition ends with a comma (except the last). For example, we can create metalli_tib from scratch as follows:

metalli_tib <- tibble::tibble(
  name = c("Lars Ulrich","James Hetfield", "Kirk Hammett", "Rob Trujillo", "Jason Newsted", "Cliff Burton", "Dave Mustaine"),
  birth_date = c("1963-12-26", "1963-08-03", "1962-11-18", "1964-10-23", "1963-03-04", "1962-02-10", "1961-09-13") %>% lubridate::ymd(),
  death_date = c(NA, NA, NA, NA, NA, "1986-09-27", NA) %>% lubridate::ymd(),
  instrument = c("Drums", "Guitar", "Guitar", "Bass", "Bass", "Bass", "Guitar") %>% forcats::as_factor() %>% forcats::fct_relevel("Guitar", "Bass", "Drums"),
  current_member = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE),
  songs_written = c(111, 112, 56, 16, 3, 11, 6),
  net_worth = c(300000000, 300000000, 200000000, 20000000, 40000000, 1000000, 20000000)
)


We can now add a variable that quantifies how much each member has earned per song that they contributed to:

metalli_tib <- metalli_tib %>%
  dplyr::mutate(
    worth_per_song = net_worth/songs_written
  )

metalli_tib

## # A tibble: 7 x 8
##   name  birth_date death_date instrument current_member songs_written net_worth
##   <chr> <date>     <date>     <fct>      <lgl>                  <dbl>     <dbl>
## 1 Lars… 1963-12-26 NA         Drums      TRUE                     111 300000000
## 2 Jame… 1963-08-03 NA         Guitar     TRUE                     112 300000000
## 3 Kirk… 1962-11-18 NA         Guitar     TRUE                      56 200000000
## 4 Rob … 1964-10-23 NA         Bass       TRUE                      16  20000000
## 5 Jaso… 1963-03-04 NA         Bass       FALSE                      3  40000000
## 6 Clif… 1962-02-10 1986-09-27 Bass       FALSE                     11   1000000
## 7 Dave… 1961-09-13 NA         Guitar     FALSE                      6  20000000
## # … with 1 more variable: worth_per_song <dbl>


### Entering data directly into a tibble

You can enter the data directly as a tibble (rather than creating an empty one and using dplyr::mutate()):

metalli_tib <- tibble::tribble(
  ~name, ~birth_date, ~death_date, ~instrument, ~current_member, ~songs_written, ~net_worth,
  "Lars Ulrich", "1963-12-26", NA, "Drums", TRUE, 111, 300000000,
  "James Hetfield", "1963-08-03", NA, "Guitar", TRUE, 112, 300000000,
  "Kirk Hammett", "1962-11-18", NA, "Guitar", TRUE, 56, 200000000,
  "Rob Trujillo", "1964-10-23", NA, "Bass", TRUE, 16, 20000000,
  "Jason Newsted", "1963-03-04", NA, "Bass", FALSE, 3, 40000000,
  "Cliff Burton", "1962-02-10", "1986-09-27", "Bass", FALSE, 11, 1000000,
  "Dave Mustaine", "1961-09-13", NA, "Guitar", FALSE, 6, 20000000
) %>%
  dplyr::mutate(
    birth_date = lubridate::ymd(birth_date),
    death_date = lubridate::ymd(death_date)
  )


### Finding a cell of a data frame or tibble

Three ways to discover which instrument Lars Ulrich ‘plays’:

metalli_tib[1, 4]

## # A tibble: 1 x 1
##   instrument
##   <chr>
## 1 Drums

metalli_tib[1, "instrument"]

## # A tibble: 1 x 1
##   instrument
##   <chr>
## 1 Drums

metalli_tib[name == "Lars Ulrich", "instrument"]

## # A tibble: 1 x 1
##   instrument
##   <chr>
## 1 Drums


### Selecting variables

Using base R

# These commands return the contents of the variable called 'name'
metalli_tib$name

## [1] "Lars Ulrich"    "James Hetfield" "Kirk Hammett"   "Rob Trujillo"
## [5] "Jason Newsted"  "Cliff Burton"   "Dave Mustaine"

metalli_tib[1]

## # A tibble: 7 x 1
##   name
##   <chr>
## 1 Lars Ulrich
## 2 James Hetfield
## 3 Kirk Hammett
## 4 Rob Trujillo
## 5 Jason Newsted
## 6 Cliff Burton
## 7 Dave Mustaine

metalli_tib["name"]

## # A tibble: 7 x 1
##   name
##   <chr>
## 1 Lars Ulrich
## 2 James Hetfield
## 3 Kirk Hammett
## 4 Rob Trujillo
## 5 Jason Newsted
## 6 Cliff Burton
## 7 Dave Mustaine
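The three commands above do not return quite the same kind of object: $ gives a plain vector, whereas single square brackets give a one-column tibble. The sketch below makes the distinction explicit using a small stand-in tibble called band (a hypothetical name, as are names_vec, names_tib and names_pull). It also shows dplyr::pull(), which is a tidyverse way to get the vector.

```r
# Small stand-in tibble so that the example runs on its own
band <- tibble::tibble(
  name = c("Lars Ulrich", "James Hetfield", "Kirk Hammett"),
  instrument = c("Drums", "Guitar", "Guitar")
)

names_vec <- band$name                 # character vector
names_tib <- band["name"]              # tibble with one column
names_pull <- dplyr::pull(band, name)  # character vector, tidyverse style

is.character(names_vec)
tibble::is_tibble(names_tib)
identical(names_vec, names_pull)
```

The distinction matters when you pass the result to a function that expects a vector rather than a tibble.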

# Both of these commands return the contents of the variables called 'name' and 'instrument'
metalli_tib[c(1, 4)]

## # A tibble: 7 x 2
##   name           instrument
##   <chr>          <chr>
## 1 Lars Ulrich    Drums
## 2 James Hetfield Guitar
## 3 Kirk Hammett   Guitar
## 4 Rob Trujillo   Bass
## 5 Jason Newsted  Bass
## 6 Cliff Burton   Bass
## 7 Dave Mustaine  Guitar

metalli_tib[c("name", "instrument")]

## # A tibble: 7 x 2
##   name           instrument
##   <chr>          <chr>
## 1 Lars Ulrich    Drums
## 2 James Hetfield Guitar
## 3 Kirk Hammett   Guitar
## 4 Rob Trujillo   Bass
## 5 Jason Newsted  Bass
## 6 Cliff Burton   Bass
## 7 Dave Mustaine  Guitar


The tidyverse way

metalli_tib %>%
dplyr::select(name, instrument)

## # A tibble: 7 x 2
##   name           instrument
##   <chr>          <chr>
## 1 Lars Ulrich    Drums
## 2 James Hetfield Guitar
## 3 Kirk Hammett   Guitar
## 4 Rob Trujillo   Bass
## 5 Jason Newsted  Bass
## 6 Cliff Burton   Bass
## 7 Dave Mustaine  Guitar


You can exclude variables too, try these out:

metalli_tib %>%
dplyr::select(-name)

## # A tibble: 7 x 6
##   birth_date death_date instrument current_member songs_written net_worth
##   <date>     <date>     <chr>      <lgl>                  <dbl>     <dbl>
## 1 1963-12-26 NA         Drums      TRUE                     111 300000000
## 2 1963-08-03 NA         Guitar     TRUE                     112 300000000
## 3 1962-11-18 NA         Guitar     TRUE                      56 200000000
## 4 1964-10-23 NA         Bass       TRUE                      16  20000000
## 5 1963-03-04 NA         Bass       FALSE                      3  40000000
## 6 1962-02-10 1986-09-27 Bass       FALSE                     11   1000000
## 7 1961-09-13 NA         Guitar     FALSE                      6  20000000

metalli_tib %>%
dplyr::select(-c(name, instrument))

## # A tibble: 7 x 5
##   birth_date death_date current_member songs_written net_worth
##   <date>     <date>     <lgl>                  <dbl>     <dbl>
## 1 1963-12-26 NA         TRUE                     111 300000000
## 2 1963-08-03 NA         TRUE                     112 300000000
## 3 1962-11-18 NA         TRUE                      56 200000000
## 4 1964-10-23 NA         TRUE                      16  20000000
## 5 1963-03-04 NA         FALSE                      3  40000000
## 6 1962-02-10 1986-09-27 FALSE                     11   1000000
## 7 1961-09-13 NA         FALSE                      6  20000000


You can save a subsetted tibble to a new object:

# Save a version of metalli_tib but exclude the variable called name

metalli_anon_tib <- metalli_tib %>%
dplyr::select(-name)

#View this new object
metalli_anon_tib

## # A tibble: 7 x 6
##   birth_date death_date instrument current_member songs_written net_worth
##   <date>     <date>     <chr>      <lgl>                  <dbl>     <dbl>
## 1 1963-12-26 NA         Drums      TRUE                     111 300000000
## 2 1963-08-03 NA         Guitar     TRUE                     112 300000000
## 3 1962-11-18 NA         Guitar     TRUE                      56 200000000
## 4 1964-10-23 NA         Bass       TRUE                      16  20000000
## 5 1963-03-04 NA         Bass       FALSE                      3  40000000
## 6 1962-02-10 1986-09-27 Bass       FALSE                     11   1000000
## 7 1961-09-13 NA         Guitar     FALSE                      6  20000000
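For comparison, the same exclusion can be written in base R by keeping every column whose name is not "name". This is only a sketch on a small stand-in tibble; the object names band, anon_dplyr and anon_base are hypothetical.

```r
# Small stand-in tibble so that the example runs on its own
band <- tibble::tibble(
  name = c("Lars Ulrich", "James Hetfield", "Rob Trujillo"),
  instrument = c("Drums", "Guitar", "Bass"),
  songs_written = c(111, 112, 16)
)

# Tidyverse: drop the variable called name
anon_dplyr <- dplyr::select(band, -name)

# Base R: keep the columns whose names are not "name"
anon_base <- band[, names(band) != "name"]

names(anon_dplyr)
names(anon_base)
```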


### Selecting cases (filtering tibbles)

Using base R we can do the following. View only the data for the current members of Metallica:

metalli_tib[current_member == TRUE,]

## # A tibble: 4 x 7
##   name   birth_date death_date instrument current_member songs_written net_worth
##   <chr>  <date>     <date>     <chr>      <lgl>                  <dbl>     <dbl>
## 1 Lars … 1963-12-26 NA         Drums      TRUE                     111 300000000
## 2 James… 1963-08-03 NA         Guitar     TRUE                     112 300000000
## 3 Kirk … 1962-11-18 NA         Guitar     TRUE                      56 200000000
## 4 Rob T… 1964-10-23 NA         Bass       TRUE                      16  20000000


View only the instruments played by the current members of Metallica:

metalli_tib[current_member == TRUE, "instrument"]

## # A tibble: 4 x 1
##   instrument
##   <chr>
## 1 Drums
## 2 Guitar
## 3 Guitar
## 4 Bass


View only the names, instruments played, and number of songs written by the current members of Metallica:

metalli_tib[current_member == TRUE, c("name", "instrument", "songs_written")]

## # A tibble: 4 x 3
##   name           instrument songs_written
##   <chr>          <chr>              <dbl>
## 1 Lars Ulrich    Drums                111
## 2 James Hetfield Guitar               112
## 3 Kirk Hammett   Guitar                56
## 4 Rob Trujillo   Bass                  16

metalli_tib[songs_written > 50,]

## # A tibble: 3 x 7
##   name   birth_date death_date instrument current_member songs_written net_worth
##   <chr>  <date>     <date>     <chr>      <lgl>                  <dbl>     <dbl>
## 1 Lars … 1963-12-26 NA         Drums      TRUE                     111 300000000
## 2 James… 1963-08-03 NA         Guitar     TRUE                     112 300000000
## 3 Kirk … 1962-11-18 NA         Guitar     TRUE                      56 200000000
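Note that the base R commands above refer to current_member and songs_written without saying where they live, so they rely on standalone vectors with those names existing in your workspace. A more defensive sketch, shown here on a small stand-in tibble called band (a hypothetical name), refers to the column inside the tibble with $:

```r
# Small stand-in tibble; there is no standalone current_member vector here
band <- tibble::tibble(
  name = c("Lars Ulrich", "Jason Newsted", "Rob Trujillo"),
  current_member = c(TRUE, FALSE, TRUE)
)

# Refer to the column inside the tibble when building the row condition
current <- band[band$current_member == TRUE, ]

current$name
```

Written this way, the command still works if the standalone vectors have been removed or no longer match the tibble.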


Using dplyr::filter()

dplyr::filter(metalli_tib, current_member == TRUE)

## # A tibble: 4 x 7
##   name   birth_date death_date instrument current_member songs_written net_worth
##   <chr>  <date>     <date>     <chr>      <lgl>                  <dbl>     <dbl>
## 1 Lars … 1963-12-26 NA         Drums      TRUE                     111 300000000
## 2 James… 1963-08-03 NA         Guitar     TRUE                     112 300000000
## 3 Kirk … 1962-11-18 NA         Guitar     TRUE                      56 200000000
## 4 Rob T… 1964-10-23 NA         Bass       TRUE                      16  20000000

# Or using a pipe:

metalli_tib %>%
dplyr::filter(current_member == TRUE)

## # A tibble: 4 x 7
##   name   birth_date death_date instrument current_member songs_written net_worth
##   <chr>  <date>     <date>     <chr>      <lgl>                  <dbl>     <dbl>
## 1 Lars … 1963-12-26 NA         Drums      TRUE                     111 300000000
## 2 James… 1963-08-03 NA         Guitar     TRUE                     112 300000000
## 3 Kirk … 1962-11-18 NA         Guitar     TRUE                      56 200000000
## 4 Rob T… 1964-10-23 NA         Bass       TRUE                      16  20000000


Again, we can save the filtered tibble to a new object:

metallica_current <- metalli_tib %>%
dplyr::filter(current_member == TRUE)

metallica_current

## # A tibble: 4 x 7
##   name   birth_date death_date instrument current_member songs_written net_worth
##   <chr>  <date>     <date>     <chr>      <lgl>                  <dbl>     <dbl>
## 1 Lars … 1963-12-26 NA         Drums      TRUE                     111 300000000
## 2 James… 1963-08-03 NA         Guitar     TRUE                     112 300000000
## 3 Kirk … 1962-11-18 NA         Guitar     TRUE                      56 200000000
## 4 Rob T… 1964-10-23 NA         Bass       TRUE                      16  20000000


Combining conditions:

metalli_tib %>%
  dplyr::filter(is.na(death_date) & instrument == "Bass")

## # A tibble: 2 x 7
##   name   birth_date death_date instrument current_member songs_written net_worth
##   <chr>  <date>     <date>     <chr>      <lgl>                  <dbl>     <dbl>
## 1 Rob T… 1964-10-23 NA         Bass       TRUE                      16  20000000
## 2 Jason… 1963-03-04 NA         Bass       FALSE                      3  40000000


If we change is.na() to !is.na() we get Cliff Burton’s data (the only member who is a bassist and does NOT have a value of ‘NA’ for the variable death_date):

metalli_tib %>%
  dplyr::filter(!is.na(death_date) & instrument == "Bass")

## # A tibble: 1 x 7
##   name   birth_date death_date instrument current_member songs_written net_worth
##   <chr>  <date>     <date>     <chr>      <lgl>                  <dbl>     <dbl>
## 1 Cliff… 1962-02-10 1986-09-27 Bass       FALSE                     11   1000000


We can also use the OR operator (|) to set conditions for which only one has to be true. For example, to select members who are either bassists OR drummers we’d use:

metalli_tib %>%
  dplyr::filter(instrument == "Drums" | instrument == "Bass")

## # A tibble: 4 x 7
##   name   birth_date death_date instrument current_member songs_written net_worth
##   <chr>  <date>     <date>     <chr>      <lgl>                  <dbl>     <dbl>
## 1 Lars … 1963-12-26 NA         Drums      TRUE                     111 300000000
## 2 Rob T… 1964-10-23 NA         Bass       TRUE                      16  20000000
## 3 Jason… 1963-03-04 NA         Bass       FALSE                      3  40000000
## 4 Cliff… 1962-02-10 1986-09-27 Bass       FALSE                     11   1000000
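When several OR conditions test the same variable, the %in% operator is a compact alternative. The following sketch compares the two forms on a small stand-in tibble; the names band, with_or and with_in are hypothetical.

```r
# Small stand-in tibble so that the example runs on its own
band <- tibble::tibble(
  name = c("Lars Ulrich", "James Hetfield", "Rob Trujillo"),
  instrument = c("Drums", "Guitar", "Bass")
)

# Two conditions joined with the OR operator
with_or <- dplyr::filter(band, instrument == "Drums" | instrument == "Bass")

# The same selection using %in%
with_in <- dplyr::filter(band, instrument %in% c("Drums", "Bass"))

identical(with_or$name, with_in$name)
```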


### Combining selecting cases with selecting variables

This command filters metalli_tib according to whether the variable current_member is equal to TRUE and whether the variable instrument is NOT equal to (!=) the phrase “Bass”:

metalli_worth <- metalli_tib %>%
  dplyr::filter(current_member == TRUE & instrument != "Bass")

metalli_worth

## # A tibble: 3 x 7
##   name   birth_date death_date instrument current_member songs_written net_worth
##   <chr>  <date>     <date>     <chr>      <lgl>                  <dbl>     <dbl>
## 1 Lars … 1963-12-26 NA         Drums      TRUE                     111 300000000
## 2 James… 1963-08-03 NA         Guitar     TRUE                     112 300000000
## 3 Kirk … 1962-11-18 NA         Guitar     TRUE                      56 200000000


Having done this, we could pass the metalli_worth object that we just created into the select function to select the variables name and net_worth:

metalli_worth <- metalli_worth %>%
  dplyr::select(name, net_worth)

metalli_worth

## # A tibble: 3 x 2
##   name           net_worth
##   <chr>              <dbl>
## 1 Lars Ulrich    300000000
## 2 James Hetfield 300000000
## 3 Kirk Hammett   200000000


Better still, combine the two operations into a single pipe:

metalli_worth <- metalli_tib %>%
  dplyr::filter(current_member == TRUE & instrument != "Bass") %>%
  dplyr::select(name, net_worth)

metalli_worth

## # A tibble: 3 x 2
##   name           net_worth
##   <chr>              <dbl>
## 1 Lars Ulrich    300000000
## 2 James Hetfield 300000000
## 3 Kirk Hammett   200000000


Doing the same with base R will make your eyes hurt

metalli_worth <- metalli_tib[metalli_tib$current_member == TRUE & metalli_tib$instrument != "Bass", c("name", "net_worth")]
metalli_worth

## # A tibble: 3 x 2
##   name           net_worth
##   <chr>              <dbl>
## 1 Lars Ulrich    300000000
## 2 James Hetfield 300000000
## 3 Kirk Hammett   200000000


### Exporting data

readr::write_csv(metalli_tib, "../data/metallica.csv")

# or



## Using other software to get data in R

metalli_tib <- readr::read_csv("../data/metallica.csv")

# or



We can specify data types:

metalli_tib <- readr::read_csv("../data/metallica.csv", col_types = cols(
name = col_character(),
birth_date = col_date(),
death_date = col_date(),
instrument = col_factor(),
current_member = col_logical(),
songs_written = col_double(),
net_worth = col_double()
)
)

# OR convert instrument to a factor after reading in the data

metalli_tib <- metalli_tib %>%
  dplyr::mutate(
    instrument = forcats::as_factor(instrument)
  )

metalli_tib$instrument
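A CSV file stores only text, so information such as which variables are dates or factors has to be restored when the file is read back in. The sketch below illustrates a round trip under some assumptions: it uses a small stand-in tibble called band and a temporary file (csv_path) rather than the ../data/metallica.csv path used above.

```r
# Small stand-in tibble with a date and a factor
band <- tibble::tibble(
  name = c("Lars Ulrich", "Cliff Burton"),
  birth_date = as.Date(c("1963-12-26", "1962-02-10")),
  instrument = factor(c("Drums", "Bass"))
)

# Temporary file used only for this sketch
csv_path <- tempfile(fileext = ".csv")
readr::write_csv(band, csv_path)

# Restore the data types while reading the file
band_in <- readr::read_csv(
  csv_path,
  col_types = readr::cols(
    name = readr::col_character(),
    birth_date = readr::col_date(),
    instrument = readr::col_factor()
  )
)

class(band_in$birth_date)
levels(band_in$instrument)
```

With col_factor() left at its defaults the factor levels follow the order in which values appear in the file, so the level order may differ from the original factor. If the order matters, set it explicitly, for example with forcats::fct_relevel() as shown earlier.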


## Pieces of great

### Pieces of great 1.5

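# This attempt produces an error: the dates are stored as character strings, and R cannot subtract text
husband <- c("1973-06-21", "1970-07-16", "1949-10-08", "1969-05-24")
wife <- c("1984-11-12", "1973-08-02", "1948-11-11", "1983-07-23")
agegap <- husband-wife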

husband <- c("1973-06-21", "1970-07-16", "1949-10-08", "1969-05-24") %>%
lubridate::ymd(.)
wife <- c("1984-11-12", "1973-08-02", "1948-11-11", "1983-07-23") %>%
lubridate::ymd(.)

agegap <- husband-wife
agegap
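Subtracting two date vectors gives the differences in days. If you would rather express the gap in years, one rough option is to divide by the average length of a year. In this sketch the base R function as.Date() stands in for lubridate::ymd(), and the object name agegap_years and the 365.25 days-per-year figure are simplifying assumptions.

```r
husband <- as.Date(c("1973-06-21", "1970-07-16", "1949-10-08", "1969-05-24"))
wife <- as.Date(c("1984-11-12", "1973-08-02", "1948-11-11", "1983-07-23"))

# Subtracting dates gives a difftime object measured in days
agegap <- husband - wife
class(agegap)

# Approximate gap in years (assumes 365.25 days per year)
agegap_years <- as.numeric(agegap, units = "days") / 365.25
round(agegap_years, 1)
```

Negative values indicate couples in which the husband was born before the wife, because the earlier date is the smaller one.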


### Pieces of great 1.8

# Creates a list
metalli_lst <- list(name, instrument)
metalli_lst

## [[1]]
## [1] "Lars Ulrich"    "James Hetfield" "Kirk Hammett"   "Rob Trujillo"
## [5] "Jason Newsted"  "Cliff Burton"   "Dave Mustaine"
##
## [[2]]
## [1] Drums         Proper guitar Proper guitar Bass guitar   Bass guitar
## [6] Bass guitar   Proper guitar
## Levels: Proper guitar Bass guitar Drums

# Creates a matrix using cbind (the factor is reduced to its numeric codes, stored as text)
metalli_mtx <- cbind(name, instrument)
metalli_mtx

##      name             instrument
## [1,] "Lars Ulrich"    "3"
## [2,] "James Hetfield" "1"
## [3,] "Kirk Hammett"   "1"
## [4,] "Rob Trujillo"   "2"
## [5,] "Jason Newsted"  "2"
## [6,] "Cliff Burton"   "2"
## [7,] "Dave Mustaine"  "1"
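The output above shows that cbind() replaces the factor labels with their underlying numeric codes. If you want to keep the labels in the matrix, one option is to convert the factor to character first. This is a sketch using shortened versions of the vectors; the names codes_mtx and labels_mtx are hypothetical.

```r
name <- c("Lars Ulrich", "James Hetfield", "Rob Trujillo")
instrument <- factor(
  c("Drums", "Guitar", "Bass"),
  levels = c("Guitar", "Bass", "Drums")
)

# cbind() keeps only the numeric codes of the factor
codes_mtx <- cbind(name, instrument)

# Converting to character first keeps the labels
labels_mtx <- cbind(name, instrument = as.character(instrument))

codes_mtx[, "instrument"]
labels_mtx[, "instrument"]
```

A data frame or tibble avoids the problem altogether, because each column keeps its own data type.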


1. Contents of my first footnote.

2. Contents of my second footnote.
