POL90: Applied Quantitative Analysis

class: center, middle, inverse, title-slide

# POL90: Applied Quantitative Analysis
## Chapter 1
### Prof Wasow <br/> Assistant Professor, Politics <br/> Pomona College
### 2022-01-25

---

# Announcements

.large[

- Get R, RStudio, and TinyTeX / LaTeX running on your computer

]

.large[

- PS01 on Sakai, due *Friday*
  
     + If you're not enrolled, email me so we can add you to DataCamp
     
     + Can speed up videos
    
     + Complete at least 90% for full credit
]
--
.large[

- Read Statistical Sleuth, Chapter 1
  
]

---
# Ed Discussions is Live

---
# Schedule
<table>
 <thead>
  <tr>
   <th style="text-align:right;"> Week </th>
   <th style="text-align:left;"> Date </th>
   <th style="text-align:left;"> Day </th>
   <th style="text-align:left;"> Title </th>
   <th style="text-align:right;"> Chapter </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:left;"> Jan 17 </td>
   <td style="text-align:left;"> Mon </td>
   <td style="text-align:left;"> Introduction and Overview </td>
   <td style="text-align:right;"> - </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:left;"> Jan 19 </td>
   <td style="text-align:left;"> Wed </td>
   <td style="text-align:left;"> Introduction </td>
   <td style="text-align:right;"> - </td>
  </tr>
  <tr>
   <td style="text-align:right;color: black !important;background-color: yellow !important;"> 2 </td>
   <td style="text-align:left;color: black !important;background-color: yellow !important;"> Jan 24 </td>
   <td style="text-align:left;color: black !important;background-color: yellow !important;"> Mon </td>
   <td style="text-align:left;color: black !important;background-color: yellow !important;"> Drawing Statistical Conclusions </td>
   <td style="text-align:right;color: black !important;background-color: yellow !important;"> 1 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:left;"> Jan 26 </td>
   <td style="text-align:left;"> Wed </td>
   <td style="text-align:left;"> Drawing Statistical Conclusions </td>
   <td style="text-align:right;"> 1 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 3 </td>
   <td style="text-align:left;"> Jan 31 </td>
   <td style="text-align:left;"> Mon </td>
   <td style="text-align:left;"> Inference Using t-Distributions </td>
   <td style="text-align:right;"> 2 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 3 </td>
   <td style="text-align:left;"> Feb 2 </td>
   <td style="text-align:left;"> Wed </td>
   <td style="text-align:left;"> Inference Using t-Distributions </td>
   <td style="text-align:right;"> 2 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 4 </td>
   <td style="text-align:left;"> Feb 7 </td>
   <td style="text-align:left;"> Mon </td>
   <td style="text-align:left;"> A Closer Look at Assumptions </td>
   <td style="text-align:right;"> 3 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 4 </td>
   <td style="text-align:left;"> Feb 9 </td>
   <td style="text-align:left;"> Wed </td>
   <td style="text-align:left;"> A Closer Look at Assumptions </td>
   <td style="text-align:right;"> 3 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 5 </td>
   <td style="text-align:left;"> Feb 14 </td>
   <td style="text-align:left;"> Mon </td>
   <td style="text-align:left;"> Alternatives to the t-Tools </td>
   <td style="text-align:right;"> 4 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 5 </td>
   <td style="text-align:left;"> Feb 16 </td>
   <td style="text-align:left;"> Wed </td>
   <td style="text-align:left;"> Alternatives to the t-Tools </td>
   <td style="text-align:right;"> 4 </td>
  </tr>
</tbody>
</table>

---
# Can we study creativity?

.large[
- Teresa Amabile, now a professor at Harvard Business School, ran an experiment in 1985 on the effects of intrinsic and extrinsic motivation on creativity
]
---
# Amabile (1985), "Motivation and creativity"

---
# Why might we care about creativity?

.large[
 - Do grading systems promote creativity in students?
]

.large[
 - Do ranking systems and incentive awards increase productivity among employees? 
]

.large[
 - Do rewards and praise stimulate children to learn?
]  
---
# How can we study creativity?

.large[
 - Research design:

 - Subjects with creative writing experience randomly assigned:

     - 24 to "Intrinsic"
     - 23 to "Extrinsic"
]

.large[
 - After a questionnaire, subjects asked to write a "Haiku about laughter"
]	
--

.large[
 - Poems submitted to 12 poets, who rated them on a 40-point scale of creativity
]

.large[
 - Score is the average of the 12 judges' ratings (judges did not know the purpose of the study)
]
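---
## Random assignment: a sketch in R

The random assignment described on the previous slide can be sketched in a few lines of base R. Only the group sizes (24 and 23) come from the study; the subject IDs, the seed, and the object names are made up for illustration.

```r
# Hypothetical subject IDs; only the group sizes come from the study design
set.seed(90)                        # arbitrary seed so the shuffle is reproducible
subjects <- paste0("S", 1:47)

# Build the 47 condition labels, then shuffle them so that
# each subject receives a condition at random
labels <- rep(c("Intrinsic", "Extrinsic"), times = c(24, 23))
assignment <- data.frame(
  subject   = subjects,
  treatment = sample(labels)
)

# Group sizes are fixed by construction; only who lands where is random
table(assignment$treatment)
```

Shuffling a fixed set of labels keeps the group sizes exact, which matches a design with a set number of subjects per condition.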

---
# Example: Creativity study questions

<img src="images/creativity_study_questions.jpg" width="65%" style="display: block; margin: auto;" />
.footnote[Source: *Statistical Sleuth*, Display 1.2]

---
# Random sampling study with two populations
<br/><br/>
.center[![](images/ss_display_1_6.png)]
.footnote[Source: *Statistical Sleuth*, Display 1.6]

---
# Creativity study summary statistics
<br/>
<br/>
.center[![](images/creativity_study_summary_stats.jpg)]

.footnote[Source: *Statistical Sleuth*, Display 1.1]

---

# Loading Creativity Experiment Data

```r
library(Sleuth2)
library(janitor)

creativity <- Sleuth2::case0101

head(creativity)
```

```
##   Score Treatment
## 1   5.0 Extrinsic
## 2   5.4 Extrinsic
## 3   6.1 Extrinsic
## 4  10.9 Extrinsic
## 5  11.8 Extrinsic
## 6  12.0 Extrinsic
```

```r
creativity <- creativity %>% janitor::clean_names()

head(creativity, 2)
```

```
##   score treatment
## 1   5.0 Extrinsic
## 2   5.4 Extrinsic
```

---
## Creativity Summary Stats in Base R

```r
# Base R
intrin <- creativity[creativity$treatment == "Intrinsic", ]
head(intrin, 3)
```

```
##    score treatment
## 24  12.0 Intrinsic
## 25  12.0 Intrinsic
## 26  12.9 Intrinsic
```

```r
extrin <- creativity[creativity$treatment == "Extrinsic", ]

int_mean <- mean(intrin$score)
int_mean
```

```
## [1] 19.88333
```

```r
ext_mean <- mean(extrin$score)
ext_mean 
```

```
## [1] 15.73913
```

```r
int_mean - ext_mean
```

```
## [1] 4.144203
```

---
## Creativity Summary Stats in Tidyverse

```r
library(dplyr)
creativity %>%
    group_by(treatment)  
```

```
## # A tibble: 47 × 2
*## # Groups:   treatment [2]
##    score treatment
##    <dbl> <fct>    
##  1  5    Extrinsic
##  2  5.40 Extrinsic
##  3  6.10 Extrinsic
##  4 10.9  Extrinsic
##  5 11.8  Extrinsic
##  6 12    Extrinsic
##  7 12.3  Extrinsic
##  8 14.8  Extrinsic
##  9 15    Extrinsic
## 10 16.8  Extrinsic
## # … with 37 more rows
```

---
## Creativity Summary Stats in Tidyverse

```r
library(dplyr)
creativity_stats <- creativity %>%
    group_by(treatment) %>% 
    summarize(mean_score = mean(score),
              n          = n())

creativity_stats
```

```
## # A tibble: 2 × 3
##   treatment mean_score     n
##   <fct>          <dbl> <int>
## 1 Extrinsic       15.7    23
## 2 Intrinsic       19.9    24
```

```r
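# Rows follow the factor levels (Extrinsic, Intrinsic),
# so [2] - [1] is the Intrinsic mean minus the Extrinsic mean
creativity_stats %>% 
    mutate(
      diff       = (mean_score[2] - mean_score[1])
    )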
```

```
## # A tibble: 2 × 4
##   treatment mean_score     n  diff
##   <fct>          <dbl> <int> <dbl>
## 1 Extrinsic       15.7    23  4.14
## 2 Intrinsic       19.9    24  4.14
```

---
# Creativity study hypothesis test  
<br/>
<br/>
--

.large[
- Is a difference in means of 4.14 big or small?
]

.large[
- A `$p$`-value is a measure that helps us gauge whether a result is extreme
]

.large[
- The `$p$`-value is the probability of getting a statistic at least as extreme as the observed statistic if the null hypothesis is true

- What kinds of statistics would we get, if the null hypothesis is true?
  
  - How extreme is the observed statistic?
]

---
class: center, middle, inverse

# Randomization Tests

---
# Test via randomization (Using simulation)

.large[
- Randomization test

- Simulate new statistics, assuming the null hypothesis is true
  
  - Find the proportion of simulated statistics that are as extreme as, or more extreme than, the observed statistic
]

.center[![](images/ss_display_1_6.png)]
.footnote[Source: *Statistical Sleuth*, Display 1.6]
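---
# Randomization p-value as a proportion

The proportion described on the previous slide can be written out explicitly. The notation is introduced here for illustration and is not the textbook's: `$B$` is the number of simulated shuffles, `$d_{\text{obs}}$` is the observed difference in means, and `$\tilde{d}_b$` is the difference in means from shuffle `$b$`. The two-sided version is shown as an assumption; a one-sided test would drop the absolute values.

$$
p \approx \frac{1}{B} \sum_{b=1}^{B} \mathbf{1}\left\{ \left|\tilde{d}_b\right| \ge \left|d_{\text{obs}}\right| \right\}
$$

The indicator `$\mathbf{1}\{\cdot\}$` equals 1 when a shuffled difference is at least as extreme as the observed one and 0 otherwise, so the sum simply counts the extreme shuffles.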

---

# Idea behind randomization distribution

.large[
 - If treatment had no effect, then observed outcomes are unrelated to whether a subject was assigned to the treatment or control group
 
 - Under the null hypothesis (that treatment had no effect), we could shuffle all treatment and control assignments and recalculate the difference in means
]
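---
## One shuffle: a sketch in R

A single shuffle of the kind described on the previous slide might look like the following. The six scores are toy values made up for illustration (not the study's data), and `diff_in_means()` is a hypothetical helper defined here, not a function from any package.

```r
# Toy data, made up for illustration (not the study's scores)
score     <- c(12, 15, 9, 20, 18, 22)
treatment <- c("Extrinsic", "Extrinsic", "Extrinsic",
               "Intrinsic", "Intrinsic", "Intrinsic")

# Hypothetical helper: Intrinsic mean minus Extrinsic mean
diff_in_means <- function(score, treatment) {
  mean(score[treatment == "Intrinsic"]) - mean(score[treatment == "Extrinsic"])
}

observed <- diff_in_means(score, treatment)   # 20 - 12 = 8

# One shuffle: the scores stay put, only the labels are re-dealt
set.seed(1)                                   # arbitrary seed
shuffled    <- sample(treatment)
one_shuffle <- diff_in_means(score, shuffled)

observed
one_shuffle
```

The shuffle keeps three subjects in each group, so it mimics another assignment the experimenter could have drawn if the labels did not matter.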

---
# Randomization distribution via StatKey
<img src="images/statkey_creativity.png" width="80%" style="display: block; margin: auto;" />

.footnote[Source: [http://www.lock5stat.com/StatKey/](http://www.lock5stat.com/StatKey/)]

---

# Interpretation of randomization distribution

.large[
 - These simulated results allow us to see whether our observed result is extreme compared to other plausible assignments of subjects to treatment and control groups
 
 - Each randomization is like a possible parallel universe (under the assumption of no treatment effect)
]
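---
## Randomization distribution: a sketch in R

Repeating the shuffle many times gives a randomization distribution against which the observed difference can be compared. This is a minimal base R sketch under stated assumptions: the scores are toy values made up for illustration, `diff_in_means()` is a hypothetical helper, and the two-sided comparison, the seed, and the number of shuffles are arbitrary choices.

```r
# Toy data, made up for illustration (not the study's scores)
score     <- c(12, 15, 9, 20, 18, 22, 14, 17)
treatment <- rep(c("Extrinsic", "Intrinsic"), each = 4)

# Hypothetical helper: Intrinsic mean minus Extrinsic mean
diff_in_means <- function(score, treatment) {
  mean(score[treatment == "Intrinsic"]) - mean(score[treatment == "Extrinsic"])
}

observed <- diff_in_means(score, treatment)

# Each replicate is one "parallel universe": same scores, reshuffled labels
set.seed(90)                                  # arbitrary seed
n_sims    <- 5000
sim_diffs <- replicate(n_sims, diff_in_means(score, sample(treatment)))

# Two-sided p-value: share of shuffles at least as extreme as observed
# (the small tolerance guards against floating-point ties)
p_value <- mean(abs(sim_diffs) >= abs(observed) - 1e-8)

# Where the middle 95% of shuffled differences falls
quantile(sim_diffs, c(0.025, 0.975))
p_value
```

With the slides' data, the same steps would use `creativity$score` and `creativity$treatment` in place of the toy vectors. Calling `hist(sim_diffs)` and then `abline(v = observed)` draws a picture similar to the StatKey display.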

---
class: center, middle, inverse

# Randomization Tests: Exercise

---
## Exercise: Randomization Distribution with StatKey

.large[

+ Go to http://www.lock5stat.com/StatKey
  
    - You can just search for StatKey on Google
    
+ Click on "Test for Difference in Means"
  
+ Click on "Leniency and Smiles" to open the pop-up menu

+ Select "Mosquitos (Beer vs Water)"
    
+ Play with randomizing assignment to the two conditions
  
]
---
## StatKey: Mosquitos (Beer vs Water)

---

# Questions?