POL90: Statistics

Multiple Regression: Quadratic Terms

Prof. Wasow, Politics
Pomona College

2022-03-29

1 / 38

Announcements

  • Assignments

    • PS07

    • Report 2

2 / 38

Schedule

Week  Date    Day  Title                       Chapter
9     Mar 14  Mon  Spring Recess               -
9     Mar 16  Wed  Spring Recess               -
10    Mar 21  Mon  Null hypothesis, R-squared  8
10    Mar 23  Wed  Multiple regression         8
11    Mar 28  Mon  Interaction terms           9
11    Mar 30  Wed  Interaction terms           9
12    Apr 4   Mon  Logistic regression         20
12    Apr 6   Wed  Logistic regression         20
13    Apr 11  Mon  Missing data                Handout
13    Apr 13  Wed  Missing data                Handout
3 / 38

Assignment schedule

Week  Date    Day  Assignment    Percent
9     Mar 18  Fri  Spring break  NA
10    Mar 25  Fri  PS07          3
11    Apr 1   Fri  PS08          3
12    Apr 8   Fri  Report2       8
13    Apr 15  Fri  PS09          3
14    Apr 22  Fri  PS10          3
15    Apr 29  Fri  Report3       10
4 / 38

Report 1: Did Sandy Hook Influence Attitudes about Gun Control?

5 / 38

Losielle, Cappella & Gleitz (2022)

6 / 38

Rosencrans, Unrath, Henriquez & Bhalla (2022)

10 / 38

Simple Regression Review

15 / 38

Regression Review: Intercept

  • Most basic regression

    • One variable: one y, no x term in regression

    • lm(y ~ 1, data = some_data)

    • Easy to conceptualize

    • Easy to plot in two-dimensions
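A minimal sketch of the intercept-only model, using simulated data (the data frame `some_data` and its values are made up for illustration). With no x term, the fitted intercept should simply be the sample mean of y.

```r
# Hypothetical data: 50 simulated outcomes
set.seed(90)
some_data <- data.frame(y = rnorm(50, mean = 10, sd = 2))

# Intercept-only regression: nothing but a constant on the right-hand side
fit_intercept <- lm(y ~ 1, data = some_data)

# The estimated intercept equals the sample mean of y
intercept_estimate <- unname(coef(fit_intercept)["(Intercept)"])
sample_mean <- mean(some_data$y)
all.equal(intercept_estimate, sample_mean)
```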

16 / 38

Regression Review: Intercept + Dummy

  • Simple regression

    • Two variables: one y, one binary x

    • lm(y ~ x_binary, data = some_data)

    • Easy to conceptualize

    • Easy to plot in two-dimensions
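A short sketch of the intercept + dummy model on simulated data (all values are assumptions for illustration). It checks that the intercept is the mean of the group coded 0 and that the dummy coefficient is the difference between the two group means.

```r
# Hypothetical data: 25 observations in each of two groups
set.seed(90)
some_data <- data.frame(x_binary = rep(c(0, 1), each = 25))
some_data$y <- 10 + 3 * some_data$x_binary + rnorm(50)

fit_dummy <- lm(y ~ x_binary, data = some_data)

# Group means computed directly from the data
group_means <- tapply(some_data$y, some_data$x_binary, mean)

# Intercept = mean of the x_binary == 0 group
intercept_estimate <- unname(coef(fit_dummy)["(Intercept)"])
# Dummy coefficient = intercept shift = difference in group means
dummy_estimate <- unname(coef(fit_dummy)["x_binary"])
mean_difference <- unname(group_means["1"] - group_means["0"])

all.equal(dummy_estimate, mean_difference)
```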

17 / 38

Regression Review: Intercept + Slope

  • Simple regression

    • Two variables: one y, one continuous x

    • lm(y ~ x_continuous, data = some_data)

    • Easy to conceptualize

    • Easy to plot in two-dimensions
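A brief sketch of the intercept + slope model on simulated data (values assumed for illustration). With a single continuous x, the least-squares slope can be reproduced by hand as the covariance of x and y divided by the variance of x.

```r
# Hypothetical data: one continuous predictor
set.seed(90)
some_data <- data.frame(x_continuous = runif(50, min = 0, max = 10))
some_data$y <- 2 + 0.5 * some_data$x_continuous + rnorm(50)

fit_slope <- lm(y ~ x_continuous, data = some_data)

# Slope from lm() versus the textbook formula cov(x, y) / var(x)
slope_estimate <- unname(coef(fit_slope)["x_continuous"])
slope_by_hand <- cov(some_data$x_continuous, some_data$y) /
  var(some_data$x_continuous)

all.equal(slope_estimate, slope_by_hand)
```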

18 / 38

Regression Review: Intercept + Slope + Dummy

  • Simple regression

    • Three variables: one y, one continuous x and one binary x

    • lm(y ~ x_continuous + x_binary, data = some_data)

    • Easy to conceptualize

    • Easy to plot in two-dimensions
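A sketch of why this model still plots in two dimensions, using simulated data (values assumed for illustration). The dummy only shifts the intercept, so the two fitted lines are parallel: the gap between the groups is the same at every value of x.

```r
# Hypothetical data: one continuous and one binary predictor
set.seed(90)
some_data <- data.frame(
  x_continuous = runif(60, min = 0, max = 10),
  x_binary = rep(c(0, 1), times = 30)
)
some_data$y <- 2 + 0.5 * some_data$x_continuous +
  3 * some_data$x_binary + rnorm(60)

fit_both <- lm(y ~ x_continuous + x_binary, data = some_data)

# Predict for both groups at two different x values
grid <- expand.grid(x_continuous = c(2, 8), x_binary = c(0, 1))
grid$fitted <- unname(predict(fit_both, newdata = grid))

# Gap between the groups at x = 2 and at x = 8
gap <- grid$fitted[grid$x_binary == 1] - grid$fitted[grid$x_binary == 0]
dummy_estimate <- unname(coef(fit_both)["x_binary"])
gap
```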

19 / 38

Regression with Quadratic Terms

20 / 38

Regression Overview: Four Components




  1. Intercept

  2. Intercept shifts

  3. Slopes

  4. Slope shifts

21 / 38

Quadratic Model

Corn Yields

22 / 38

Corn yields vs rainfall

  • Data were collected on corn yield and rainfall in six U.S. corn-producing states (Iowa, Nebraska, Illinois, Indiana, Missouri, and Ohio), recorded for each year from 1890 to 1927.

  • Although increasing rainfall is associated with higher mean yields for rainfalls up to 12 inches, increasing rainfall at higher levels is associated with no change or perhaps a decrease in mean yield.

  • Why might that be?

23 / 38

Corn yields vs rainfall

Source: Statistical Sleuth, Display 9.6

24 / 38

Multiple Regression: Quadratic term

  • Multiple regression

    • Two variables: one y, one x entered as x + x^2

    • lm(y ~ x + I(x*x), data = some_data)

    • Can still be plotted in two dimensions
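A small sketch, on simulated data (values assumed for illustration), of why the squared term is wrapped in `I()`. Inside an R formula, `^` is a formula operator rather than arithmetic, so `y ~ x + x^2` collapses to `y ~ x`; `I(x * x)` and `I(x^2)` both force the arithmetic square.

```r
# Hypothetical data with a curved relationship
set.seed(90)
some_data <- data.frame(x = runif(80, min = 0, max = 10))
some_data$y <- 1 + 4 * some_data$x - 0.3 * some_data$x^2 + rnorm(80)

# Two equivalent ways to request a quadratic term
fit_product <- lm(y ~ x + I(x * x), data = some_data)
fit_power <- lm(y ~ x + I(x^2), data = some_data)

# Without I(), the formula drops the squared term entirely
fit_no_I <- lm(y ~ x + x^2, data = some_data)

length(coef(fit_product))  # intercept, x, and x squared
length(coef(fit_no_I))     # intercept and x only
```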

25 / 38

Case: Corn Data

library(magrittr) # provides %>%
library(janitor)  # provides clean_names()
corn <- Sleuth3::ex0915 %>% clean_names()
head(corn, 15)
#>    year yield rainfall
#> 1  1890  24.5      9.6
#> 2  1891  33.7     12.9
#> 3  1892  27.9      9.9
#> 4  1893  27.5      8.7
#> 5  1894  21.7      6.8
#> 6  1895  31.9     12.5
#> 7  1896  36.8     13.0
#> 8  1897  29.9     10.1
#> 9  1898  30.2     10.1
#> 10 1899  32.0     10.1
#> 11 1900  34.0     10.8
#> 12 1901  19.4      7.8
#> 13 1902  36.0     16.2
#> 14 1903  30.2     14.1
#> 15 1904  32.4     10.6
26 / 38

Visualizing Corn Data

library(ggplot2)
ggplot(data = corn) +
  aes(x = rainfall, y = yield) +
  geom_point() +
  geom_smooth(method = "loess") # flexible smoother to reveal curvature

27 / 38

Regression Table

lm1 <- lm(yield ~ rainfall, data = corn)
lm2 <- lm(yield ~ rainfall + I(rainfall^2), data = corn)

                Dependent variable: yield
                (1)                (2)
rainfall        0.776** (0.294)    6.004*** (2.039)
I(rainfall^2)                      -0.229** (0.089)
Constant        23.550*** (3.236)  -5.015 (11.440)
Observations    38                 38
R^2             0.162              0.297
Adjusted R^2    0.139              0.256
Note: *p<0.1; **p<0.05; ***p<0.01
28 / 38

Equation with rainfall squared

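$$\mu\{\text{yield} \mid \text{rainfall}\} = \beta_0 + \beta_1(\text{rainfall}) + \beta_2(\text{rainfall}^2)$$

$$\mu\{\text{yield} \mid \text{rainfall}\} = -5.015 + 6.004(\text{rainfall}) - 0.229(\text{rainfall}^2)$$

  • Example: rainfall = 9.6

$$\mu\{\text{yield} \mid \text{rainfall}\} = -5.015 + 6.004(9.6) - 0.229(9.6^2)$$

$$\mu\{\text{yield} \mid \text{rainfall}\} = -5.015 + 57.638 - 21.105$$

$$\mu\{\text{yield} \mid \text{rainfall}\} = 31.518$$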
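The hand calculation can be checked with a few lines of R. This is a sketch that hard-codes the rounded coefficients from column (2) of the regression table; the helper name `mean_yield` is made up for illustration, and results from the unrounded `lm2` fit may differ slightly in the later decimals.

```r
# Rounded coefficients copied from column (2) of the regression table
b0 <- -5.015 # constant
b1 <- 6.004  # rainfall
b2 <- -0.229 # rainfall squared

# Estimated mean yield at a given rainfall (hypothetical helper)
mean_yield <- function(rainfall) {
  b0 + b1 * rainfall + b2 * rainfall^2
}

# Evaluate at the three rainfall values used in the worked examples
round(mean_yield(c(9.6, 12.9, 16.5)), 2)
# 31.52 34.33 31.71
```

With the fitted model object available, `predict(lm2, newdata = data.frame(rainfall = 9.6))` would do the same job using the unrounded estimates.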

29 / 38

Visualizing rainfall = 9.6

library(sjPlot) # provides plot_model()
plot_model(lm2, type = "pred", terms = "rainfall", show.data = TRUE) +
  geom_vline(xintercept = 9.6, col = "purple", linetype = "dashed")

30 / 38

Equation with rainfall = 12.9

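$$\mu\{\text{yield} \mid \text{rainfall}\} = \beta_0 + \beta_1(\text{rainfall}) + \beta_2(\text{rainfall}^2)$$

$$\mu\{\text{yield} \mid \text{rainfall}\} = -5.015 + 6.004(\text{rainfall}) - 0.229(\text{rainfall}^2)$$

  • Example: rainfall = 12.9

$$\mu\{\text{yield} \mid \text{rainfall}\} = -5.015 + 6.004(12.9) - 0.229(12.9^2)$$

$$\mu\{\text{yield} \mid \text{rainfall}\} = -5.015 + 77.452 - 38.108$$

$$\mu\{\text{yield} \mid \text{rainfall}\} = 34.329$$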

31 / 38

Visualizing rainfall = 12.9

plot_model(lm2, type = "pred", terms = "rainfall", show.data = TRUE) +
  geom_vline(xintercept = 12.9, col = "purple", linetype = "dashed")

32 / 38

Equation with rainfall = 16.5

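$$\mu\{\text{yield} \mid \text{rainfall}\} = \beta_0 + \beta_1(\text{rainfall}) + \beta_2(\text{rainfall}^2)$$

$$\mu\{\text{yield} \mid \text{rainfall}\} = -5.015 + 6.004(\text{rainfall}) - 0.229(\text{rainfall}^2)$$

  • Example: rainfall = 16.5

$$\mu\{\text{yield} \mid \text{rainfall}\} = -5.015 + 6.004(16.5) - 0.229(16.5^2)$$

$$\mu\{\text{yield} \mid \text{rainfall}\} = -5.015 + 99.066 - 62.345$$

$$\mu\{\text{yield} \mid \text{rainfall}\} = 31.706$$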

33 / 38

Visualizing rainfall = 16.5

plot_model(lm2, type = "pred", terms = "rainfall", show.data = TRUE) +
  geom_vline(xintercept = 16.5, col = "purple", linetype = "dashed")

34 / 38

What do we mean by “slope shift”?

# set up some plausible rainfall values
rainfall_values <- 7:16
rainfall_values
#>  [1]  7  8  9 10 11 12 13 14 15 16
# linear term: 6.004 * rainfall
term1 <- 6.004 * rainfall_values
term1
#>  [1] 42.03 48.03 54.04 60.04 66.04 72.05 78.05 84.06 90.06 96.06
# quadratic term: 0.229 * rainfall^2 (subtracted below)
term2 <- 0.229 * rainfall_values^2
term2
#>  [1] 11.22 14.66 18.55 22.90 27.71 32.98 38.70 44.88 51.52 58.62
# estimated mean yield at each rainfall value
-5.015 + term1 - term2
#>  [1] 25.79 28.36 30.47 32.12 33.32 34.06 34.34 34.16 33.52 32.42
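One way to make the "slope shift" explicit is to compute the slope of the fitted curve at each rainfall value. This sketch assumes the rounded coefficients from the regression table; the slope of a quadratic is its derivative, b1 + 2 * b2 * rainfall, so each extra inch of rainfall shifts the slope by 2 * b2.

```r
# Rounded coefficients copied from column (2) of the regression table
b1 <- 6.004  # rainfall
b2 <- -0.229 # rainfall squared

rainfall_values <- 7:16

# Slope of the fitted curve at each rainfall value (derivative of the quadratic)
slope_at <- b1 + 2 * b2 * rainfall_values
round(slope_at, 3)
# 2.798 2.340 1.882 1.424 0.966 0.508 0.050 -0.408 -0.866 -1.324

# The slope shifts by the same amount, 2 * b2, for every extra inch
diff(slope_at)
```

The slope is positive at low rainfall, close to zero near 13 inches, and negative beyond that, which matches the rise and fall in the predicted yields above.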
35 / 38

interactions package

lm3 <- lm(yield ~ rainfall + I(rainfall^2) + year, data = corn)
interactions::interact_plot(
  lm3,               # pick a model to plot
  pred = "rainfall", # this variable will be your x-axis
  modx = "year",     # a moderator, i.e. a control we think is important
  plot.points = TRUE # plot the data points
)

36 / 38

When to include quadratic terms?

  • As with interaction terms, quadratic terms should not routinely be included.

  • Consider in four situations:

    • When the analyst has good reason to suspect that the response is nonlinear in some explanatory variable (through knowledge of the process or by graphical examination);

    • When the question of interest calls for finding the values that maximize or minimize the mean response;

    • When careful modeling of the regression is called for by the questions of interest (and presumably this is only the case if there are just a few explanatory variables);

    • Or when inclusion is used to produce a rich model for assessing the fit of an inferential model.

Statistical Sleuth 3e, 10.4.4, p 295
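For the second situation (finding the value that maximizes the mean response), a quadratic fit gives a closed-form answer: the curve peaks at -b1 / (2 * b2). The sketch below applies this to the corn example, assuming the rounded coefficients from the regression table; `mean_yield` is a hypothetical helper, and `optimize()` is used only as a numerical cross-check.

```r
# Rounded coefficients copied from column (2) of the regression table
b0 <- -5.015
b1 <- 6.004
b2 <- -0.229

# Estimated mean yield at a given rainfall (hypothetical helper)
mean_yield <- function(rainfall) {
  b0 + b1 * rainfall + b2 * rainfall^2
}

# Closed-form peak of the parabola (a maximum because b2 < 0)
peak_rainfall <- -b1 / (2 * b2)
peak_yield <- mean_yield(peak_rainfall)
round(c(peak_rainfall, peak_yield), 2)
# 13.11 34.34

# Numerical cross-check over the range of rainfall values considered earlier
numeric_peak <- optimize(mean_yield, interval = c(7, 16), maximum = TRUE)
numeric_peak$maximum
```

Under these rounded estimates, the estimated mean yield peaks at roughly 13 inches of rainfall.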

37 / 38

Questions?

38 / 38
