Kable

This tutorial explains how to make a table with the kable function. This type of table is especially useful for summary statistics of your data, such as the mean, median, and number of observations. It is also useful for displaying selected rows of your data or the results of a t-test.

If you do not have kableExtra yet, use install.packages to install it. Then we’ll load the package and go through an example. The example data come from the Sleuth3 package: access the data set ex0222 and give it a meaningful name. It contains scores on the Armed Forces Qualifying Test (AFQT), which has been used as a test of intelligence. The study was done to address controversial claims about the intelligence of women versus men. In particular, the test gives a score for arithmetic reasoning, word knowledge, paragraph comprehension, and mathematical knowledge.
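One way to handle the installation step is to install only the packages that are missing. The sketch below assumes the package list used in this tutorial and the CRAN cloud mirror; needed and missing_pkgs are illustrative names, not part of any package.

```r
# packages used in this tutorial (adjust the list to your own needs)
needed <- c("dplyr", "broom", "xtable", "kableExtra", "Sleuth3", "janitor")

# keep only the packages that are not installed yet
installed    <- rownames(installed.packages())
missing_pkgs <- setdiff(needed, installed)

# install what is missing; does nothing if everything is already there
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs, repos = "https://cloud.r-project.org")
}
```

Run this once in the console rather than inside a knitted document, so that knitting does not trigger installations.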

# set global options
knitr::opts_chunk$set(echo = TRUE)

# tidyverse packages
library(dplyr)  
library(broom)

# table packages
library(xtable)
library(kableExtra)

# load data
AFQT <- Sleuth3::ex0222
AFQT %>% glimpse() # look at data

Rows: 2,584
Columns: 6
$ Gender <fct> male, female, male, female, female, female, female, m…
$ Arith  <int> 19, 23, 30, 30, 13, 8, 10, 4, 12, 3, 30, 10, 10, 28, …
$ Word   <int> 27, 34, 35, 35, 30, 15, 17, 17, 33, 11, 33, 16, 16, 3…
$ Parag  <int> 14, 11, 14, 13, 11, 6, 6, 6, 13, 5, 15, 3, 11, 14, 5,…
$ Math   <int> 14, 20, 25, 21, 12, 4, 7, 6, 11, 6, 24, 7, 6, 18, 7, …
$ AFQT   <dbl> 70.3, 60.4, 98.3, 84.7, 44.5, 4.0, 11.8, 8.9, 44.7, 2…

Now that we’ve loaded and explored the data a little, let’s use kable to make a summary statistics table that gives the average math, word, paragraph, and arithmetic scores by gender.

Creating summary statistics

To do this, first make a data frame for your summary statistics.

# silence noisy messages from summarize command
options(dplyr.summarise.inform = FALSE)

# calculate summary stats
summary_stats <- AFQT %>%
  group_by(Gender) %>%
  summarize(
    mean_Arith         = mean(Arith),
    mean_Word          = mean(Word),
    mean_Paragraph     = mean(Parag),
    mean_Math          = mean(Math),
    number_of_subjects = n()
  )

And now, let’s make the table! Take the summary statistics data frame you just made and pipe it to kable. In the kable function, format controls how the table is rendered after knitting: use "html" when knitting to HTML and "latex" when knitting to PDF. caption adds a title, booktabs makes the table neater when knitting to PDF, and digits sets the number of decimal places. Then pipe the result to kable_styling to add nicer formatting, such as striped rows and an adjusted width.

summary_stats %>%
  kable(
    format   = "html",
    caption  = "Test Scores Summary by Gender",
    booktabs = TRUE,
    digits   = 2
  ) %>%
  kable_styling(
    bootstrap_options = "striped",
    full_width        = FALSE
  )

Table 1: Test Scores Summary by Gender
Gender	mean_Arith	mean_Word	mean_Paragraph	mean_Math	number_of_subjects
female	17.5	26.6	11.5	13.8	1278
male	19.5	26.6	10.9	14.6	1306
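If the same document is sometimes knitted to HTML and sometimes to PDF, the format can be chosen automatically instead of edited by hand. The sketch below uses knitr::is_latex_output() for that and the col.names argument of kable for more readable headers. The name table_format and the header labels are illustrative choices.

```r
library(dplyr)
library(kableExtra)

AFQT <- Sleuth3::ex0222

# pick the table format from the document being knitted;
# this falls back to "html" when the code is run outside of knitting
table_format <- if (knitr::is_latex_output()) "latex" else "html"

summary_table <- AFQT %>%
  group_by(Gender) %>%
  summarize(
    mean_Arith         = mean(Arith),
    mean_Word          = mean(Word),
    mean_Paragraph     = mean(Parag),
    mean_Math          = mean(Math),
    number_of_subjects = n()
  ) %>%
  kable(
    format    = table_format,
    caption   = "Test Scores Summary by Gender",
    # one label per column, in the same order as the data frame
    col.names = c("Gender", "Arithmetic", "Word", "Paragraph", "Math", "N"),
    booktabs  = TRUE,
    digits    = 2
  ) %>%
  kable_styling(full_width = FALSE)

summary_table
```

col.names only changes what is displayed; the underlying data frame keeps its original column names.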

Working with t-tests

One interesting question for researchers was whether results differed significantly between genders on each section of the test. So let’s run a t-test for each section and make a table for each using kable.

names(AFQT)

[1] "Gender" "Arith"  "Word"   "Parag"  "Math"   "AFQT"

t_arith <- t.test(Arith ~ Gender, data = AFQT)
t_word  <- t.test(Word  ~ Gender, data = AFQT)
t_parag <- t.test(Parag ~ Gender, data = AFQT)
t_math  <- t.test(Math  ~ Gender, data = AFQT)

# Converts t.test object to data.frame
t_arith_df <- tidy(t_arith)
t_word_df  <- tidy(t_word)
t_parag_df <- tidy(t_parag)
t_math_df  <- tidy(t_math)


t_arith_df %>%
  dplyr::select(-method, -alternative) %>% # drop extra cols
  # rename to make names more understandable
  # enclosing a column name in backticks allows for spaces
  rename(
    `Mean Group Female` = estimate1,
    `Mean Group Male`   = estimate2,
    `t-statistic`       = statistic,
    df                  = parameter
  ) %>%
  kable(
    format   = "html",
    caption  = "t-test for Arithmetic vs Gender",
    booktabs = TRUE,
    digits   = 2
  ) %>%
  kable_styling(
    bootstrap_options = "striped",
    full_width        = FALSE
  )

Table 2: t-test for Arithmetic vs Gender
estimate	Mean Group Female	Mean Group Male	t-statistic	p.value	df	conf.low	conf.high
-2.04	17.5	19.5	-7.31	0	2574	-2.58	-1.49

t_math_df %>%
  dplyr::select(-method, -alternative) %>% # drop extra cols
  # rename to make names more understandable
  # enclosing a column name in backticks allows for spaces
  rename(
    `Mean Group Female` = estimate1,
    `Mean Group Male`   = estimate2,
    `t-statistic`       = statistic,
    df                  = parameter
  ) %>%
  kable(
    format   = "html",
    caption  = "t-test for Math vs Gender",
    booktabs = TRUE,
    digits   = 2
  ) %>%
  kable_styling(
    bootstrap_options = "striped",
    full_width = FALSE
  )

Table 2: t-test for Math vs Gender
estimate	Mean Group Female	Mean Group Male	t-statistic	p.value	df	conf.low	conf.high
-0.75	13.8	14.6	-3.05	0	2573	-1.24	-0.27

t_word_df %>%
  dplyr::select(-method, -alternative) %>% # drop cols
  # rename to make names more understandable
  # enclosing a column name in backticks allows for spaces
  rename(
    `Mean Group Female` = estimate1,
    `Mean Group Male`   = estimate2,
    `t-statistic`       = statistic,
    df                  = parameter
  ) %>%
  kable(
    format   = "html",
    caption  = "t-test for Word vs Gender",
    booktabs = TRUE,
    digits   = 2
  ) %>%
  kable_styling(
    bootstrap_options = "striped",
    full_width        = FALSE
  )

Table 2: t-test for Word vs Gender
estimate	Mean Group Female	Mean Group Male	t-statistic	p.value	df	conf.low	conf.high
0.02	26.6	26.6	0.08	0.94	2581	-0.52	0.57

t_parag_df %>%
  dplyr::select(-method, -alternative) %>% # drop cols
  # renames to make names more understandable
  # enclosing a column name in backticks allows for spaces
  rename(
    `Mean Group Female` = estimate1,
    `Mean Group Male`   = estimate2,
    `t-statistic`       = statistic,
    df                  = parameter
  ) %>%
  kable(
    format = "html",
    caption = "t-test for Paragraph vs Gender",
    booktabs = TRUE,
    digits   = 2
  ) %>%
  kable_styling(
    bootstrap_options = "striped",
    full_width = FALSE
  )

Table 2: t-test for Paragraph vs Gender
estimate	Mean Group Female	Mean Group Male	t-statistic	p.value	df	conf.low	conf.high
0.57	11.5	10.9	4.6	0	2562	0.33	0.81
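The four tables above repeat the same steps, and with digits = 2 the very small p-values are rounded to 0. As a sketch of one alternative, the tests can be run in a loop, stacked into a single data frame, and the p-values formatted with format.pval so that tiny values print as "<0.001". The names sections, all_tests and all_tests_df are illustrative.

```r
library(dplyr)
library(broom)
library(kableExtra)

AFQT <- Sleuth3::ex0222

sections <- c("Arith", "Word", "Parag", "Math")

# run one t-test per section; reformulate() builds e.g. Arith ~ Gender
all_tests <- lapply(sections, function(section) {
  t.test(reformulate("Gender", response = section), data = AFQT) %>%
    tidy()
})
names(all_tests) <- sections

# stack the results; the list names become the Section column
all_tests_df <- bind_rows(all_tests, .id = "Section") %>%
  select(-method, -alternative) %>%
  rename(
    `Mean Group Female` = estimate1,
    `Mean Group Male`   = estimate2,
    `t-statistic`       = statistic,
    df                  = parameter
  ) %>%
  # p-values below eps are shown as "<0.001" instead of 0
  mutate(p.value = format.pval(p.value, digits = 2, eps = 0.001))

all_tests_df %>%
  kable(
    format   = "html", # format = "latex" for pdfs
    caption  = "t-tests for each test section vs Gender",
    booktabs = TRUE,
    digits   = 2
  ) %>%
  kable_styling(
    bootstrap_options = "striped",
    full_width        = FALSE
  )
```

Because format.pval returns text, the digits argument of kable no longer applies to the p.value column.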

What conclusions can we draw from these tests? Are there confounding factors that would limit these conclusions?

Lastly, let’s make a table that shows the first five rows of the data. This is a good skill if you want a quick preview of the data before doing analysis and want to show that exploration in a neat way. You can do this using slice, which extracts rows by position.

AFQT %>%
  slice(1:5) %>% # extracts 5 rows
  kable(
    format   = "html", # format = "latex" for pdfs
    caption  = "Some AFQT data",
    digits   = 2
  ) %>%
  kable_styling(full_width = FALSE)

Table 3: Some AFQT data
Gender	Arith	Word	Parag	Math	AFQT
male	19	27	14	14	70.3
female	23	34	11	20	60.4
male	30	35	14	25	98.3
female	30	35	13	21	84.7
female	13	30	11	12	44.5
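dplyr also has helpers for other kinds of previews. The sketch below shows slice_head, slice_tail and slice_sample (available in dplyr 1.0.0 and later); the seed value is arbitrary and only makes the random preview reproducible.

```r
library(dplyr)

AFQT <- Sleuth3::ex0222

first_five <- AFQT %>% slice_head(n = 5) # same rows as slice(1:5)
last_five  <- AFQT %>% slice_tail(n = 5) # last 5 rows

set.seed(90) # arbitrary seed so the same rows are drawn each time
random_five <- AFQT %>% slice_sample(n = 5) # 5 rows drawn at random

random_five %>%
  knitr::kable(
    format  = "html", # format = "latex" for pdfs
    caption = "Five random rows of AFQT data",
    digits  = 2
  )
```

A random sample can be a more honest preview than the first rows when the data are sorted in some way.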

xtable

xtable is another table package; it prints an R object as either a LaTeX or HTML table. In this section, we will run through sample code for ANOVA tables, as well as some tips and tricks for using xtable.

Uses of xtable

The xtable function converts many kinds of R objects into a new object of class xtable, which print then renders as a table. Two common examples are ANOVA tables and tables of whole data frames. In POL90, xtable is most commonly used for ANOVA tables.

Data frames as tables

First, let’s read in an example data frame that we can work with in our tables. The following data set covers the Kentucky Derby from 1896 to 2011: the year of each race, the winner, the winner’s average speed, and the track conditions.

derby <- Sleuth3::ex0920 %>% janitor::clean_names()

To show how to use xtable to print the data frame as a table, we will start by using head(derby) to print just the first six rows of the data frame.

head(derby) %>% 
  xtable() %>%
  print(type = "html") # change to type = "latex" for PDF output

	year	winner	starters	net_to_winner	time	speed	track	conditions
1	1896	Ben Brush	8	4850	127.75	35.23	Dusty	Fast
2	1897	Typhoon II	6	4850	132.50	33.96	Heavy	Slow
3	1898	Plaudit	4	4850	129.00	34.88	Good	Fast
4	1899	Manuel	5	4850	132.00	34.09	Fast	Fast
5	1900	Lieut. Gibson	7	4850	126.25	35.64	Fast	Fast
6	1901	His Eminence	5	4850	127.75	35.23	Fast	Fast

There are two important things to remember here. First, for the table to render, you must set results = 'asis' in the chunk header. Second, notice print(type = "html"). Change this to print(type = "latex") when knitting to PDF.
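Both points can be sketched in one chunk. The chunk header is shown as a comment, with "derby-table" as an arbitrary label, and the output type is picked with knitr::is_latex_output() rather than edited by hand. The name xtable_type is illustrative.

```r
# In the .Rmd file, the chunk would start with a header such as
#   {r derby-table, results = 'asis'}
# so that the raw HTML or LaTeX is passed through to the document.

library(dplyr)
library(xtable)

derby <- Sleuth3::ex0920 %>% janitor::clean_names()

# choose the output type from the document being knitted;
# this falls back to "html" when run outside of knitting
xtable_type <- if (knitr::is_latex_output()) "latex" else "html"

head(derby) %>%
  xtable() %>%
  print(type = xtable_type)
```

Without results = 'asis', the markup is printed as plain text instead of being rendered as a table.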

ANOVA Tables

Now we will discuss the more common type of table made with xtable: an ANOVA table. An ANOVA table compares nested models of the data to see whether the extra terms in a larger model improve the fit. Suppose we have three linear regression models for our derby data, as shown below.

full     <- lm(data = derby, speed ~ year + track)
reduced  <- lm(data = derby, speed ~ year)
interact <- lm(data = derby, speed ~ year + track + year * track)

To compare the fits of these three models, we make an ANOVA table with xtable, as shown below. Here, an ANOVA object is piped into xtable. Once again, don’t forget to include results = 'asis' in the chunk header to see the table when knitting.

anova(reduced, full, interact) %>%
  xtable() %>%
  print(type = "html") # change to type = "latex" for PDF output

	Res.Df	RSS	Df	Sum of Sq	F	Pr(>F)
1	114	41.84
2	108	21.37	6	20.46	17.03	0.0000
3	103	20.62	5	0.75	0.75	0.5893
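The F column can be reproduced by hand, which may help when reading the table. Each F statistic is the drop in RSS per added degree of freedom, divided by the residual mean square of the largest model (here interact). For the second row, using the rounded values printed above, that is (20.46 / 6) / (20.62 / 103) ≈ 17.03. The sketch below does the same calculation in R; anova_tab, resid_ms and f_by_hand are illustrative names.

```r
library(dplyr)

derby <- Sleuth3::ex0920 %>% janitor::clean_names()

full     <- lm(data = derby, speed ~ year + track)
reduced  <- lm(data = derby, speed ~ year)
interact <- lm(data = derby, speed ~ year + track + year * track)

anova_tab <- anova(reduced, full, interact)

# residual mean square of the largest model (fewest residual df)
biggest  <- which.min(anova_tab$Res.Df)
resid_ms <- anova_tab$RSS[biggest] / anova_tab$Res.Df[biggest]

# drop in RSS per added degree of freedom, scaled by resid_ms;
# the first row is NA because it has no smaller model to compare to
f_by_hand <- (anova_tab$`Sum of Sq` / anova_tab$Df) / resid_ms

data.frame(F_reported = anova_tab$F, F_by_hand = f_by_hand)
```

A large F with a small Pr(>F) suggests that the terms added in that row improve the fit.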

Linear Regression

Regression output can also be displayed with xtable, as seen below with a linear regression. The same approach works for glm models. However, stargazer is most likely the better option in this case, as it is better at producing a professional-looking regression table with stars showing statistical significance.

full %>%
  xtable() %>%
  print(type = "html")

	Estimate	Std. Error	t value	Pr(>|t|)
(Intercept)	6.0947	2.6061	2.34	0.0212
year	0.0154	0.0014	11.35	0.0000
trackFast	0.3247	0.4556	0.71	0.4776
trackGood	0.0208	0.4739	0.04	0.9650
trackHeavy	-1.3254	0.4809	-2.76	0.0069
trackMuddy	-0.7660	0.4845	-1.58	0.1168
trackSloppy	-0.3726	0.5068	-0.74	0.4638
trackSlow	-0.3714	0.4895	-0.76	0.4496
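As a sketch of the glm case, the model below is hypothetical and chosen only to have a binary outcome: it asks whether the conditions were recorded as "Fast" as a function of year. The column fast and the object fast_model are illustrative and not part of the original analysis.

```r
library(dplyr)
library(xtable)

derby <- Sleuth3::ex0920 %>%
  janitor::clean_names() %>%
  # hypothetical binary outcome: 1 if conditions were "Fast", 0 otherwise
  mutate(fast = as.integer(conditions == "Fast"))

# logistic regression, for illustration only
fast_model <- glm(fast ~ year, data = derby, family = binomial)

fast_model %>%
  xtable() %>%
  print(type = "html") # change to type = "latex" for PDF output
```

The resulting table has the same layout as the linear regression table, with z values in place of t values.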

xtable Tips & Tricks

Now that we have discussed what xtable can be used for and the barebones code of how to make a table, we will discuss other options to make our tables look just how we want them.

Titles

Titles for xtable objects can be made using the caption argument.

anova(reduced, full, interact) %>%
  xtable(
    caption = "ANOVA table for Derby linear models"
    ) %>%
  print(type = "html") # change to type = "latex" for PDF output

ANOVA table for Derby linear models
	Res.Df	RSS	Df	Sum of Sq	F	Pr(>F)
1	114	41.84
2	108	21.37	6	20.46	17.03	0.0000
3	103	20.62	5	0.75	0.75	0.5893
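A few related display choices are made in print rather than in xtable. The sketch below assumes you want the caption above the table and no row numbers, using the caption.placement and include.rownames arguments of print, plus the digits argument of xtable. The value 3 is an arbitrary choice.

```r
library(dplyr)
library(xtable)

derby <- Sleuth3::ex0920 %>% janitor::clean_names()

full     <- lm(data = derby, speed ~ year + track)
reduced  <- lm(data = derby, speed ~ year)
interact <- lm(data = derby, speed ~ year + track + year * track)

anova(reduced, full, interact) %>%
  xtable(
    caption = "ANOVA table for Derby linear models",
    digits  = 3 # decimal places shown in numeric columns
  ) %>%
  print(
    type              = "html",  # change to type = "latex" for PDF output
    caption.placement = "top",   # put the caption above the table
    include.rownames  = FALSE    # drop the 1, 2, 3 row labels
  )
```

Dropping the row names makes sense here only if the model order is explained in the text, since the row numbers are what identify each model.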

Table Placement

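Oftentimes, the table floats away from where it appears in the text when knitting to PDF. To fix this, pass table.placement = "h" to print. It is an argument of print, not of xtable, and it only affects LaTeX output.

anova(reduced, full, interact) %>%
  xtable(
    caption = "ANOVA table for Derby linear models"
  ) %>%
  print(
    type            = "html", # type = "latex" for PDFs
    table.placement = "h"     # only used for LaTeX output
  )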

ANOVA table for Derby linear models
	Res.Df	RSS	Df	Sum of Sq	F	Pr(>F)
1	114	41.84
2	108	21.37	6	20.46	17.03	0.0000
3	103	20.62	5	0.75	0.75	0.5893

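If this doesn’t work on its own, you may need to add \usepackage{float} to the header of your R Markdown file, as seen below, and use table.placement = "H", which the float package provides to keep the table exactly where it is placed: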

---
title: "Example"
date: "29 April 2019"
output: pdf_document
header-includes:
  - \usepackage{float}
---

Suppressing Messages

Often, when knitting to a PDF with LaTeX, the output includes a comment that says “latex table generated in R 3.5.2 by xtable 1.8-3 package”. To suppress it, insert the following after loading the xtable library:

library(xtable)
options(xtable.comment = FALSE)
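Other print defaults can be set once in the same way, so they do not have to be repeated in every chunk. The sketch below assumes an HTML document and uses the built-in mtcars data as a stand-in; xtable.type and xtable.table.placement are the option names that print reads its type and table.placement defaults from.

```r
library(xtable)

options(
  xtable.comment         = FALSE,  # no "generated in R ... by xtable" comment
  xtable.type            = "html", # use "latex" for PDF output
  xtable.table.placement = "h"     # only used for LaTeX output
)

# print() now needs no arguments; capture the output to inspect it
out <- capture.output(print(xtable(head(mtcars))))

# TRUE if the comment line was suppressed
!any(grepl("generated in R", out))
```

Arguments passed directly to print still override these options for a single table.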

This supplement was put together by Amna Amin, Kavya Chaturvedi and Omar Wasow.

Table Guide (draft)