Supplement to POL90
Typically, when you get an error loading data, the path to the file is incorrect. Below is some brief background and a handful of possible solutions.
When working interactively in R, the program has what is called a
“working directory.” This is where R expects to start looking for files.
When R sees a command like read.csv("my_data.csv"), it looks in the
working directory and, if it doesn’t find it, will return an error. To
see your working directory, type getwd() in your CONSOLE and hit return.
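A minimal sketch of how you might check what R can see from its working directory. It uses only base R functions; "my_data.csv" is the placeholder file name used in this handout, not a real file on your computer.

```r
# Store and print the folder R is currently using as its working directory
wd <- getwd()
print(wd)

# List the files R can see in that folder
print(list.files(wd))

# TRUE if "my_data.csv" (a placeholder name) is in the working directory,
# FALSE if R would not find it there
found <- file.exists("my_data.csv")
print(found)
```

If the last line prints FALSE, read.csv("my_data.csv") will fail until the file is moved, the working directory is changed, or a fuller path is supplied.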
Separate from the R working directory, every R Markdown file assumes
that the working directory is the folder in which that Rmd file is
stored. So, if "my_file.Rmd" and "my_data.csv" are in the same folder,
then something like read.csv("my_data.csv") should work when knitting
but may not work interactively.
In lots of cases, though, our Rmd and data are not in the same
folder. Or we’re working interactively, in which case R uses the R
working directory (which may be different from where the Rmd file is
stored). Again, you can check the R working directory by typing getwd()
in your CONSOLE and hitting return.
To help R find the file, we can do one of several things:
A simple solution is to provide R with a file path that shows exactly where the file is stored on your computer.
Within R, one way to find the file path is to go to the CONSOLE area
of RStudio, type file.choose() and hit enter/return. A window will pop
up and, if you can find the file and select it, R will return the path
to that file. You then need to copy and paste that file path into your
read.csv() or read.dta(), etc. as in:
anes <- read_dta("/Users/owasow/Research/anes/anes_timeseries_2020_gss_bridge_20220408.dta")
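As a rough sketch, the file.choose() step can also be run so that the chosen path is saved in a variable and printed for copying. The variable name my_path is made up for this illustration. The interactive() check is there because file.choose() needs a person to pick a file, so it is only useful in the CONSOLE and not when knitting.

```r
# file.choose() opens a file-picker window, so only run it in an
# interactive session (the CONSOLE), not when knitting an Rmd
if (interactive()) {
  my_path <- file.choose()  # pick the file in the pop-up window
  print(my_path)            # the full path, ready to copy and paste
}
```

The printed path is what you would paste, inside quotes, into read.csv() or a similar function.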
On a Mac, you can find the path to a file with the following simple steps:
Go to Finder and locate the file on your computer. Click once on your file.
CLICK on the EDIT menu, hold down the OPTION key and select COPY “MY_FILE” AS PATHNAME.
On Windows, try:
Go to File Explorer and locate the file on your computer. Click once on your file.
Hold down SHIFT, RIGHT-CLICK on the file and select COPY AS PATH.
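One thing to watch for on Windows: copied paths use backslashes, which R treats as a special character inside ordinary quotes. Below is a minimal sketch of one way to convert such a path, assuming R 4.0.0 or later (for the r"(...)" raw-string syntax). The path shown is a made-up placeholder.

```r
# A path as copied from Windows; this one is a made-up placeholder.
# r"(...)" is a raw string (R 4.0.0 or later), so backslashes need no escaping
win_path <- r"(C:\Users\your_name\Documents\my_data.csv)"

# Replace every backslash with a forward slash, which R accepts on all systems
r_path <- gsub("\\", "/", win_path, fixed = TRUE)
print(r_path)
```

The converted path in r_path can then be passed to read.csv() or a similar function.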
There are two downsides to this approach, though.
First, if you move your file to a new folder, the path will break.
Second, if you are collaborating with others, each person will typically have a different hard-coded file path that will need to be changed depending on who is working on the code.
If you create a new folder for each of your own projects (such as a problem set or a final), one approach is to manually change the R working directory to the relevant folder. Or, if you are working on a team and do not want to use hard-coded paths, one solution is for each person to change their R working directory manually to point to the folder that contains the Rmd (and data).
An easy way to do this is to go to
RStudio -> SESSION menu -> SET WORKING DIRECTORY
and then select one of the options.
If you have your relevant Rmd open, you can select
SET WORKING DIRECTORY -> TO SOURCE FILE LOCATION
If you don’t have your relevant Rmd open, you can select
SET WORKING DIRECTORY -> CHOOSE DIRECTORY
and then manually pick the working directory.
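The same change can be made in code with the base R function setwd(). The sketch below uses tempdir() as a stand-in for your project folder, purely so the example runs on any computer. In practice you would put the path to your own folder inside setwd().

```r
# Remember where R is currently looking so we can go back afterwards
old_wd <- getwd()

# tempdir() stands in for your project folder in this sketch; in practice
# you would write something like setwd("path/to/your/project_folder")
setwd(tempdir())
print(getwd())  # now reports the new working directory

# Return to the original working directory
setwd(old_wd)
```

Note that setwd() with a full path has the same drawback as any hard-coded path: it will not work on a collaborator's computer.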
The main downside of this approach is that you have to set the working directory by hand each time; nothing happens automatically. In a class where you have lots of assignments or multiple team projects, this can be cumbersome and prone to error. For example, at the start of each new assignment, your working directory will likely still point to the previous assignment’s folder.
RStudio has an option to create what it calls “R Projects” that
automatically set the working directory to whatever folder the R Project
resides in. Creating an R Project is simple. Go to
RStudio -> FILE menu -> NEW PROJECT
If you need to create a NEW DIRECTORY where you want to do your work
(such as where your Rmd, data, etc. will go), choose NEW DIRECTORY.
If you already have a folder or directory where you want your work to
go, choose EXISTING DIRECTORY.
Once you have an RStudio project file created, there are two more simple steps:
First, when you want to open RStudio, DON’T directly open the application RStudio but, rather, go to the folder that has the R Project and open that (this will open RStudio with the relevant folder as the working directory). The icon for R Projects looks like a cube and, if you can see file extensions, the file name will end in .Rproj.
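If you are unsure whether RStudio was opened through the project, one rough check is to look for a .Rproj file in the current working directory. This is a sketch using base R only; the variable name rproj_files is made up for illustration.

```r
# Look for a file ending in .Rproj in the current working directory
rproj_files <- list.files(getwd(), pattern = "\\.Rproj$")

if (length(rproj_files) > 0) {
  message("Working directory contains an R Project: ", rproj_files[1])
} else {
  message("No .Rproj file found in: ", getwd())
}
```

If no .Rproj file is found, close RStudio and reopen it by opening the project file itself.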
Second, for any collaborative projects, I recommend you use an R package called here that helps create file paths that work across different computers:
Add library(here) at the top of your document.
Wrap here() around the file name, as in:
anes <- read_dta(here("anes_timeseries_2020_gss_bridge_20220408.dta"))
If the data file is in a subfolder of the project (for example, a folder named anes_data), include the folder in the path:
anes <- read_dta(here("anes_data/anes_timeseries_2020_gss_bridge_20220408.dta"))
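A small sketch of how here() builds a path, assuming the here package is installed. Folder and file names can also be passed as separate arguments, which avoids typing slashes. The names "anes_data" and "my_data.csv" are placeholders for this illustration.

```r
library(here)  # load once, near the top of the Rmd or script

# here() with no arguments returns the project folder the package detected
project_root <- here()
print(project_root)

# Folder and file names can be given as separate arguments;
# "anes_data" and "my_data.csv" are placeholder names for this sketch
data_path <- here("anes_data", "my_data.csv")
print(data_path)

# Confirm the file is where you expect before trying to read it
if (file.exists(data_path)) {
  my_data <- read.csv(data_path)
} else {
  message("No file found at: ", data_path)
}
```

Printing the path this way is a quick way to see exactly where R is looking when a file fails to load.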
For more on here and using a “Project Oriented Workflow” see Jenny Bryan’s post: https://www.tidyverse.org/blog/2017/12/workflow-vs-script/
More on “Project Oriented Workflow” from Jenny Bryan and Jim Hester: https://rstats.wtf/project-oriented-workflow.html
Also see Martin Chan’s “RStudio Projects and Working Directories: A Beginner’s Guide,” https://martinctc.github.io/blog/rstudio-projects-and-working-directories-a-beginner%27s-guide/