R: Sourcing Files Using a Relative Path

After a discussion with @hadley on GitHub, I realized that my question goes against the common development patterns in R.

It seems that R files that are sourced often assume that the working directory (getwd()) is set to the directory they are in. To make this work, source has a chdir argument whose default value is FALSE. When set to TRUE, it temporarily changes the working directory to the directory of the file being sourced, and restores it once that file has been evaluated.

In summary:

  1. Write each file as if its own directory were the working directory, so that any source call inside it uses paths relative to that file.

  2. To make this work, always set chdir=TRUE when you source files from another directory, e.g., source('lib/stats/big_stats.R', chdir=TRUE). Spell out TRUE rather than T, because T is an ordinary variable that can be reassigned.
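
The effect of chdir can be checked with a small throwaway project. This is a minimal sketch, not part of the original answer: the directory name chdir_demo and the files main.R and helper.R are hypothetical and are created under tempdir().

```r
# Build a throwaway project: lib/main.R sources its sibling helper.R
# using a path relative to itself.
proj <- file.path(tempdir(), "chdir_demo")
dir.create(file.path(proj, "lib"), recursive = TRUE, showWarnings = FALSE)
writeLines("helper_value <- 42", file.path(proj, "lib", "helper.R"))
writeLines("source('helper.R')", file.path(proj, "lib", "main.R"))

# With chdir = FALSE (the default), 'helper.R' is resolved against the
# caller's working directory, so the nested source() normally fails.
default_result <- suppressWarnings(tryCatch(
  {
    source(file.path(proj, "lib", "main.R"))
    "ok"
  },
  error = function(e) "failed"
))

# With chdir = TRUE, the working directory is switched to lib/ only
# while main.R is being evaluated, then switched back.
wd_before <- getwd()
source(file.path(proj, "lib", "main.R"), chdir = TRUE)
wd_after <- getwd()
```

After the second call, helper_value is defined and wd_before equals wd_after, which shows that the change of directory does not leak into the calling session.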

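For convenient sourcing of entire directories in a predictable way, I wrote sourceDir, which sources the files in a directory in alphabetical order.

sourceDir <- function (path, pattern = "\\.[rR]$", env = globalenv(), chdir = TRUE) 
{
    # List the matching files with full paths, in alphabetical order.
    files <- sort(dir(path, pattern = pattern, full.names = TRUE))
    # Evaluate each file in env; chdir = TRUE resolves relative paths
    # inside a file against that file's own directory.
    invisible(lapply(files, source, local = env, chdir = chdir))
}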
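
A usage sketch for sourceDir follows. The function is restated in compact form so that the block runs on its own, and the directory sourceDir_demo and its file names are hypothetical. Numeric prefixes are one way (an assumption here, not something the original answer prescribes) to make the alphabetical load order explicit.

```r
# Compact restatement of sourceDir so this block is self-contained.
sourceDir <- function(path, pattern = "\\.[rR]$", chdir = TRUE) {
  files <- sort(dir(path, pattern = pattern, full.names = TRUE))
  invisible(lapply(files, source, chdir = chdir))
}

# Hypothetical directory with two scripts and one non-R file.
demo_dir <- file.path(tempdir(), "sourceDir_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("load_order <- 'first'", file.path(demo_dir, "01_setup.R"))
writeLines("load_order <- c(load_order, 'second')",
           file.path(demo_dir, "02_model.R"))
writeLines("this is not R code", file.path(demo_dir, "notes.txt"))

# Only the files matching the pattern are sourced, in sorted order.
sourceDir(demo_dir)
print(load_order)
```

Because 02_model.R appends to a variable created by 01_setup.R, the call only succeeds when the files are sourced in sorted order, and notes.txt is skipped by the pattern.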

R: source() and path to source files

If you are distributing a script to colleagues, you should really not be writing a script that sources other scripts. What if you want to rename or move functions.R in the future? What if you need to modify a function in functions.R, but wrapper.R relies on the older version of that function? It's a flimsy solution that will cause headaches. I would recommend one of the following instead.

  1. Put everything needed into a single, self-contained script and distribute that.

  2. If you really want to separate code into different files, write a package. It might sound like overkill, but packages can actually be very simple and lightweight. In its simplest form, a package is just a directory with a DESCRIPTION file and a NAMESPACE file, along with an R/ directory. Hadley breaks this down nicely: https://r-pkgs.org/whole-game.html.
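
To give a sense of how small such a package can be, the sketch below writes the three pieces by hand with base R. The package name mytools, the function add_one, and the author details are all hypothetical placeholders, and the skeleton is created under tempdir().

```r
# Hypothetical package directory with the R/ subdirectory.
pkg <- file.path(tempdir(), "mytools")
dir.create(file.path(pkg, "R"), recursive = TRUE, showWarnings = FALSE)

# DESCRIPTION: package metadata in "Field: value" (DCF) format.
writeLines(c(
  "Package: mytools",
  "Title: Helper Functions Shared With Colleagues",
  "Version: 0.0.1",
  "Author: Your Name",
  "Maintainer: Your Name <you@example.com>",
  "Description: Functions that would otherwise live in functions.R.",
  "License: GPL-3"
), file.path(pkg, "DESCRIPTION"))

# NAMESPACE: lists the functions visible to users of the package.
writeLines("export(add_one)", file.path(pkg, "NAMESPACE"))

# R/: ordinary function definitions, one or more files.
writeLines("add_one <- function(x) x + 1", file.path(pkg, "R", "functions.R"))

# Read the metadata back and list the resulting files.
desc <- read.dcf(file.path(pkg, "DESCRIPTION"))
pkg_files <- list.files(pkg, recursive = TRUE)
print(pkg_files)
```

A directory like this can then be installed from source, for example with install.packages(pkg, repos = NULL, type = "source"), after which colleagues call library(mytools) instead of sourcing files by path.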

How to use Rstudio relative paths

You could change the working directory. At the beginning of the script, check the current working directory with getwd(), then switch to your project folder with setwd(). After that, any file can be accessed with a path relative to the project folder, e.g., read.table("./folder/file.R").
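
As a minimal sketch of that sequence, assuming a hypothetical project folder my_project created under tempdir() and a hypothetical data file folder/file.txt:

```r
# Hypothetical project folder with one small data file.
project_dir <- file.path(tempdir(), "my_project")
dir.create(file.path(project_dir, "folder"), recursive = TRUE,
           showWarnings = FALSE)
write.table(data.frame(x = 1:3, y = c("a", "b", "c")),
            file.path(project_dir, "folder", "file.txt"),
            row.names = FALSE)

old_wd <- getwd()     # remember where the session started
setwd(project_dir)    # make the project folder the working directory

# Relative paths now resolve against project_dir.
dat <- read.table("./folder/file.txt", header = TRUE)

setwd(old_wd)         # restore the original working directory
```

Restoring the previous directory at the end keeps the change from affecting whatever runs next in the same session.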

Relative paths in R: how to avoid my computer being set on fire?

There are many ways to organize code and data for use with R. Given that the "arsonist" described in the OP has rejected at least two approaches for locating the project files in an R script, the best next step is to ask the arsonist how s/he performs this function, and adjust your code and file structures accordingly.

UPDATE: Since the "arsonist" appears to be someone who wrote an article on Tidyverse.org (see the Tidyverse article in the OP) and an answer on SO (see the additional links in the OP), your computer appears to be relatively safe.

If you are sharing code or executing it with batch processes where the "user" is someone other than you, a useful approach is to place the code, data, and configuration under version control, and develop a runbook to explain how others can retrieve the components and execute them on another computer.

As noted in the comments to the OP, there's nothing wrong with here::here() if its use can be made reliable through documentation in a runbook.

I structure all of my R code into Projects within RStudio, which are organized into a gitrepositories directory. All of the projects can be accessed as subdirectories from the gitrepositories directory. If I need to share a project, I make the project accessible to other users on GitHub.

In my R code, I reference external files by paths relative to the project root directory, such as ./data/gen01.csv.
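
A sketch of that convention is shown below, with a guard that fails early when the script is started from the wrong directory. The project name demo_project and the contents of gen01.csv are hypothetical, and setwd() stands in for opening the RStudio Project, which sets the working directory to the project root.

```r
# Hypothetical project layout under a gitrepositories directory.
project_root <- file.path(tempdir(), "gitrepositories", "demo_project")
dir.create(file.path(project_root, "data"), recursive = TRUE,
           showWarnings = FALSE)
write.csv(data.frame(id = 1:2, value = c(10, 20)),
          file.path(project_root, "data", "gen01.csv"),
          row.names = FALSE)

# setwd() returns the previous working directory, saved here for later.
old_wd <- setwd(project_root)

# Stop with a readable message if the file is not where it is expected.
data_file <- "./data/gen01.csv"
if (!file.exists(data_file)) {
  stop("Cannot find ", data_file,
       "; run this script from the project root, not ", getwd())
}
gen01 <- read.csv(data_file)

setwd(old_wd)
```

The expected starting directory is exactly the kind of detail a runbook would record for other users.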

R - Create filepath relative to specific file

Here is a function I wrote that does that. You need to specify the directory (main_dir) that contains both files.

rel_path <- function(target_file, ref_file, main_dir){
    ## Returns the path of a file relative to a reference file within a given directory.
    # Args:
    #   target_file: name of the file for which the relative path is to be returned.
    #   ref_file: name of the reference file.
    #   main_dir: path of the directory that encompasses both the target and the reference files.
    #
    # Returns:
    #   String with the relative file path.
    #

    ## Both names are treated as regular expressions by list.files;
    ## if several files match, the first match is used.
    target_path <- list.files(path = main_dir,
                              pattern = target_file,
                              recursive = TRUE)

    ref_path <- list.files(path = main_dir,
                           pattern = ref_file,
                           recursive = TRUE)

    if (length(target_path) == 0 || length(ref_path) == 0){
        stop("target_file and ref_file must both exist under main_dir")
    }

    ## Split paths into components to check if they have common sub directories
    ref_str <- strsplit(ref_path[1], "/|\\\\")[[1]]
    tar_str <- strsplit(target_path[1], "/|\\\\")[[1]]

    ## Keep only the directory components (drop the file names)
    ref_dir <- ref_str[-length(ref_str)]
    tar_dir <- tar_str[-length(tar_str)]

    ## Count the leading directories that both paths share
    max_len <- min(length(tar_dir), length(ref_dir))
    common <- 0
    while (common < max_len && ref_dir[common + 1] == tar_dir[common + 1]){
        common <- common + 1
    }

    ## Go up once for each remaining reference directory, then down to the target
    up <- rep("..", length(ref_dir) - common)
    down <- tar_str[seq_along(tar_str) > common]
    rel_path <- paste(c(up, down), collapse = "/")

    return(rel_path)
}

This should then work, provided that both files are under the directory folder1.

> rel_path(target_file= 'Template.docx', 
ref_file = 'Script.R', main_dir = 'folder1')

[1] "../Templates/Template.docx"

