Split a file path into folder names vector
You can do it with a simple recursive function, by terminating when the dirname
doesn't change:
split_path <- function(x) if (dirname(x)==x) x else c(basename(x),split_path(dirname(x)))
split_path("/home/foo/stats/index.html")
[1] "index.html" "stats" "foo" "home" "/"
split_path("C:\\Windows\\System32")
[1] "System32" "Windows" "C:/"
split_path("~")
[1] "James" "Users" "C:/"
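For comparison, the same "stop when the dirname no longer changes" idea can be sketched in Python. This is only an illustrative analogue of the R function above, not part of the original answer; it assumes POSIX-style paths and uses posixpath so the result does not depend on the host OS.

```python
import posixpath


def split_path(x):
    """Split a POSIX path into its components, deepest component first."""
    parent = posixpath.dirname(x)
    if parent == x:
        # dirname no longer changes: we have reached the root (or an empty string)
        return [x]
    return [posixpath.basename(x)] + split_path(parent)


print(split_path("/home/foo/stats/index.html"))
# ['index.html', 'stats', 'foo', 'home', '/']
```

As in the R version, the components come out deepest first; reverse the list if you want them in root-to-leaf order.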
R Split character vector of file paths into list by parent directory?
You could combine split and dirname:
path <- c("/home/username/data/dir/GCZ98/GCZ98_1998_12_16.asc.gz",
"/home/username/data/dir/GCZ98/GCZ98_1998_12_20.asc.gz",
"/home/username/data/dir/GCZ99/GCZ99_1999_12_21.asc.gz",
"/home/username/data/dir/GCZ99/GCZ99_1999_12_23.asc.gz",
"/home/username/data/dir/GCZ99/GCZ99_1999_12_27.asc.gz",
"/home/username/data/dir/GCZ99/GCZ99_1999_12_28.asc.gz")
## split by basedir
split(path, dirname(path))
# $`/home/username/data/dir/GCZ98`
# [1] "/home/username/data/dir/GCZ98/GCZ98_1998_12_16.asc.gz" "/home/username/data/dir/GCZ98/GCZ98_1998_12_20.asc.gz"
#
# $`/home/username/data/dir/GCZ99`
# [1] "/home/username/data/dir/GCZ99/GCZ99_1999_12_21.asc.gz" "/home/username/data/dir/GCZ99/GCZ99_1999_12_23.asc.gz" "/home/username/data/dir/GCZ99/GCZ99_1999_12_27.asc.gz"
# [4] "/home/username/data/dir/GCZ99/GCZ99_1999_12_28.asc.gz"
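A rough Python counterpart of the same grouping, shown only for comparison: each path is filed under its parent directory. The helper name split_by_parent is hypothetical, and posixpath is assumed because the example paths are POSIX-style.

```python
import posixpath
from collections import defaultdict


def split_by_parent(paths):
    """Group file paths by their parent directory (hypothetical helper)."""
    groups = defaultdict(list)
    for p in paths:
        groups[posixpath.dirname(p)].append(p)
    return dict(groups)


paths = [
    "/home/username/data/dir/GCZ98/GCZ98_1998_12_16.asc.gz",
    "/home/username/data/dir/GCZ98/GCZ98_1998_12_20.asc.gz",
    "/home/username/data/dir/GCZ99/GCZ99_1999_12_21.asc.gz",
]
groups = split_by_parent(paths)
for parent, files in groups.items():
    print(parent, len(files))
```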
How to split a dos path into its components in Python
I've been bitten loads of times by people writing their own path fiddling functions and getting it wrong. Spaces, slashes, backslashes, colons -- the possibilities for confusion are not endless, but mistakes are easily made anyway. So I'm a stickler for the use of os.path, and recommend it on that basis.
(However, the path to virtue is not the one most easily taken, and many people when finding this are tempted to take a slippery path straight to damnation. They won't realise until one day everything falls to pieces, and they -- or, more likely, somebody else -- has to work out why everything has gone wrong, and it turns out somebody made a filename that mixes slashes and backslashes -- and some person suggests that the answer is "not to do that". Don't be any of these people. Except for the one who mixed up slashes and backslashes -- you could be them if you like.)
You can get the drive and path+file like this:
import os

drive, path_and_file = os.path.splitdrive(path)
Get the path and the file:
path, file = os.path.split(path_and_file)
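To make the two steps concrete, here is a small sketch on a made-up DOS path. It assumes ntpath, the Windows flavour of os.path, so that it gives the same result on any operating system; on Windows itself os.path behaves the same way. The example path is hypothetical.

```python
import ntpath  # Windows implementation of os.path, importable on any OS

path = r"C:\Windows\System32\cmd.exe"  # hypothetical example path

drive, path_and_file = ntpath.splitdrive(path)
folder, file = ntpath.split(path_and_file)

print(drive)   # C:
print(folder)  # \Windows\System32
print(file)    # cmd.exe
```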
Getting the individual folder names is not especially convenient, but it is the sort of honest middling discomfort that heightens the pleasure of later finding something that actually works well:
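folders = []
while True:
    path, folder = os.path.split(path)
    if folder != "":
        folders.append(folder)
    else:
        # Nothing left to split off: keep the root, if any, then stop.
        # Breaking here also ends the loop for relative paths.
        if path != "":
            folders.append(path)
        break
folders.reverse()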
(This leaves a "\" at the start of folders if the path was originally absolute. You could lose a bit of code if you didn't want that.)
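If you are on Python 3, the standard pathlib module offers a shorter route to the same list. This swaps in a different technique from the os.path loop above, and is only a sketch: it assumes PureWindowsPath so that DOS paths are parsed the same way on every platform, and path_parts is a hypothetical helper name.

```python
from pathlib import PureWindowsPath


def path_parts(path):
    """Return the components of a DOS/Windows path as a list (hypothetical helper)."""
    return list(PureWindowsPath(path).parts)


print(path_parts(r"C:\Windows\System32\cmd.exe"))
# ['C:\\', 'Windows', 'System32', 'cmd.exe']

# Mixed slashes and backslashes are both treated as separators
print(path_parts("C:/Windows\\System32"))
# ['C:\\', 'Windows', 'System32']
```

Note that here the drive and root come back together as the first element, rather than being split off separately as with os.path.splitdrive.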
How to split a path into separate strings?
Indeed, there is path_iterator. But if you want elegance:
#include <boost/filesystem.hpp>
#include <iostream>

int main() {
    // Iterating a path yields one component per step
    for (auto& part : boost::filesystem::path("/tmp/foo.txt"))
        std::cout << part << "\n";
}
Prints:
"/"
"tmp"
"foo.txt"
And
for(auto& part : boost::filesystem::path("/tmp/foo.txt"))
    std::cout << part.c_str() << "\n";
prints
/
tmp
foo.txt
No need to worry about the moving parts.
What would be the equivalent to str.split(/path/to/file, os.sep) from Python in R
See strsplit
strsplit("/path/to/file", .Platform$file.sep)
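For reference, this is what the Python expression from the question produces. The sketch hard-codes "/" as the separator instead of os.sep so the output is the same on every platform (an assumption made for reproducibility; os.sep is "\\" on Windows).

```python
# Split on the POSIX separator explicitly rather than os.sep
parts = "/path/to/file".split("/")
print(parts)
# ['', 'path', 'to', 'file']

# Drop the empty string produced by the leading separator, if it is not wanted
components = [p for p in parts if p]
print(components)
# ['path', 'to', 'file']
```

The leading empty string comes from the separator at the start of the path; strsplit should give a matching empty first element, wrapped in a list with one entry per input string.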
Get the first element from the file path
We can use cSplit from splitstackshape:
splitstackshape::cSplit(df, "path", "\\\\+", fixed = FALSE)
#   path_1                    path_2     path_3              path_4                path_5       path_6
#1:     E: My Network Places.old.dat       <NA>                <NA>                  <NA>         <NA>
#2:     E:              pagefile.sys       <NA>                <NA>                  <NA>         <NA>
#3:     E:   Press_Dly_Diff_G_91.rbc       <NA>                <NA>                  <NA>         <NA>
#4:     E:       TV_Press_Dly_Diff_A Retrospect               Dantz                  <NA>         <NA>
#5:     E:       TV_Press_Dly_Diff_A Retrospect TV_Press_Dly_Diff_A 1-TV_Press_Dly_Diff_A AA000083.rdb
#6:     E:       TV_Press_Dly_Diff_A Retrospect TV_Press_Dly_Diff_A 1-TV_Press_Dly_Diff_A AA000561.rdb
Or, if you already know how many columns the data will expand into, we can also use separate from tidyr:
tidyr::separate(df, path, into = paste0('path', 1:6), sep = "\\\\+", fill = 'right')
data
df <- data.frame(path = x, stringsAsFactors = FALSE)