R List Files with Multiple Conditions

R list files with multiple conditions

Filter(function(x) grepl("USD", x), file.ls)

Alternatively, you could construct a regular expression for pattern that matches only filenames containing both strings, but that's a wizard's game.
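As a rough sketch of both routes, assuming a made-up file.ls vector and "2019" as the second string (neither comes from the original question): Filter can test two conditions at once, while the single-regex route has to spell out both orders with alternation.

```r
# Hypothetical file names standing in for the result of list.files()
file.ls <- c("rates_USD_2019.csv", "rates_EUR_2019.csv",
             "notes_USD.txt", "rates_USD_2020.csv")

# Route 1: keep names that contain both "USD" and "2019"
both_filter <- Filter(
  function(x) grepl("USD", x, fixed = TRUE) && grepl("2019", x, fixed = TRUE),
  file.ls
)

# Route 2: one regular expression covering either order of the two strings
both_regex <- grep("USD.*2019|2019.*USD", file.ls, value = TRUE)

print(both_filter)
print(both_regex)
```

The alternation needs one branch per ordering, so it grows quickly with more strings; filtering after listing is usually easier to read.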

List files with multiple conditions

Try this:

list.files(path, recursive = TRUE, full.names = FALSE,
           pattern = "B0[2348]\\.jp2$")

The pattern accepts a regular expression.
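To see what such a pattern accepts without touching the disk, here is a small sketch that runs the same regular expression through grep on a hypothetical vector of file names (the names are invented for illustration). It also shows why the dot before jp2 is worth escaping.

```r
# Hypothetical file names; list.files() would return something similar
files <- c("B02.jp2", "B05.jp2", "B08.jp2", "B02.jp2.aux.xml", "B03_jp2")

# Escaped dot: only a literal ".jp2" at the end of the name matches
strict <- grep("B0[2348]\\.jp2$", files, value = TRUE)

# Unescaped dot: "." matches any character, so "B03_jp2" slips through
loose <- grep("B0[2348].jp2$", files, value = TRUE)

print(strict)
print(loose)
```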

List files with multiple conditions part2

For other users, and building on @docendo discimus's answer, here is how to combine several conditions when listing files, as in my case. My conditions depend on the characters that follow the letter B, so:

pattern="B(THE CONDITIONS GO HERE)\\.jp2$"

First, we set the condition to import the files B02_10m, B03_10m, B04_10m and B08_10m:

pattern="B(FIRST CONDITION|SECOND CONDITION)\\.jp2$"
pattern="B((0[2348]_10m)|SECOND CONDITION)\\.jp2$"

Second, we import the files B05_20m, B06_20m, B07_20m, B8A_20m, B11_20m and B12_20m. Here we have to combine several sub-conditions, because the band codes no longer share one form: 05 to 07 start with 0, 11 and 12 start with 1, and 8A contains a letter.

First we write the code for 5, 6 and 7:

pattern="B((0[2348]_10m)|((0[567])_20m))\\.jp2$"

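Then we add the code for bands 11 and 12. The alternatives need their own pair of parentheses so that _20m applies to all of them, not only to the last one:

pattern="B((0[2348]_10m)|(((0[567])|(1[12]))_20m))\\.jp2$"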

Then, the code for 8A:

pattern="B((0[2348]_10m)|(((0[567])|(1[12])|(8A))_20m))\\.jp2$"
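A quick way to check a combined pattern like this is to run it through grep on a handful of names before handing it to list.files. The sketch below uses invented file names with a hypothetical "T_" prefix, and writes the dot before jp2 as an escaped literal.

```r
# Hypothetical file names: wanted bands first, then names that should be rejected
files <- c("T_B02_10m.jp2", "T_B03_10m.jp2", "T_B04_10m.jp2", "T_B08_10m.jp2",
           "T_B05_20m.jp2", "T_B8A_20m.jp2", "T_B11_20m.jp2", "T_B12_20m.jp2",
           "T_B02_20m.jp2", "T_B01_60m.jp2", "T_B05_20m.jp2.aux.xml")

# 10 m bands 02, 03, 04, 08 OR 20 m bands 05-07, 11, 12, 8A
band_pattern <- "B((0[2348]_10m)|(((0[567])|(1[12])|(8A))_20m))\\.jp2$"

selected <- grep(band_pattern, files, value = TRUE)
print(selected)
```

Only the first eight names are kept: B02_20m and B01_60m fail both conditions, and the .aux.xml name does not end in .jp2.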

Hope it's clear

List files with specific word and file extension

The pattern argument of list.files takes a regular expression, not a shell-style wildcard, so a glob such as

list.files("/../directory", pattern = "*_2000*.bil")

will not do what you expect. Write it as a regular expression instead:

list.files("/../directory", pattern = ".*_2000.*\\.bil$")

Here .* stands for "any run of characters", \\. is a literal dot, and $ anchors the extension to the end of the file name.
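If the wildcard form is easier to think in, glob2rx() from the utils package converts a glob into a regular expression that pattern will accept. A minimal sketch, using invented file names in place of a real directory:

```r
# Hypothetical file names standing in for a directory listing
files <- c("prec_2000_01.bil", "prec_2000_01.hdr",
           "prec_2001_01.bil", "tmax_2000_12.bil")

# Convert the shell-style wildcard into a regular expression
rx <- utils::glob2rx("*_2000*.bil")

by_glob  <- grep(rx, files, value = TRUE)
by_regex <- grep("_2000.*\\.bil$", files, value = TRUE)

print(by_glob)
print(by_regex)
```

Both calls select the same two .bil files from the year 2000; the converted expression can be passed straight to list.files(pattern = rx).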

r find matching file for multiple condition

Uwe's comment might be simplest for you. If it can be in any order, then you need to be a little more creative.

Since I don't have your files or such, I'll create some samples:

# filelisting <- list.files(path=...) # no pattern
filelisting <- c(
"Rob travel v1.2.docx",
"the v1.2 version of travel for Rob.xlsx",
"the v1.3 version of travel for Rob.xlsx",
"the v1.2 version of travel for Carol.xlsx",
"something else entirely.pptx",
"C_Mu.R",
"My travel v1.2.txt"
)
c1 <- "Rob"
c2 <- "travel"
c3 <- "v1.2"

If you need all three but have to allow for different orders, then simply chaining them with ".*"

grepl(paste(c1,c2,c3,sep=".*"), filelisting)
# [1] TRUE FALSE FALSE FALSE FALSE FALSE FALSE

fails because it misses the second file.

Here's a thought:

sapply(c(c1,c2,c3), grepl, filelisting)
#        Rob travel  v1.2
# [1,]  TRUE   TRUE  TRUE
# [2,]  TRUE   TRUE  TRUE
# [3,]  TRUE   TRUE FALSE
# [4,] FALSE   TRUE  TRUE
# [5,] FALSE  FALSE FALSE
# [6,] FALSE  FALSE FALSE
# [7,] FALSE   TRUE  TRUE

From here, you can simply look for rows where all values are true, such as

apply(sapply(c(c1,c2,c3), grepl, filelisting), 1, all)
# [1] TRUE TRUE FALSE FALSE FALSE FALSE FALSE

(Use that logical vector to index filelisting.)
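Put together, the indexing step might look like the sketch below. It reuses the sample names from above; the variable names hits, keep and matched are only illustrative.

```r
filelisting <- c(
  "Rob travel v1.2.docx",
  "the v1.2 version of travel for Rob.xlsx",
  "the v1.3 version of travel for Rob.xlsx",
  "the v1.2 version of travel for Carol.xlsx",
  "something else entirely.pptx",
  "C_Mu.R",
  "My travel v1.2.txt"
)
conditions <- c("Rob", "travel", "v1.2")

# One column per condition, one row per file name
hits <- sapply(conditions, grepl, filelisting)

# A file qualifies only if every condition matched
keep <- apply(hits, 1, all)

matched <- filelisting[keep]
print(matched)
```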

You can generalize this a little if you have many more than three conditions and/or the number of conditions can change.

allcs <- c("Rob", "travel", "v1.2", "docx")
apply(sapply(allcs, grepl, filelisting), 1, all)
# [1] TRUE FALSE FALSE FALSE FALSE FALSE FALSE

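Within each string you can use real regex syntax, which also means you need to escape regex metacharacters when you want them matched literally (strictly speaking, the . in "v1.2" matches any character):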

allcs <- c("Rob", "travel", "v1.2", "xlsx|docx")
apply(sapply(allcs, grepl, filelisting), 1, all)
# [1] TRUE TRUE FALSE FALSE FALSE FALSE FALSE
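To illustrate the escaping point, here is a sketch comparing loose and escaped conditions on a few invented names. The helper match_all is hypothetical and simply combines the per-condition results with &, as an alternative to the apply/sapply form.

```r
# Hypothetical names chosen to trip up the unescaped patterns
candidates <- c("Rob travel v1.2.docx",
                "Rob travel v122.docx",
                "Rob travel v1.2.docx.bak")

loose  <- c("Rob", "travel", "v1.2", "xlsx|docx")
strict <- c("Rob", "travel", "v1\\.2", "\\.(xlsx|docx)$")

# TRUE where every pattern matches the corresponding element of x
match_all <- function(patterns, x) {
  Reduce(`&`, lapply(patterns, grepl, x = x))
}

loose_hits  <- match_all(loose, candidates)
strict_hits <- match_all(strict, candidates)

print(loose_hits)
print(strict_hits)
```

The loose conditions accept all three names, because "." matches the middle 2 in v122 and "docx" is found inside ".docx.bak". The escaped, anchored conditions accept only the first.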

