Replace column values with NA based on a different column or row position with tidyverse
Because 'badBands' has length greater than 1, use %in% instead of ==. Also, case_when is type-sensitive (dplyr releases before 1.1.0 raise an error for a bare logical NA here), so it is better to supply the correctly typed NA, i.e. NA_real_ for a double column.
myData %>%
mutate(reflectanceSfp = case_when(bandNumber %in% badBands ~ NA_real_,
TRUE ~ reflectanceSfp))
# A tibble: 6 x 5
# reflectanceSfp wavelength bandNumber reflectanceDT wavelength1
# <dbl> <dbl> <dbl> <dbl> <dbl>
#1 NA 376. 1 0.000148 377.
#2 NA 381. 2 0.00589 382.
#3 0.0158 386. 3 0.0101 387.
#4 0.0200 391. 4 0.0110 392.
#5 0.0240 396. 5 0.0117 397.
#6 NA 401. 6 0.0149 402.
Alternatively, replace is simpler here: we only specify the replacement value for the elements that satisfy the logical condition, and no NA type matching is needed.
myData %>%
mutate(reflectanceSfp = replace(reflectanceSfp,
bandNumber %in% badBands, NA))
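To see why %in% matters, here is a minimal sketch on made-up data. The values in myData and badBands below are hypothetical stand-ins, not the question's data. With ==, the shorter vector is recycled and compared position by position, so some bad bands can be missed silently.

```r
library(dplyr)

# Hypothetical stand-in for the question's data
myData <- tibble(
  bandNumber     = 1:6,
  reflectanceSfp = c(0.011, 0.013, 0.0158, 0.0200, 0.0240, 0.027)
)
badBands <- c(2, 5, 6)

# `==` recycles badBands to c(2, 5, 6, 2, 5, 6) and compares element-wise,
# so band 2 is missed here
wrong <- myData$bandNumber == badBands

# `%in%` tests each band number for membership in the whole set
right <- myData$bandNumber %in% badBands

cleaned <- myData %>%
  mutate(reflectanceSfp = replace(reflectanceSfp, bandNumber %in% badBands, NA))
```

Comparing wrong and right side by side shows which bands the == version would have left untouched.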
Replace values in multiple columns with NA based on value in a different column
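Here is a solution that actually evaluates whether the variable number is 0 or 1 (previous solutions evaluated whether the variables ending with "_1" or "_2" are 1 or 0). Note that it overwrites the matched columns with the values of number, so their original values are lost.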
library(dplyr)
df %>%
  mutate(across(ends_with("_1"), ~ na_if(number, 1)),
         across(ends_with("_2"), ~ na_if(number, 0)))
# A tibble: 6 x 6
id X_1 Y_1 number X_2 Y_2
<int> <int> <int> <int> <int> <int>
1 1 NA NA 1 1 1
2 1 0 0 0 NA NA
3 2 NA NA 1 1 1
4 2 0 0 0 NA NA
5 3 NA NA 1 1 1
6 3 0 0 0 NA NA
Edit (keep original values)
df %>%
mutate(across((ends_with("_1")), ~if_else(number == 1, NA_integer_, .))) %>%
mutate(across((ends_with("_2")), ~if_else(number == 0, NA_integer_, .)))
# A tibble: 6 x 6
id X_1 Y_1 number X_2 Y_2
<int> <int> <int> <int> <int> <int>
1 1 NA NA 1 1 3
2 1 1 3 0 NA NA
3 2 NA NA 1 2 4
4 2 2 4 0 NA NA
5 3 NA NA 1 1 3
6 3 1 3 0 NA NA
Data
df <- tibble::tribble(
~id, ~X_1, ~Y_1, ~number, ~X_2, ~Y_2,
1L, 1L, 3L, 1L, 1L, 3L,
1L, 1L, 3L, 0L, 1L, 3L,
2L, 2L, 4L, 1L, 2L, 4L,
2L, 2L, 4L, 0L, 2L, 4L,
3L, 1L, 3L, 1L, 1L, 3L,
3L, 1L, 3L, 0L, 1L, 3L
)
Replacing NA from a specific column with latest non-NA value from that row in R
For a large data.frame, a vectorized solution may be more efficient than looping over rows. Get the logical index of the elements in 'col1' that are NA ('i1'), use max.col to return the column index of the first non-NA element in columns 3 to 5 ('j1'), and build a row/column index matrix ('m1') with cbind. Then fill the missing 'col1' values with the elements extracted from columns 3 to 5 using 'm1', and set those source elements to NA.
df1 <- as.data.frame(df)
i1 <- is.na(df1$col1)
j1 <- max.col(!is.na(df1[3:5]), "first")
m1 <- cbind(which(i1), j1[i1])
df1$col1[i1] <- df1[3:5][m1]
df1[3:5][m1] <- NA
-output
> df1
fruits col1 col2 col3 col4
1 apple 4 5 10 20
2 banana 100 NA NA 4
3 ananas 10 NA 5 1
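The question's input is not shown, so here is the same technique as a self-contained sketch on hypothetical data (the values below are invented for illustration). It uses only base R.

```r
# Hypothetical data: 'col1' has gaps; columns 3 to 5 hold candidate values
df1 <- data.frame(
  fruits = c("apple", "banana", "cherry"),
  col1   = c(4, NA, NA),
  col2   = c(5, 100, NA),
  col3   = c(10, NA, 7),
  col4   = c(20, 4, 1)
)

# Rows where 'col1' is missing
i1 <- is.na(df1$col1)

# For every row, position (within columns 3 to 5) of the first non-NA value
j1 <- max.col(!is.na(df1[3:5]), ties.method = "first")

# Two-column (row, column) index matrix, restricted to the rows that need filling
m1 <- cbind(which(i1), j1[i1])

# Copy the selected values into 'col1', then blank out the source cells
df1$col1[i1] <- df1[3:5][m1]
df1[3:5][m1] <- NA
```

One caveat: for a row where columns 3 to 5 are all NA, max.col still returns 1, so 'col1' simply stays NA for that row.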
Replace multiple values in a dataframe with NA based on conditions given in another dataframe in R
Here is one method: loop across the columns that start with 'col' in the first dataset ('df1'), build a single string vector by pasting together 'group', 'subgroup' and the current column name (cur_column()), and check whether those elements are %in% the pasted rows of 'df2' to get a logical vector. Use that vector in replace to set the matching elements to NA. Note that invoke() is deprecated as of purrr 1.0.0; do.call() is the base R equivalent.
library(dplyr)
library(stringr)
library(purrr)
df1 <- df1 %>%
mutate(across(starts_with('col'),
~ replace(., str_c(group, subgroup, cur_column()) %in%
invoke(str_c, c(df2, sep = '')), NA) ))
-output
df1
# A tibble: 4 x 5
col_1 col_2 col_3 group subgroup
<dbl> <dbl> <dbl> <chr> <chr>
1 1 3 5 A p
2 NA 8 NA A q
3 5 NA NA B p
4 1 7 7 B q
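Since the input data is not shown, the following sketch reproduces the idea on hypothetical data. The contents of df1 and df2, the name of the third df2 column ('column') and the helper vector keys are assumptions made for illustration. It builds the lookup keys with base paste0 instead of invoke.

```r
library(dplyr)

# Hypothetical data; the column names mirror the output above
df1 <- tibble(
  col_1    = c(1, 2, 5, 1),
  col_2    = c(3, 8, 4, 7),
  group    = c("A", "A", "B", "B"),
  subgroup = c("p", "q", "p", "q")
)

# Hypothetical lookup: one row per (group, subgroup, column) cell to blank out
df2 <- tibble(
  group    = c("A", "B"),
  subgroup = c("q", "p"),
  column   = c("col_1", "col_2")
)

# One key per lookup row, e.g. "Aqcol_1"
keys <- paste0(df2$group, df2$subgroup, df2$column)

result <- df1 %>%
  mutate(across(starts_with("col"),
                ~ replace(.x, paste0(group, subgroup, cur_column()) %in% keys, NA)))
```

Pasting without a separator can produce ambiguous keys when group or subgroup values overlap in spelling; adding a separator that cannot occur in the data on both sides avoids that.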
Conditionally replace NA with value from other rows
Your mutate won't work because it does not assign a value to a variable. It should look like mutate(value = unique(value[!is.na(value)])). That would not be my approach, though. Below I create a lookup table of the distinct non-NA values and join it onto the original dataset. valuedis should hold the values you want.
temporal <- c("Monday", "Monday", "Tuesday", "Tuesday","Wednesday", "Wednesday", "Thursday", "Thursday", "Friday", "Friday","Monday", "Monday", "Tuesday", "Tuesday","Wednesday", "Wednesday", "Thursday", "Thursday", "Friday", "Friday")
spatial <- c("North", "South","North", "South","North", "South","North", "South","North", "South", "North", "South","North", "South","North", "South","North", "South","North", "South")
value <- c(NA,2,3,4,5,6,7,NA,9,10,1,NA,3,4,5,6,7,8,9,NA)
df <- data.frame(temporal, spatial, value)  # keeps value numeric; cbind() would coerce every column to character
library(dplyr)
dfdis <- df %>%
filter(!is.na(value)) %>%
distinct(temporal,spatial,value) %>%
rename(valuedis = value)
df2 <- left_join(df,dfdis, by = c("temporal","spatial"))
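As a follow-up sketch, the looked-up column can be folded back into value with coalesce. The data below is a reduced, hypothetical version of the example, and the name filled is introduced only for illustration.

```r
library(dplyr)

# Reduced hypothetical data: each (temporal, spatial) pair has one known value
df <- data.frame(
  temporal = c("Monday", "Monday", "Monday", "Monday"),
  spatial  = c("North", "South", "North", "South"),
  value    = c(NA, 2, 1, NA)
)

# Lookup table of the distinct non-NA values per pair
dfdis <- df %>%
  filter(!is.na(value)) %>%
  distinct(temporal, spatial, value) %>%
  rename(valuedis = value)

# Join the lookup back, fill only the missing values, drop the helper column
filled <- df %>%
  left_join(dfdis, by = c("temporal", "spatial")) %>%
  mutate(value = coalesce(value, valuedis)) %>%
  select(-valuedis)
```

This assumes each temporal/spatial pair has a single distinct non-NA value. If a pair has several, the join duplicates rows, so it is worth checking the lookup table first.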
Replace values with NA in several columns
We can do this in two steps. First, loop across the columns whose names are 'VAR' followed by digits (\\d+) and replace the values whose first two characters are not AA or DD with NA. Then replace the corresponding DATE column with NA wherever the matching 'VAR1' or 'VAR2' column is NA.
library(dplyr)
library(stringr)
DF %>%
mutate(across(matches("^VAR\\d+$"),
~ replace(., !substr(., 1, 2) %in% c("AA", "DD"), NA)),
across(ends_with("DATE"),
~ replace(., is.na(get(str_remove(cur_column(), "DATE"))), NA)))
-output
# A tibble: 5 × 5
ID VAR1 VAR1DATE VAR2 VAR2DATE
<int> <chr> <chr> <chr> <chr>
1 1 AABB 2001-01-01 <NA> <NA>
2 2 AACC 2001-01-02 AACC 2001-01-02
3 3 <NA> <NA> DDCC 2001-01-03
4 4 DDAA 2001-01-04 <NA> <NA>
5 5 <NA> <NA> <NA> <NA>
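The input DF is not shown, so here is a self-contained sketch of the same two steps on hypothetical data (the values are invented). It swaps str_remove for base sub to avoid the stringr dependency; otherwise the logic follows the code above.

```r
library(dplyr)

# Hypothetical input with the same column layout as the output above
DF <- tibble(
  ID       = 1:3,
  VAR1     = c("AABB", "XXCC", "DDAA"),
  VAR1DATE = c("2001-01-01", "2001-01-02", "2001-01-03"),
  VAR2     = c("ZZ11", "AACC", "DDCC"),
  VAR2DATE = c("2001-01-01", "2001-01-02", "2001-01-03")
)

out <- DF %>%
  mutate(
    # Step 1: blank out VAR values that do not start with "AA" or "DD"
    across(matches("^VAR\\d+$"),
           ~ replace(.x, !substr(.x, 1, 2) %in% c("AA", "DD"), NA)),
    # Step 2: blank out each DATE column where its (already updated) VAR column is NA
    across(ends_with("DATE"),
           ~ replace(.x, is.na(get(sub("DATE$", "", cur_column()))), NA))
  )
```

The second across relies on mutate evaluating its arguments in order, so it sees the VAR columns as modified by the first step.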