Using dplyr to Conditionally Replace Values in a Column

Using dplyr to conditionally replace values in a column

Assuming your data frame is dat and your column is var:

dat = dat %>% mutate(candy.flag = factor(ifelse(var == "Candy", "Candy", "Non-Candy")))

Conditionally replace values in one column/row with values from another row/column using dplyr

How about this:

library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
dat <- tibble::tribble(
  ~id,    ~colours, ~tagl,   ~TAGLFT, ~grid,
  "6391", "B*/Y*",  "45820", "45820", "KL",
  "6391", "B*/Y*",  "45820", "46272", "KL",
  "8443", "B*/Y*",  "46272", "46272", "SU"
)

updated <- dat %>%
  mutate(tagl = ifelse(tagl != TAGLFT, TAGLFT, tagl))

dat %>%
  filter(tagl == TAGLFT) %>%
  select(TAGLFT, id) %>%
  right_join(updated %>% select(-id)) %>%
  select(id, colours, tagl, TAGLFT, grid)
#> Joining, by = "TAGLFT"
#> # A tibble: 3 × 5
#>   id    colours tagl  TAGLFT grid
#>   <chr> <chr>   <chr> <chr>  <chr>
#> 1 6391  B*/Y*   45820 45820  KL
#> 2 8443  B*/Y*   46272 46272  KL
#> 3 8443  B*/Y*   46272 46272  SU

Created on 2022-04-05 by the reprex package (v2.0.1)

It works in this particular instance, but you'll want to make sure it works on the full range of data in your actual application.

Conditionally replace values with NA in R

This is a case where the column is a factor. Convert it to character and it should work:

library(dplyr)
have %>%
  mutate(gender = as.character(gender),
         gender = replace(gender, gender == "I Do Not Wish to Disclose", NA))

The values in gender change because the factor gets coerced to its integer storage codes:

as.integer(factor(c("Male", "Female", "Male")))
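To make the coercion concrete, here is a small sketch on a made-up factor (the vector gender below is illustrative, not the asker's data). One common way the problem shows up is ifelse(), which drops the factor attributes and returns the underlying integer codes; converting to character first keeps the labels.

```r
# Hypothetical data: factor levels are sorted alphabetically by default,
# so Female = 1, "I Do Not Wish to Disclose" = 2, Male = 3
gender <- factor(c("Male", "Female", "I Do Not Wish to Disclose", "Male"))

# ifelse() on a factor hands back the integer codes, not the labels
codes <- ifelse(gender == "I Do Not Wish to Disclose", NA, gender)

# Converting to character first keeps the labels intact
labels <- as.character(gender)
labels <- replace(labels, labels == "I Do Not Wish to Disclose", NA)

print(codes)
print(labels)
```

Printing codes shows numbers where the labels used to be, which is the symptom described above.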

dplyr: Replace multiple values based on condition in a selection of columns

A dplyr solution:

library(dplyr)
dt %>%
  mutate(across(3:5, ~ ifelse(
    measure == "led",
    stringr::str_replace_all(as.character(.), c("2" = "X", "3" = "Y")),
    .
  )))

Result:

    measure site space qty qty.exit cf
 1:     led    4     1   4        6  3
 2:    exit    4     2   1        4  6
 3:     cfl    1     4   6        2  3
 4:  linear    3     4   1        3  5
 5:     cfl    5     1   6        1  6
 6:    exit    4     3   2        6  4
 7:    exit    5     1   4        2  5
 8:    exit    1     4   3        6  4
 9:  linear    3     1   5        4  1
10:     led    4     1   1        1  1
11:    exit    5     4   3        5  2
12:     cfl    4     2   4        5  5
13:     led    4     X   Y        Y  4
...

tidyverse and dplyr: Conditional replacement of values in a column based on other column

You are better off using ifelse here:

library(dplyr)
tb1 %>% mutate(A4 = ifelse(Total == 63, A3 -1, A3))

As for why replace does not work, check the source code of replace:

replace
function (x, list, values)
{
    x[list] <- values
    x
}

It assigns values to the elements of x selected by list.

When you use:

tb1 %>% mutate(A4 = replace(A3, Total == 63, A3 - 1))

values has length length(tb1$A3), but only sum(tb1$Total == 63) elements are selected for replacement. The two lengths do not match, so R takes values from the start of the vector rather than from the matching rows, and when the longer length is not a multiple of the shorter you also get the warning "number of items to replace is not a multiple of replacement length".

If you want to make replace work, you can try:

tb1 %>% mutate(A4 = replace(A3, Total == 63, A3[Total == 63] - 1))

but again, as I mentioned, it is easier to just use ifelse here.
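The length mismatch is easier to see on a tiny example. The vectors below are hypothetical stand-ins for tb1$A3 and tb1$Total, and plain vectors are used instead of mutate to keep the sketch short; the same logic applies inside mutate.

```r
# Hypothetical stand-ins for tb1$A3 and tb1$Total
A3    <- c(10, 20, 30, 40, 50)
Total <- c(63, 1, 63, 2, 3)

# Full-length replacement: five values are offered for two slots, so R warns
# and uses only the first two values (9 and 19), which is not what we want
bad <- suppressWarnings(replace(A3, Total == 63, A3 - 1))

# Subsetting the replacement with the same condition lines the values up
good <- replace(A3, Total == 63, A3[Total == 63] - 1)

# ifelse() works element by element and gives the same answer
same <- ifelse(Total == 63, A3 - 1, A3)

print(bad)   # 9 20 19 40 50
print(good)  # 9 20 29 40 50
```

Note that the third element of bad is 19 rather than 29: the warning is not cosmetic, the result is silently wrong.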

Conditionally replace values in one column with values from another column using dplyr

With replace, the replacement must have as many elements as are being replaced, so we also need to subset Other with the same logical expression:

data %>%
  mutate(X25 = replace(X25, X25 == "Other", Other[X25 == "Other"]))

Another option would be case_when:

data %>%
  mutate(X25 = case_when(X25 == "Other" ~ Other,
                         TRUE ~ X25))

Or ifelse:

data %>%
  mutate(X25 = ifelse(X25 == "Other", Other, X25))
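As a quick check that the three variants agree, here is a sketch on made-up data (the tibble and its values are assumptions for illustration; only the column names X25 and Other come from the question).

```r
library(dplyr)

# Hypothetical data: X25 holds a category, Other holds the free-text answer
data <- tibble::tibble(
  X25   = c("Red", "Other", "Blue", "Other"),
  Other = c(NA, "Teal", NA, "Plum")
)

via_replace <- data %>%
  mutate(X25 = replace(X25, X25 == "Other", Other[X25 == "Other"]))

via_case_when <- data %>%
  mutate(X25 = case_when(X25 == "Other" ~ Other,
                         TRUE ~ X25))

via_ifelse <- data %>%
  mutate(X25 = ifelse(X25 == "Other", Other, X25))

print(via_replace$X25)
```

The sample data has no NA in X25 itself. With missing values in X25 the three versions are not guaranteed to behave identically, so that case deserves its own check.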

Conditionally replace values in rows using dplyr

Your question makes this unclear, but if you have some default value that you always want to use to replace a missing value (e.g., if 1994 is your baseline), then I would recommend that you first generate those defaults:

defaultValues <-
  df %>%
  filter(year == 1994) %>%
  select(groups
         , default_var1 = var1
         , default_var2 = var2)

Then, use left_join to merge on the groups. That way, each row will now also have a default. You can then use coalesce to pick the first non-NA value -- which will be the default if and only if the value is missing. End by cleaning away the default values.

df %>%
  left_join(defaultValues) %>%
  mutate(var1 = coalesce(var1, default_var1)
         , var2 = coalesce(var2, default_var2)) %>%
  select(-starts_with("default"))

If your defaults are more complex, you would just need to construct them to match your desired behavior. For example, if you want it to fill in the value from two years prior, use:

complex_defaultValues <-
  df %>%
  mutate(year = year + 2) %>%
  rename(default_var1 = var1
         , default_var2 = var2)

Then join on both year and groups, and the rows will align correctly. Note that if the value from two years earlier is itself missing, the result will still be missing after coalesce, so you may need to account for missing values in your defaults as well.
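The two-year lookback can be sketched end to end as follows. The data frame is invented for illustration, and the explicit by argument is an assumption about which columns should match; the column names follow the snippets above.

```r
library(dplyr)

# Hypothetical data: one group, yearly values, with gaps in 1995 and 1996
df <- tibble::tibble(
  groups = c("a", "a", "a"),
  year   = c(1994, 1995, 1996),
  var1   = c(1, 2, NA),
  var2   = c(10, NA, NA)
)

# Shift each row two years forward so it lines up with the row it should fill
complex_defaultValues <- df %>%
  mutate(year = year + 2) %>%
  rename(default_var1 = var1,
         default_var2 = var2)

filled <- df %>%
  left_join(complex_defaultValues, by = c("groups", "year")) %>%
  mutate(var1 = coalesce(var1, default_var1),
         var2 = coalesce(var2, default_var2)) %>%
  select(-starts_with("default"))

print(filled)
```

Here the 1996 row picks up its values from 1994, while var2 for 1995 stays NA because there is no 1993 row to borrow from.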

Finally, if you just want to propagate the last non-NA value forward (instead of trying to go back two years, or always using the same default), you can use fill from tidyr:

library(tidyr)
df %>%
  group_by(groups) %>%
  fill(var1, var2)

This automatically fills downward within each group, so make sure your data are sorted the way you want first.
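A minimal sketch of the fill approach, on invented data, with an explicit arrange step so the fill direction is predictable (the arrange and ungroup calls are additions for the example, not part of the snippet above):

```r
library(dplyr)
library(tidyr)

# Hypothetical data: two groups, with gaps at different positions
df <- tibble::tibble(
  groups = c("a", "a", "a", "b", "b"),
  year   = c(1994, 1995, 1996, 1994, 1995),
  var1   = c(1, NA, NA, NA, 5)
)

filled <- df %>%
  arrange(groups, year) %>%   # sort first: fill() works in row order
  group_by(groups) %>%
  fill(var1) %>%              # default direction is "down"
  ungroup()

print(filled$var1)
```

The leading NA in group b is left alone, because within that group there is no earlier value to carry forward.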

Conditionally replace all values in a column in R

You can set eh1 to 0 wherever eh1 is 1 and the row comes after the first occurrence of 1 in eh2.

library(dplyr)
mydata %>%
  mutate(eh1 = replace(eh1, row_number() > which(eh2 == 1)[1] & eh1 == 1, 0))
# Equivalent alternatives:
#mutate(eh1 = replace(eh1, row_number() > which.max(eh2) & eh1 == 1, 0))
#mutate(eh1 = replace(eh1, lag(cummax(eh2) > 0) & eh1 == 1, 0))
#mutate(eh1 = replace(eh1, lag(cumsum(eh2) > 0) & eh1 == 1, 0))

# eh1 eh2
#1 1 0
#2 1 0
#3 1 1
#4 0 0
#5 0 0
#6 0 0
#7 0 0
#8 0 0

The same can be written in base R:

transform(mydata, eh1 = replace(eh1, seq_along(eh2) > which.max(eh2) & eh1 == 1, 0))
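Since the input data is not shown, here is a reproducible sketch with a made-up mydata (the values are assumptions, chosen so that the first 1 in eh2 falls on row 3). It compares the row-number version with a cumulative one; default = FALSE is added to lag so the first row never produces an NA condition.

```r
library(dplyr)

# Hypothetical input, chosen so the first 1 in eh2 is on row 3
mydata <- data.frame(
  eh1 = c(1, 1, 1, 1, 0, 1, 1, 0),
  eh2 = c(0, 0, 1, 0, 0, 0, 0, 0)
)

# Row-number version: zero out eh1 == 1 strictly after the first eh2 == 1
by_position <- mydata %>%
  mutate(eh1 = replace(eh1, row_number() > which(eh2 == 1)[1] & eh1 == 1, 0))

# Cumulative version: lag() shifts the "a 1 has been seen in eh2" flag down
# one row, so the row holding the first 1 is itself left alone
by_cummax <- mydata %>%
  mutate(eh1 = replace(eh1, lag(cummax(eh2) > 0, default = FALSE) & eh1 == 1, 0))

print(by_position)
```

Both versions keep the first three rows of eh1 at 1 and set every later 1 to 0.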

