Using dplyr to conditionally replace values in a column
Assuming your data frame is dat and your column is var:
dat = dat %>% mutate(candy.flag = factor(ifelse(var == "Candy", "Candy", "Non-Candy")))
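A minimal sketch of how this behaves on a small example. The values in var are made up for illustration; only the names dat, var and candy.flag come from the answer.

```r
library(dplyr)

# Hypothetical data: only the column name `var` is taken from the answer.
dat <- data.frame(var = c("Candy", "Fruit", "Candy", "Nuts"),
                  stringsAsFactors = FALSE)

# Label every row by whether var equals "Candy", then store the label as a factor.
dat <- dat %>%
  mutate(candy.flag = factor(ifelse(var == "Candy", "Candy", "Non-Candy")))

as.character(dat$candy.flag)
```

Note that a missing value in var gives NA in candy.flag, because the comparison var == "Candy" is itself NA for that row.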
Conditionally replace values in one column/row with values from another row/column using dplyr
How about this:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
dat <- tibble::tribble(~id, ~colours, ~tagl, ~TAGLFT, ~grid,
"6391", "B*/Y*", "45820", "45820", "KL",
"6391", "B*/Y*", "45820", "46272", "KL",
"8443", "B*/Y*", "46272", "46272", "SU")
updated <- dat %>%
mutate(tagl = ifelse(tagl != TAGLFT, TAGLFT, tagl))
dat %>%
filter(tagl == TAGLFT) %>%
select(TAGLFT, id) %>%
right_join(updated %>% select(-id)) %>%
select(id, colours, tagl, TAGLFT, grid)
#> Joining, by = "TAGLFT"
#> # A tibble: 3 × 5
#> id colours tagl TAGLFT grid
#> <chr> <chr> <chr> <chr> <chr>
#> 1 6391 B*/Y* 45820 45820 KL
#> 2 8443 B*/Y* 46272 46272 KL
#> 3 8443 B*/Y* 46272 46272 SU
Created on 2022-04-05 by the reprex package (v2.0.1)
It works in this particular instance, but you'll want to make sure it also works on the full range of data that you have in your actual application.
Conditionally replace values with NA in R
This is a case where the column is a factor. Convert it to character and it should work:
library(dplyr)
have %>%
mutate(gender = as.character(gender),
gender = replace(gender, gender == "I Do Not Wish to Disclose", NA))
The values in gender change because the factor gets coerced to its integer storage values:
as.integer(factor(c("Male", "Female", "Male")))
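To see the coercion on a small example, here is a sketch with made-up data. The names have and gender follow the answer; the rows themselves are an assumption.

```r
library(dplyr)

# Hypothetical data: `have` and `gender` are the names used in the answer.
have <- data.frame(
  gender = factor(c("Male", "Female", "I Do Not Wish to Disclose", "Male"))
)

# Factor levels are sorted, and each value is stored as the position of its level.
levels(have$gender)
as.integer(have$gender)

# Converting to character first keeps the labels, so only the target value becomes NA.
fixed <- have %>%
  mutate(gender = as.character(gender),
         gender = replace(gender, gender == "I Do Not Wish to Disclose", NA))

fixed$gender
```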
dplyr: Replace multiple values based on condition in a selection of columns
A dplyr solution:
library(dplyr)
dt %>%
mutate(across(3:5, ~ ifelse(measure == "led", stringr::str_replace_all(
as.character(.),
c("2" = "X", "3" = "Y")
), .)))
Result:
measure site space qty qty.exit cf
1: led 4 1 4 6 3
2: exit 4 2 1 4 6
3: cfl 1 4 6 2 3
4: linear 3 4 1 3 5
5: cfl 5 1 6 1 6
6: exit 4 3 2 6 4
7: exit 5 1 4 2 5
8: exit 1 4 3 6 4
9: linear 3 1 5 4 1
10: led 4 1 1 1 1
11: exit 5 4 3 5 2
12: cfl 4 2 4 5 5
13: led 4 X Y Y 4
...
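A smaller sketch of the same pattern, using a made-up three-row data frame with the same column names (the values are hypothetical and do not reproduce the result above):

```r
library(dplyr)

# Hypothetical data with the same column layout as `dt`.
dt <- data.frame(
  measure  = c("led", "exit", "led"),
  site     = c(4, 4, 4),
  space    = c(2, 2, 1),
  qty      = c(3, 1, 4),
  qty.exit = c(3, 4, 6),
  cf       = c(4, 6, 3),
  stringsAsFactors = FALSE
)

# Columns 3 to 5 (space, qty, qty.exit) are recoded only in rows where measure is "led".
out <- dt %>%
  mutate(across(3:5, ~ ifelse(measure == "led",
                              stringr::str_replace_all(as.character(.),
                                                       c("2" = "X", "3" = "Y")),
                              .)))

out$space
out$qty
```

The affected columns become character, since they now mix letters and numbers. Also note that str_replace_all replaces every matching digit, so a value such as 12 would become "1X".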
tidyverse and dplyr: Conditional replacement of values in a column based on other column
You are better off using ifelse here:
library(dplyr)
tb1 %>% mutate(A4 = ifelse(Total == 63, A3 -1, A3))
As for why replace does not work, look at the source code of replace:
replace
function (x, list, values)
{
x[list] <- values
x
}
It assigns values to x at the positions selected by list.
When you use:
tb1 %>% mutate(A4 = replace(A3, Total == 63, A3-1))
values has length length(tb1$A3), but list selects only sum(tb1$Total == 63) positions. The two do not match, hence the warning "number of items to replace is not a multiple of replacement length": R takes only as many elements from the start of values as there are positions to fill, and those are not necessarily the ones that belong to the matching rows.
If you want to make replace work, you can try:
tb1 %>% mutate(A4 = replace(A3, Total == 63, A3[Total == 63] - 1))
but, as mentioned, it is easier to just use ifelse here.
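A short sketch comparing the two working forms on made-up data. The column names A3 and Total follow the answer; the values are hypothetical.

```r
library(dplyr)

# Hypothetical data: two of the four rows have Total == 63.
tb1 <- data.frame(A3 = c(10, 20, 30, 40), Total = c(63, 50, 63, 70))

# Subsetting the replacement values with the same condition keeps the lengths
# equal, so each matching row receives its own A3 - 1.
with_replace <- tb1 %>%
  mutate(A4 = replace(A3, Total == 63, A3[Total == 63] - 1))

# ifelse works row by row, so no subsetting is needed.
with_ifelse <- tb1 %>%
  mutate(A4 = ifelse(Total == 63, A3 - 1, A3))

with_replace$A4
with_ifelse$A4
```

Both versions give the same A4 column here; ifelse simply avoids having to repeat the condition.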
Conditionally replace values in one column with values from another column using dplyr
With replace, the lengths should be the same, so we need to subset Other with the same logical expression as well:
data %>%
mutate(X25 = replace(X25, X25 == "Other", Other[X25=="Other"]))
Another option would be case_when
data %>%
mutate(X25 = case_when(X25=="Other"~ Other,
TRUE ~ X25))
Or ifelse
data %>%
mutate(X25 = ifelse(X25 == "Other", Other, X25))
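As a quick check that the three variants agree, here is a sketch on made-up data. X25 and Other are the column names from the answer; the values are hypothetical.

```r
library(dplyr)

# Hypothetical data: Other holds the free-text value for rows where X25 is "Other".
data <- data.frame(
  X25   = c("Red", "Other", "Blue", "Other"),
  Other = c(NA, "Teal", NA, "Mauve"),
  stringsAsFactors = FALSE
)

via_replace <- data %>%
  mutate(X25 = replace(X25, X25 == "Other", Other[X25 == "Other"]))

via_case_when <- data %>%
  mutate(X25 = case_when(X25 == "Other" ~ Other,
                         TRUE ~ X25))

via_ifelse <- data %>%
  mutate(X25 = ifelse(X25 == "Other", Other, X25))

via_replace$X25
```

All three assume X25 and Other are character columns; with factors, convert to character first.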
Conditionally replace values in rows using dplyr
Your question makes this unclear, but if you have some default value that you always want to use to replace a missing value (e.g., if 1994 is your baseline), then I would recommend that you first generate those defaults:
defaultValues <-
df %>%
filter(year == 1994) %>%
select(groups
, default_var1 = var1
, default_var2 = var2)
Then, use left_join to merge on the groups. That way, each row will also have a default. You can then use coalesce to pick the first non-NA value, which will be the default if and only if the value is missing. End by removing the default columns.
df %>%
left_join(defaultValues) %>%
mutate(var1 = coalesce(var1, default_var1)
, var2 = coalesce(var2, default_var2)) %>%
select(-starts_with("default"))
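The whole sequence can be sketched end to end on made-up data. The column names follow the answer; the values, and the explicit by = "groups" in the join, are assumptions added for illustration.

```r
library(dplyr)

# Hypothetical data: 1994 is the baseline year for each group.
df <- data.frame(
  groups = c("a", "a", "b", "b"),
  year   = c(1994, 1995, 1994, 1995),
  var1   = c(1, NA, 3, 4),
  var2   = c(10, 20, 30, NA),
  stringsAsFactors = FALSE
)

# One row of defaults per group, taken from the baseline year.
defaultValues <- df %>%
  filter(year == 1994) %>%
  select(groups, default_var1 = var1, default_var2 = var2)

# Attach the defaults, fill the gaps, then drop the helper columns.
filled <- df %>%
  left_join(defaultValues, by = "groups") %>%
  mutate(var1 = coalesce(var1, default_var1),
         var2 = coalesce(var2, default_var2)) %>%
  select(-starts_with("default"))

filled
```

Naming the join key explicitly avoids the "Joining, by" message and guards against accidentally joining on other shared columns.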
If your defaults are more complex, you would just need to construct them to match your desired behavior. For example, if you want it to fill in the value from two years prior, use:
complex_defaultValues <-
df %>%
mutate(year = year + 2) %>%
rename(default_var1 = var1
, default_var2 = var2)
Then join on both year and groups, and the rows will align correctly. Note that if the value from two years ago is missing, the result will still be missing after coalesce, so you may need to account for missing values in your defaults as well.
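A sketch of the two-years-prior variant on made-up data, showing both a gap that gets filled and one that stays missing. The values and the explicit join keys are assumptions.

```r
library(dplyr)

# Hypothetical data for a single group over three years.
df <- data.frame(
  groups = c("a", "a", "a"),
  year   = c(1994, 1995, 1996),
  var1   = c(1, 2, NA),
  var2   = c(10, NA, NA),
  stringsAsFactors = FALSE
)

# Shift each row forward by two years so it lines up with the row it should fill.
complex_defaultValues <- df %>%
  mutate(year = year + 2) %>%
  rename(default_var1 = var1,
         default_var2 = var2)

filled <- df %>%
  left_join(complex_defaultValues, by = c("groups", "year")) %>%
  mutate(var1 = coalesce(var1, default_var1),
         var2 = coalesce(var2, default_var2)) %>%
  select(-starts_with("default"))

filled
```

The 1996 row picks up its values from 1994, while the missing var2 in 1995 stays NA because there is no 1993 row to draw from.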
Finally, if you just want to propagate the last non-NA value forward (instead of trying to go back two years, or always using the same default), you can use fill from tidyr:
df %>%
group_by(groups) %>%
fill(var1, var2)
This fills down automatically, so make sure your data are sorted the way you want.
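A sketch of fill on made-up data, with an explicit arrange added so the sort order is not left to chance (the data and the arrange step are assumptions):

```r
library(dplyr)
library(tidyr)

# Hypothetical data: group "b" starts with a missing var1.
df <- data.frame(
  groups = c("a", "a", "a", "b", "b"),
  year   = c(1994, 1995, 1996, 1994, 1995),
  var1   = c(1, NA, NA, NA, 5),
  var2   = c(10, 20, NA, 30, NA),
  stringsAsFactors = FALSE
)

filled <- df %>%
  arrange(groups, year) %>%   # fill follows row order, so sort first
  group_by(groups) %>%
  fill(var1, var2) %>%        # carry the last non-NA value down within each group
  ungroup()

filled
```

Because filling happens within each group, the leading NA in group "b" stays missing rather than borrowing a value from group "a".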
Conditionally replace all values in a column in R
You can set eh1 to 0 in the rows where eh1 == 1 and the row number is after the first occurrence of 1 in eh2.
library(dplyr)
mydata %>%
mutate(eh1 = replace(eh1, row_number() > which(eh2==1)[1] & eh1 == 1, 0))
#mutate(eh1 = replace(eh1, row_number() > which.max(eh2) & eh1 == 1, 0))
#mutate(eh1 = replace(eh1, lag(cummax(eh2) > 0) & eh1 == 1, 0))
#mutate(eh1 = replace(eh1, lag(cumsum(eh2) > 0) & eh1 == 1, 0))
# eh1 eh2
#1 1 0
#2 1 0
#3 1 1
#4 0 0
#5 0 0
#6 0 0
#7 0 0
#8 0 0
The same can be translated in base R :
transform(mydata,eh1 = replace(eh1,seq_along(eh2) > which.max(eh2) & eh1 == 1, 0))
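A self-contained sketch of both versions. The input below is made up so that the result has the same shape as the output shown above; the original mydata is not given.

```r
library(dplyr)

# Hypothetical input: the first 1 in eh2 is in row 3.
mydata <- data.frame(
  eh1 = c(1, 1, 1, 0, 1, 1, 0, 1),
  eh2 = c(0, 0, 1, 0, 0, 0, 0, 0)
)

# dplyr: zero out eh1 in every row after the first eh2 == 1.
result <- mydata %>%
  mutate(eh1 = replace(eh1, row_number() > which(eh2 == 1)[1] & eh1 == 1, 0))

# Base R equivalent using which.max to locate the first 1.
base_result <- transform(
  mydata,
  eh1 = replace(eh1, seq_along(eh2) > which.max(eh2) & eh1 == 1, 0)
)

result$eh1
```

One difference to keep in mind: which.max returns 1 when eh2 contains no 1 at all, so the which.max variants would then zero out everything after the first row.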