Conditionally Replace Values of Subset of Rows With Column Name in R Using Only Tidy

You can use a little tidy evaluation inside the function passed to mutate_at to capture the column name, then use ifelse (or whatever other conditional logic you prefer) to replace the matching values.

library(tidyverse)

wl %>%
  mutate_at(vars(starts_with("AB")), function(x) {
    x_var <- rlang::enquo(x)
    ifelse(x == "Y", rlang::quo_name(x_var), x)
  })
#> # A tibble: 4 x 5
#>       x multi ABC   ABD   ABE  
#>   <int> <chr> <chr> <chr> <chr>
#> 1     1 Y     ""    ""    ABE  
#> 2     2 Y     ABC   ""    ABE  
#> 3     3 Y     ABC   ""    ""   
#> 4     4 Y     ""    ""    ABE

Created on 2018-08-16 by the reprex package (v0.2.0).
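
In dplyr 1.0.0 and later, mutate_at is superseded by across, and cur_column() returns the name of the column currently being processed, so the same replacement can be sketched without tidy evaluation. The wl tibble below is a hypothetical stand-in built to resemble the output above, since the original data is not shown.

```r
library(dplyr)

# Hypothetical stand-in for `wl` (an assumption; the original data is not shown)
wl <- tibble(
  x     = 1:4,
  multi = "Y",
  ABC   = c("", "Y", "Y", ""),
  ABD   = c("", "", "", ""),
  ABE   = c("Y", "Y", "", "Y")
)

# For every column whose name starts with "AB", replace "Y" with the
# column's own name and leave all other values untouched
res <- wl %>%
  mutate(across(starts_with("AB"), ~ ifelse(.x == "Y", cur_column(), .x)))

res
```

Because the selection is limited to starts_with("AB"), the multi column keeps its "Y" values.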

Using dplyr to conditionally replace values in a column

Assuming your data frame is dat and your column is var:

dat = dat %>% mutate(candy.flag = factor(ifelse(var == "Candy", "Candy", "Non-Candy")))
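
A small sketch of the same idea on made-up data (the values in dat are assumptions for illustration). Note that ifelse propagates NA, so a missing var gives a missing flag. If missing values should instead count as "Non-Candy", testing with %in% is one option, because %in% returns FALSE rather than NA for missing values.

```r
library(dplyr)

# Made-up data for illustration only
dat <- data.frame(var = c("Candy", "Chips", NA, "Candy"))

flagged <- dat %>%
  mutate(
    # NA in `var` stays NA in the flag
    candy.flag  = factor(ifelse(var == "Candy", "Candy", "Non-Candy")),
    # NA in `var` is treated as "Non-Candy"
    candy.flag2 = factor(ifelse(var %in% "Candy", "Candy", "Non-Candy"))
  )

flagged
```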

Replace only some NA values for selected rows and for only a column in R

The assignment df$type[!df$Asked & is.na(df$type)] <- "Replies" produces your desired table:

> type <-
+   c(NA, rep("Question",3), NA, NA,  rep("Answer",4), rep(NA, 3), rep("Answer",2),
+     NA, "Question", NA, rep("Answer",2), NA,NA)
> Asked <- c(
+   T, rep(F, 9), T, rep(F, 4), T, rep(F, 4), T,F
+ )
> df <- data.frame(title = 1:22, comments = 1:22, type, Asked)
> df$type[!df$Asked & is.na(df$type)] <- "Replies"
> df
   title comments     type Asked
1      1        1     <NA>  TRUE
2      2        2 Question FALSE
3      3        3 Question FALSE
4      4        4 Question FALSE
5      5        5  Replies FALSE
6      6        6  Replies FALSE
7      7        7   Answer FALSE
8      8        8   Answer FALSE
9      9        9   Answer FALSE
10    10       10   Answer FALSE
11    11       11     <NA>  TRUE
12    12       12  Replies FALSE
13    13       13  Replies FALSE
14    14       14   Answer FALSE
15    15       15   Answer FALSE
16    16       16     <NA>  TRUE
17    17       17 Question FALSE
18    18       18  Replies FALSE
19    19       19   Answer FALSE
20    20       20   Answer FALSE
21    21       21     <NA>  TRUE
22    22       22  Replies FALSE

dplyr mutate/replace several columns on a subset of rows

These solutions (1) maintain the pipeline, (2) do not overwrite the input and (3) only require that the condition be specified once:

1a) mutate_cond Create a simple function for data frames or data tables that can be incorporated into pipelines. This function is like mutate but only acts on the rows satisfying the condition:

mutate_cond <- function(.data, condition, ..., envir = parent.frame()) {
  condition <- eval(substitute(condition), .data, envir)
  .data[condition, ] <- .data[condition, ] %>% mutate(...)
  .data
}

DF %>% mutate_cond(measure == 'exit', qty.exit = qty, cf = 0, delta.watts = 13)
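
To see what mutate_cond does, here is a small self-contained run on a made-up four-row data frame (the name toy and its values are assumptions chosen for illustration; the function is repeated so the snippet runs on its own). Only the rows where measure is 'exit' should change.

```r
library(dplyr)

# Same helper as above, repeated so this snippet is self-contained
mutate_cond <- function(.data, condition, ..., envir = parent.frame()) {
  condition <- eval(substitute(condition), .data, envir)
  .data[condition, ] <- .data[condition, ] %>% mutate(...)
  .data
}

# Made-up data for illustration only
toy <- data.frame(
  measure     = c("cfl", "exit", "led", "exit"),
  qty         = c(5, 7, 2, 9),
  qty.exit    = 0,
  cf          = c(0.2, 0.4, 0.6, 0.8),
  delta.watts = c(20.5, 30.5, 40.5, 50.5)
)

# Rows 2 and 4 are updated; rows 1 and 3 are left as they were
res <- toy %>%
  mutate_cond(measure == "exit", qty.exit = qty, cf = 0, delta.watts = 13)

res
```

The input toy is not modified; the updated copy is returned, which is what lets the function sit in the middle of a pipeline.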

1b) mutate_last This is an alternative function for data frames or data tables which again is like mutate but is only used within group_by (as in the example below) and only operates on the last group rather than every group. Note that TRUE > FALSE, so the TRUE group sorts last; if group_by specifies a condition, mutate_last will therefore only operate on the rows satisfying that condition.

mutate_last <- function(.data, ...) {
  n <- n_groups(.data)
  indices <- attr(.data, "indices")[[n]] + 1
  .data[indices, ] <- .data[indices, ] %>% mutate(...)
  .data
}


DF %>% 
   group_by(is.exit = measure == 'exit') %>%
   mutate_last(qty.exit = qty, cf = 0, delta.watts = 13) %>%
   ungroup() %>%
   select(-is.exit)

2) factor out condition Factor out the condition by making it an extra column which is later removed. Then use ifelse, replace or arithmetic with logicals as illustrated. This also works for data tables.

library(dplyr)

DF %>% mutate(is.exit = measure == 'exit',
              qty.exit = ifelse(is.exit, qty, qty.exit),
              cf = (!is.exit) * cf,
              delta.watts = replace(delta.watts, is.exit, 13)) %>%
       select(-is.exit)

3) sqldf We could use SQL update via the sqldf package in the pipeline for data frames (but not data tables unless we convert them -- this may represent a bug in dplyr. See dplyr issue 1579). It may seem that we are undesirably modifying the input in this code due to the existence of the update but in fact the update is acting on a copy of the input in the temporarily generated database and not on the actual input.

library(sqldf)

DF %>% 
   do(sqldf(c("update '.' 
                 set 'qty.exit' = qty, cf = 0, 'delta.watts' = 13 
                 where measure = 'exit'", 
              "select * from '.'")))

4) row_case_when Also check out row_case_when, defined in "Returning a tibble: how to vectorize with case_when?". It uses a syntax similar to case_when but applies to rows.

library(dplyr)

DF %>%
  row_case_when(
    measure == "exit" ~ data.frame(qty.exit = qty, cf = 0, delta.watts = 13),
    TRUE ~ data.frame(qty.exit, cf, delta.watts)
  )

Note 1: We used this as DF

set.seed(1)
DF <- data.frame(site = sample(1:6, 50, replace=T),
                 space = sample(1:4, 50, replace=T),
                 measure = sample(c('cfl', 'led', 'linear', 'exit'), 50, 
                               replace=T),
                 qty = round(runif(50) * 30),
                 qty.exit = 0,
                 delta.watts = sample(10.5:100.5, 50, replace=T),
                 cf = runif(50))

Note 2: The problem of how to easily specify updating a subset of rows is also discussed in dplyr issues 134, 631, 1518 and 1573 with 631 being the main thread and 1573 being a review of the answers here.

R: Conditionally replacing values based on column pre-fixes and suffixes

Another approach, which should essentially amount to a single assignment operation. Using @alistaire's data again:

vars <- c("x","y")
foo[vars] <- Map(pmax, foo[vars], bar[match(foo$id, bar$id), vars], na.rm=TRUE)
foo

#  id  x y z
#1  1 10 1 1
#2  2  9 2 2
#3  3 NA 3 3
#4  4  1 4 4
#5  5  3 5 5
#6  6  8 6 6
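
The data from @alistaire's answer is not reproduced here, so the following sketch uses small hypothetical foo and bar data frames to show how the pieces fit together: match aligns the rows of bar to foo by id, and Map applies pmax column by column, with na.rm = TRUE keeping the non-missing value when only one side is present.

```r
# Hypothetical data for illustration only (not @alistaire's data)
foo <- data.frame(id = 1:3, x = c(10, NA, NA), y = c(1, 5, NA), z = 1:3)
bar <- data.frame(id = c(3, 1), x = c(4, 2), y = c(7, 0))

vars <- c("x", "y")

# match() gives, for each foo$id, the row of bar with the same id
# (NA when bar has no such id, which yields a row of NAs)
aligned <- bar[match(foo$id, bar$id), vars]

# Element-wise maximum per column; NA only when both sides are missing
foo[vars] <- Map(pmax, foo[vars], aligned, na.rm = TRUE)

foo
```

Here id 2 has no counterpart in bar, so its x stays NA while its y keeps the value already in foo.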

dplyr: Replace multiple values based on condition in a selection of columns

A dplyr solution:

library(dplyr)
dt %>%
  mutate(across(3:5, ~ ifelse(measure == "led", stringr::str_replace_all(
    as.character(.),
    c("2" = "X", "3" = "Y")
  ), .)))

Result:

   measure site space qty qty.exit cf
 1:     led    4     1   4        6  3
 2:    exit    4     2   1        4  6
 3:     cfl    1     4   6        2  3
 4:  linear    3     4   1        3  5
 5:     cfl    5     1   6        1  6
 6:    exit    4     3   2        6  4
 7:    exit    5     1   4        2  5
 8:    exit    1     4   3        6  4
 9:  linear    3     1   5        4  1
10:     led    4     1   1        1  1
11:    exit    5     4   3        5  2
12:     cfl    4     2   4        5  5
13:     led    4     X   Y        Y  4
...
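
A minimal sketch on made-up data (toy and its values are assumptions for illustration) shows two side effects of this approach worth knowing about: the selected columns become character, and str_replace_all substitutes every matching digit, including digits inside larger numbers.

```r
library(dplyr)
library(stringr)

# Made-up data for illustration only
toy <- data.frame(
  measure  = c("led", "exit"),
  site     = c(4, 4),
  space    = c(2, 2),
  qty      = c(3, 1),
  qty.exit = c(23, 4)
)

# Columns 3 to 5 (space, qty, qty.exit): replace "2" with "X" and "3" with "Y",
# but only in rows where measure is "led"
res <- toy %>%
  mutate(across(3:5, ~ ifelse(
    measure == "led",
    str_replace_all(as.character(.x), c("2" = "X", "3" = "Y")),
    .x
  )))

res
```

In the "led" row, qty.exit goes from 23 to "XY" because both digits match. If only whole values should be replaced, anchoring the patterns (for example "^2$") is one way to restrict the match.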

How to replace NA values in any column with the next column's values in R

I guess you already have an answer to the first part of your question; here is an alternative way using replace. To drop columns that contain only NA values, you can use select with where.

library(dplyr)

df1 %>%
  mutate(across(everything(), ~replace(., . == '', 'N')), 
         GID = sub('N', '', GID)) %>%
  select(-where(~all(is.na(.)))) %>%
  rename_with(~names(df1)[seq_along(.)])

#  GID ColA
#1   1    2
#2   2    4
#3   3    4
#4   4    5
#5   5    5
#6  G1    N
#7 MG2    1
#8 MG3    1
#9  G4    N

conditionally renaming cells based on their current value

Would something like this, using the tidyverse, work?

First, load the packages:

# install.packages(c("tidyverse"), dependencies = TRUE) 
library(tidyverse)

Second, create the data (see other examples):

df <- tribble(
  ~name, ~sub_name,  ~level,
  "Food", "Food",  "group",
  "Food", "Fruit and vegetables",  "subgroup",
  "Food", "Meat, poultry and fish",  "subgroup")
df
# A tibble: 3 x 3
  name  sub_name               level   
  <chr> <chr>                  <chr>   
1 Food  Food                   group   
2 Food  Fruit and vegetables   subgroup
3 Food  Meat, poultry and fish subgroup

Third, recode using case_when (see more examples)

df <- df %>% mutate(level = case_when(
  level == "group" ~    "primary",
  level == "subgroup" ~ "secondary",
  TRUE                      ~ "other"
))

Fourth, take a look at the recoded data:

df
# A tibble: 3 x 3
  name  sub_name               level    
  <chr> <chr>                  <chr>    
1 Food  Food                   primary  
2 Food  Fruit and vegetables   secondary
3 Food  Meat, poultry and fish secondary

Fifth, filter() (see more filter options)

df2 <- df %>% filter(level != "primary")

df2
# A tibble: 2 x 3
  name  sub_name               level    
  <chr> <chr>                  <chr>    
1 Food  Fruit and vegetables   secondary
2 Food  Meat, poultry and fish secondary

Replace value with the name of its respective column

The code below enabled me to replace every "true" value (character) with its respective column name.

##Replace every "true" value with its respective column name
w <- which(df=="true",arr.ind=TRUE)
df[w] <- names(df)[w[,"col"]]
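
As a rough illustration of how this works, here is the same code run on a small made-up data frame (the column names and values are assumptions). which(..., arr.ind = TRUE) returns a two-column matrix of row and column positions, and the "col" entries are then used to look up the matching column names.

```r
# Made-up data for illustration only
df <- data.frame(
  red  = c("true", "false", "true"),
  blue = c("false", "true", "true"),
  stringsAsFactors = FALSE
)

# Positions (row, col) of every cell equal to "true"
w <- which(df == "true", arr.ind = TRUE)

# Replace each of those cells with the name of the column it sits in
df[w] <- names(df)[w[, "col"]]

df
```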
