Using Ifelse Statement on the Whole Dataset Instead of a Single Column

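Because DF > 0 is evaluated element-wise and returns a logical matrix, you can convert the whole data frame in one step: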

DF[] <- as.integer(DF > 0)
DF
# Var1 Var2 Var3
#1 1 1 1
#2 1 1 0
#3 0 1 1
#4 1 1 1
#5 1 0 1

If you would rather keep the original columns and append the binary ones, run this on the original DF (not the one overwritten above):

DF[paste0(names(DF), "_Binary")] <- as.integer(DF > 0)
DF
# Var1 Var2 Var3 Var1_Binary Var2_Binary Var3_Binary
#1 1 1 1 1 1 1
#2 3 2 0 1 1 0
#3 0 1 2 0 1 1
#4 3 3 1 1 1 1
#5 5 0 3 1 0 1
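
For comparison, here is a minimal sketch of two related ways to get the same 0/1 values: calling ifelse on the logical matrix directly, and converting column by column with lapply. The names bin_mat and bin_df are illustrative and not part of the original answer.

```r
DF <- data.frame(Var1 = c(1L, 3L, 0L, 3L, 5L),
                 Var2 = c(1L, 2L, 1L, 3L, 0L),
                 Var3 = c(1L, 0L, 2L, 1L, 3L))

# ifelse() keeps the shape of its test argument, so applying it to the
# logical matrix DF > 0 returns an integer matrix, not a data frame
bin_mat <- ifelse(DF > 0, 1L, 0L)

# Column-wise alternative: DF[] <- ... keeps the data.frame structure
bin_df <- DF
bin_df[] <- lapply(DF, function(col) as.integer(col > 0))
```

The matrix form is convenient for further numeric work, while the lapply form is the safer choice when the result must stay a data frame.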

data

DF <- structure(list(Var1 = c(1L, 3L, 0L, 3L, 5L), Var2 = c(1L, 2L, 
1L, 3L, 0L), Var3 = c(1L, 0L, 2L, 1L, 3L)), .Names = c("Var1",
"Var2", "Var3"), row.names = c(NA, -5L), class = "data.frame")

How to shift a value inside a nested ifelse statement?

Using dplyr, one way is to group by group_UID and by a running count of "Non" values, then assign NA to the first row of each group and the group's first id to every other row.

library(dplyr)

df %>%
  group_by(group_UID, group = cumsum(Amount_type == "Non")) %>%
  mutate(p_RID = ifelse(row_number() == 1, NA, id[1L])) %>%
  ungroup() %>%
  select(-group)


# id user_id Amount_type group_UID p_RID
# <int> <int> <fct> <int> <int>
#1 30 11 Non 1 NA
#2 31 11 Draw 1 30
#3 54 5 Non 2 NA
#4 322 5 Draw 2 54
#5 21 5 Draw 2 54
#6 13 5 Non 2 NA
#7 2445 5 Draw 2 13
#8 111 44 Non 3 NA
#9 287 44 Draw 3 111

Another way is to test for "Non" directly instead of the row number:

df %>%
  group_by(group_UID, group = cumsum(Amount_type == "Non")) %>%
  mutate(p_RID = ifelse(Amount_type == "Non", NA, first(id))) %>%
  ungroup() %>%
  select(-group)

We can also use ave from base R here:

with(df, ave(id, group_UID, cumsum(Amount_type == "Non"),
             FUN = function(x) ifelse(seq_along(x) == 1, NA, x[1L])))

#[1] NA 30 NA 54 54 NA 13 NA 111
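
The input df is not shown in this answer, so the following is a sketch that rebuilds it from the printed output above (an assumption about the original data) and runs the base R version end to end.

```r
# Reconstructed from the printed result; assumed to match the asker's data
df <- data.frame(
  id          = c(30L, 31L, 54L, 322L, 21L, 13L, 2445L, 111L, 287L),
  user_id     = c(11L, 11L, 5L, 5L, 5L, 5L, 5L, 44L, 44L),
  Amount_type = factor(c("Non", "Draw", "Non", "Draw", "Draw",
                         "Non", "Draw", "Non", "Draw")),
  group_UID   = c(1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L)
)

# Each "Non" row starts a new block; cumsum() gives every block its own number
block <- cumsum(df$Amount_type == "Non")

# Within each group_UID/block pair: NA for the first row, first id otherwise
df$p_RID <- ave(df$id, df$group_UID, block,
                FUN = function(x) ifelse(seq_along(x) == 1, NA, x[1L]))
```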

Using the ifelse statement in R

You can do it in a two-liner instead:

z <- data$x
z[data$ind == 0] <- colSums(m[,data$ind == 0])

[1] -1.3367324 0.1836433 1.3413668 1.5952808 4.5120996 -0.8204684 1.2736029 0.7383247 3.4748021
[10] -0.3053884

More generally, you could use an apply-family function, though this will usually be slower than a fully vectorised solution like the one above. Here is the sapply version:

sapply(1:nrow(data), function(x){ifelse(data$ind[x] == 1, data$x[x], sum(m[, x]))})

[1] -1.3367324 0.1836433 1.3413668 1.5952808 4.5120996 -0.8204684 1.2736029 0.7383247 3.4748021
[10] -0.3053884

A benchmark:

microbenchmark::microbenchmark(
  sapply = sapply(1:nrow(data), function(x){ifelse(data$ind[x] == 1, data$x[x], sum(m[, x]))}),
  vectorised = {
    z <- data$x
    z[data$ind == 0] <- colSums(m[, data$ind == 0])
  }
)
Unit: microseconds
       expr     min      lq     mean   median       uq     max neval cld
     sapply 391.297 408.193 423.6525 412.4170 423.7450 853.249   100   b
 vectorised 197.377 199.873 208.7701 202.5605 214.4645 284.545   100  a
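
Neither data nor m is defined in this answer. The sketch below assumes data has a numeric column x and a 0/1 column ind, and that m has one column per row of data (which is what sum(m[, x]) implies); the sizes and the random values are made up. It checks that the vectorised and element-wise versions agree.

```r
set.seed(1)
n <- 10
# Assumed shapes: one row of `data` per column of `m`
data <- data.frame(x = rnorm(n), ind = rbinom(n, 1, 0.5))
m <- matrix(rnorm(3 * n), nrow = 3, ncol = n)

# Vectorised: start from x, overwrite the ind == 0 positions with column sums.
# drop = FALSE keeps a matrix even if only one column is selected.
z <- data$x
z[data$ind == 0] <- colSums(m[, data$ind == 0, drop = FALSE])

# Element-wise version; plain if/else is enough because each test is a scalar
z_loop <- vapply(seq_len(n), function(i) {
  if (data$ind[i] == 1) data$x[i] else sum(m[, i])
}, numeric(1))

stopifnot(isTRUE(all.equal(z, z_loop)))
```

Adding drop = FALSE is a defensive choice here: without it, a single selected column collapses to a vector and colSums fails.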

R ifelse statement

Add another variable B to the dataset with ifelse, which gives 0 where A is "N" and 1 for every other non-missing value (here, "Y"):

Dataset$B <- ifelse(Dataset$A == "N", 0, 1)

Or overwrite the same variable in place:

Dataset$A <- ifelse(Dataset$A == "N", 0, 1)
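
As a small illustration, assuming a made-up Dataset with one missing value, the sketch below shows that ifelse passes NA through unchanged, and how a nested call can be used if only "Y" should map to 1. Column C is a hypothetical addition for the example.

```r
# Hypothetical data; the real Dataset is not shown in the question
Dataset <- data.frame(A = c("N", "Y", "Y", NA, "N"),
                      stringsAsFactors = FALSE)

# Anything that is not "N" becomes 1; a missing A stays missing
Dataset$B <- ifelse(Dataset$A == "N", 0, 1)

# Stricter variant: only "Y" maps to 1, only "N" maps to 0, the rest is NA
Dataset$C <- ifelse(Dataset$A == "Y", 1,
                    ifelse(Dataset$A == "N", 0, NA))
```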

Using If/Else on a data frame

Use ifelse:

frame$twohouses <- ifelse(frame$data>=2, 2, 1)
frame
data twohouses
1 0 1
2 1 1
3 2 2
4 3 2
5 4 2
...
16 0 1
17 2 2
18 1 1
19 2 2
20 0 1
21 4 2

The difference between if and ifelse:

  • if is a control flow statement, taking a single logical value as an argument
  • ifelse is a vectorised function, taking vectors as all its arguments.

The help page for if, accessible via ?"if", also points you to ?ifelse.
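
A minimal sketch of that difference, using a made-up vector x:

```r
x <- c(0, 1, 2, 3, 4)

# ifelse() tests every element and returns a vector of the same length
twohouses <- ifelse(x >= 2, 2, 1)

# if () needs a single TRUE/FALSE, so test one element at a time
third <- if (x[3] >= 2) "two or more" else "fewer than two"

# Note: if (x >= 2) with the whole vector is an error in recent versions of R
# (older versions only used the first element and gave a warning)
```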

Using ifelse statement in R dataframe to generate additional variables

Up front: I think the use of ifelse statements in this problem is strongly ill-advised. It requires significant nesting, sacrificing performance and readability. Though these two solutions may be a little harder if you aren't familiar with mapply or table-join-calculus, the payoff in stability and performance will far outweigh the time to learn these techniques.

Two methods:

Lookup matrix

One way is to define lookup matrices, where the row names reflect the possible V1 values and the column names reflect the possible V2 values. (Note that when indexing these lookup matrices you must wrap numeric/integer values in as.character; otherwise R indexes by row/column position rather than by matching name.)

Examples:

dat <- data.frame(
  V1 = c(0,0,0,1,1,1,2,2,2),
  V2 = c(0,1,2,0,1,2,0,1,2)
)
dmnms <- list(c(0,1,2), c(0,1,2))
m3 <- matrix(c(0,  1, 2,
               0, NA, 1,
               0,  0, 0),
             nrow = 3, byrow = TRUE, dimnames = dmnms)
m4 <- matrix(c("AA", "AD", "DD",
               "AB", NA,   "CD",
               "BB", "BC", "CC"),
             nrow = 3, byrow = TRUE, dimnames = dmnms)

m3
# 0 1 2
# 0 0 1 2
# 1 0 NA 1
# 2 0 0 0
m4
# 0 1 2
# 0 "AA" "AD" "DD"
# 1 "AB" NA "CD"
# 2 "BB" "BC" "CC"

In this case, notice the 0, 1, and 2 in the row/column margins. In a matrix with no names, these margins are typically printed as [1,], [2,], etc., indicating that no names are available and only the row number is shown. Since the margins here are character names (no brackets/commas), elements can be referenced directly by name, e.g.

m3["0","2"]
# [1] 2
m4["1","0"]
# [1] "AB"

From here, you just need to map these lookups into new columns, something like:

dat$V3 <- mapply(`[`, list(m3), as.character(dat$V1), as.character(dat$V2))
dat$V4 <- mapply(`[`, list(m4), as.character(dat$V1), as.character(dat$V2))
dat
# V1 V2 V3 V4
# 1 0 0 0 AA
# 2 0 1 1 AD
# 3 0 2 2 DD
# 4 1 0 0 AB
# 5 1 1 NA <NA>
# 6 1 2 1 CD
# 7 2 0 0 BB
# 8 2 1 0 BC
# 9 2 2 0 CC
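
As an alternative to mapply, a matrix can also be indexed with a two-column character matrix of row and column names, which does all the lookups in one call. This is a sketch of that variant rather than the method used above; idx is a name introduced for the example.

```r
dat <- data.frame(
  V1 = c(0,0,0,1,1,1,2,2,2),
  V2 = c(0,1,2,0,1,2,0,1,2)
)
dmnms <- list(c("0","1","2"), c("0","1","2"))
m3 <- matrix(c(0,  1, 2,
               0, NA, 1,
               0,  0, 0),
             nrow = 3, byrow = TRUE, dimnames = dmnms)

# One row per lookup: first column is the row name, second the column name
idx <- cbind(as.character(dat$V1), as.character(dat$V2))

# Indexing by a two-column character matrix matches against the dimnames
dat$V3 <- m3[idx]
```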

Joining data.frame

Another method is to join a known data.frame onto your data. This has an added benefit of easily expanding to more than two criteria. (Technically, the matrix method can expand to more than 2, in which case it would be an n-dim array, but it is often a little harder to edit, manage, and visualize.)

In your example, this doesn't initially gain you much, since you need to pre-define your data.frame, but I'm guessing that this is just representative data, and your conditional classification is on much more data.

I'll define the joiner data.frame that will be merged against your actual data. This is the reference data, which maps every combination of inputs to its V3 and V4 values.

joiner <- data.frame(
  V1 = c(0,0,0,1,1,1,2,2,2),
  V2 = c(0,1,2,0,1,2,0,1,2),
  V3 = c(0, 1, 2, 0, NA, 1, 0, 0, 0),
  V4 = c("AA", "AD", "DD", "AB", NA, "CD", "BB", "BC", "CC"),
  stringsAsFactors = FALSE
)

I'll create a second sample data frame to demonstrate the merge:

dat2 <- data.frame(
  V1 = c(2, 0, 1, 0),
  V2 = c(0, 1, 2, 2)
)
merge(dat2, joiner, by = c("V1", "V2"))
# V1 V2 V3 V4
# 1 0 1 1 AD
# 2 0 2 2 DD
# 3 1 2 1 CD
# 4 2 0 0 BB

Edit: if you are concerned about dropping rows, then add all.x=TRUE to the merge command. If (as you saw based on your comment) you use all=TRUE, this is a full join in SQL parlance, meaning it will keep all rows from both tables, even if there is not a match made. This may be better explained by referencing this answer and noting that I'm suggesting a left join with all.x, keeping all on the left (first argument), only merging in rows on the right where a match is made.

(Note: this can also be done quite easily using dplyr and data.table packages.)
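
To make the left-join behaviour concrete, here is a sketch with base R merge in which one input row has no match in the reference table. The extra (3, 0) row and the row_id helper column are assumptions added for the example; row_id is only there because merge sorts its result by the key columns.

```r
joiner <- data.frame(
  V1 = c(0,0,0,1,1,1,2,2,2),
  V2 = c(0,1,2,0,1,2,0,1,2),
  V3 = c(0, 1, 2, 0, NA, 1, 0, 0, 0),
  V4 = c("AA", "AD", "DD", "AB", NA, "CD", "BB", "BC", "CC"),
  stringsAsFactors = FALSE
)

# The last row (3, 0) deliberately has no counterpart in joiner
dat2 <- data.frame(
  V1 = c(2, 0, 1, 0, 3),
  V2 = c(0, 1, 2, 2, 0)
)
dat2$row_id <- seq_len(nrow(dat2))  # remember the original order

# all.x = TRUE keeps every row of dat2; unmatched rows get NA in V3 and V4
out <- merge(dat2, joiner, by = c("V1", "V2"), all.x = TRUE)

# Restore the input order
out <- out[order(out$row_id), ]
```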

ifelse statement in R to assign values to a new column

The following should work:

Trial <- c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3)
ContourFix <- c(1, 0, 0, 0, 0, 1, 0, 1, 0, 0)

# Index of the last row of each trial
trial.ends <- c(which(diff(Trial) == 1), length(Trial))
# Index of each row where ContourFix is 1
one.starts <- which(ContourFix == 1)

TrialFix <- rep(0, length(Trial))
# Fill with 1 from each start through the end of its trial
for (i in seq_along(one.starts)) {
  TrialFix[one.starts[i]:trial.ends[i]] <- 1
}

It's a bit hacky but should serve your purposes. It requires that every trial has exactly one row where ContourFix is 1 (otherwise one.starts and trial.ends no longer line up) and that your data is grouped by trial as in the example.
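
The loop pairs each 1 with a trial end purely by position. A sketch of an alternative that avoids that pairing is to take a cumulative maximum within each trial using ave, so that every row from the first 1 onward becomes 1. This is a different technique from the loop above, shown on the same example vectors.

```r
Trial <- c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3)
ContourFix <- c(1, 0, 0, 0, 0, 1, 0, 1, 0, 0)

# Within each trial, cummax() stays 0 until the first 1 and is 1 afterwards
TrialFix <- ave(ContourFix, Trial, FUN = cummax)
```

With this version a trial without any 1 simply stays all 0.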

Using if else on a dataframe across multiple columns

For your example dataset, this will work:

Option 1, name the columns to change:

dat[which(dat$desc == "blank"), c("x", "y", "z")] <- NA

In your actual data with 40 columns, if you just want to set the last 39 columns to NA, then the following may be simpler than naming each of the columns to change:

Option 2, select columns using a range:

dat[which(dat$desc == "blank"), 2:40] <- NA

Option 3, exclude the 1st column:

dat[which(dat$desc == "blank"), -1] <- NA

Option 4, exclude a named column:

dat[which(dat$desc == "blank"), !names(dat) %in% "desc"] <- NA

As you can see, there are many ways to do this kind of operation (this is far from a complete list), and understanding how each of these options works will help you to get a better understanding of the language.
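
The asker's dat is not shown, so here is a self-contained sketch on a made-up four-row data frame with the same column names. It uses the "exclude a named column" idea; is_blank and value_cols are names introduced for the example.

```r
# Hypothetical stand-in for the asker's data
dat <- data.frame(
  desc = c("blank", "sample", "blank", "sample"),
  x = 1:4,
  y = 5:8,
  z = 9:12,
  stringsAsFactors = FALSE
)

# which() returns row positions and silently skips any NA in desc
is_blank <- which(dat$desc == "blank")

# Every column except desc
value_cols <- setdiff(names(dat), "desc")

dat[is_blank, value_cols] <- NA
```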

R ifelse to replace values in a column

This should work, using the working example:

var <- c("Private", "Private", "?", "Private")
df <- data.frame(var)
df$var[which(df$var == "?")] <- "Private"

This replaces every "?" value with "Private".

I think your replacement isn't working because, wherever the value in df$var isn't "?", the replacement supplies the whole df$var column instead of just the single element you want to keep.
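
If you do want to use ifelse, a sketch that works on the same example passes the whole column as both the test and the "no" value, so each element is either replaced or kept as it was. stringsAsFactors = FALSE is set explicitly so the column is character in every R version.

```r
var <- c("Private", "Private", "?", "Private")
df <- data.frame(var, stringsAsFactors = FALSE)

# Element-wise: "?" becomes "Private", everything else keeps its own value
df$var <- ifelse(df$var == "?", "Private", df$var)
```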


