Apply a Function to Each Row in a Data Frame in R

Call apply-like function on each row of dataframe with multiple arguments from each row

You can call apply on a subset of the columns of the original data frame.

dat <- data.frame(x=c(1,2), y=c(3,4), z=c(5,6))
apply(dat[,c('x','z')], 1, function(x) sum(x) )

Or, if your function is just sum, use the vectorized rowSums instead:

rowSums(dat[,c('x','z')])
[1] 6 8

If you want to use your own testFunc, pass the row elements to it by position:

testFunc <- function(a, b) a + b
apply(dat[,c('x','z')], 1, function(x) testFunc(x[1],x[2]))

EDIT: To access columns by name rather than by index, you can do something like this:

testFunc <- function(a, b) a + b
apply(dat[,c('x','z')], 1, function(y) testFunc(y['z'],y['x']))
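
A possible alternative, sketched here on the assumption that the same dat and testFunc are in use, is mapply. It walks several columns in parallel and hands one element of each to the function, and unlike apply it does not convert the data frame to a matrix first.

```r
dat <- data.frame(x = c(1, 2), y = c(3, 4), z = c(5, 6))
testFunc <- function(a, b) a + b

# mapply pairs up dat$x[i] and dat$z[i] and calls testFunc once per row
res <- mapply(testFunc, dat$x, dat$z)
print(res)
```

Because each column keeps its own type, this form may be the safer choice when the columns are not all numeric.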

How to apply function to each row of dataframe in R?

You can use apply on the rows (MARGIN = 1).

apply(df, MARGIN = 1, function(x) concord(sourcevar = x[3], origin = x[1], destination = "HS4", dest.digit = x[2], all = FALSE))

However, this fails for rows that are already HS4, because there is no conversion dictionary from "HS4" to "HS4". Apply the function only to the rows whose Model is not HS4:

df$New <- df$Code
df[df$Model != "HS4", ]$New <- apply(df[df$Model != "HS4", ], 1,
  \(x) concord(sourcevar = x[colnames(df) == "Code"],
               origin = x[colnames(df) == "Model"], destination = "HS4",
               dest.digit = x[colnames(df) == "Length"], all = FALSE))

  Model Length   Code    New
1   HS5      6 030299 030289
2   HS5      6 010121 010121
3   HS5      6 030448 030449
4   HS4      6 030324 030324

Using the apply function on rows of a data.frame in R

The issue is that the string columns are of class factor: in R versions before 4.0.0, data.frame() defaults to stringsAsFactors = TRUE, and a factor is coerced to its integer codes when we paste across columns. To avoid this behavior (from R 4.0.0 onward the default is already FALSE), use

df <- data.frame(x=c(1,2,3),y=c('b','a','c'), stringsAsFactors = FALSE)

paste(df[1,],collapse=":")
#[1] "1:b"

apply first converts the data frame to a matrix, and a matrix can hold only a single type. When any column is character, the numeric columns are therefore converted to character as well, following R's coercion precedence.
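
The coercion can be checked directly. The sketch below reuses the df from this answer; the do.call(paste, ...) call is one way to paste row by row without going through a matrix, offered as an illustration rather than as the original answer's method.

```r
df <- data.frame(x = c(1, 2, 3), y = c("b", "a", "c"), stringsAsFactors = FALSE)

# apply() works on as.matrix(df); with a character column present,
# every cell of that matrix is character
m <- as.matrix(df)
print(typeof(m))

# paste() is vectorised, so whole columns can be passed at once;
# c(df, sep = ":") builds the argument list x, y, sep
pasted <- do.call(paste, c(df, sep = ":"))
print(pasted)
```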

R apply() custom function to every row in data frame

Another approach is modifying your existing function such that it is vectorised.

t.test2 <- function(m1,m2,s1,s2,n1,n2,m0=0,equal.variance=FALSE)
{
  if(!equal.variance)
  {
    se <- sqrt( (s1^2/n1) + (s2^2/n2) )
    # welch-satterthwaite df
    df <- ( (s1^2/n1 + s2^2/n2)^2 )/( (s1^2/n1)^2/(n1-1) + (s2^2/n2)^2/(n2-1) )
  } else
  {
    # pooled standard deviation, scaled by the sample sizes
    se <- sqrt( (1/n1 + 1/n2) * ((n1-1)*s1^2 + (n2-1)*s2^2)/(n1+n2-2) )
    df <- n1+n2-2
  }
  t <- (m1-m2-m0)/se
  # one result vector per input element; the p-value is two-tailed
  dat <- vapply(seq_along(m1),
                function(x){c(m1[x]-m2[x], se[x], t[x], 2*pt(-abs(t[x]),df[x]))},
                numeric(4))
  # vapply returns one column per element, so transpose to one row per element
  dat <- t(dat)
  dat <- as.data.frame(dat)
  names(dat) <- c("Difference of means", "Std Error", "t", "p-value")
  return(dat)
}

This approach allows you to pass in vectors for your various inputs, and it returns a data frame with one row per element of those inputs. It uses vapply to build a numeric vector of length 4 for each element.

With this approach, you can simply call

t.test2(MPAmeans$reference_mean, MPAmeans$MPA_mean, MPAmeans$sd_reference, MPAmeans$sd_MPA, MPAmeans$n_reference, MPAmeans$n_MPA)

(or whatever you end up calling your variables)
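
Since the arithmetic in the function is already element-wise, the per-element loop is arguably optional. The following is a minimal sketch of the Welch (unequal variance) case only; welch_summary and its column names are hypothetical, and the samples are made up purely to compare the result against stats::t.test.

```r
# welch_summary is a hypothetical helper, not part of the answer above
welch_summary <- function(m1, m2, s1, s2, n1, n2, m0 = 0) {
  v1 <- s1^2 / n1
  v2 <- s2^2 / n2
  se <- sqrt(v1 + v2)
  # Welch-Satterthwaite degrees of freedom
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  tstat <- (m1 - m2 - m0) / se
  data.frame(difference = m1 - m2,
             std_error  = se,
             t          = tstat,
             p_value    = 2 * pt(-abs(tstat), df))  # two-tailed
}

# made-up samples, used only to check against the built-in test
set.seed(1)
x <- rnorm(20, mean = 5)
y <- rnorm(25, mean = 4)
out <- welch_summary(mean(x), mean(y), sd(x), sd(y), length(x), length(y))
print(out)
print(t.test(x, y)$p.value)
```

Passing whole columns instead of single numbers yields one row per element, in the same way as t.test2.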

Apply a function to each row in a data frame in R

You want apply (see ?apply). apply(var, 1, fun) applies fun to each row; apply(var, 2, fun) applies it to each column.

> apply(a,1,min)
[1] 1 0 3
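
The original a is not shown, so the sketch below uses a made-up matrix, chosen so that its row minima match the output above, to contrast the two margins.

```r
# a is a made-up matrix for illustration; matrix() fills column by column,
# so the rows are (1, 2), (0, 5) and (3, 4)
a <- matrix(c(1, 0, 3,
              2, 5, 4), nrow = 3)

row_min <- apply(a, 1, min)  # one value per row
col_min <- apply(a, 2, min)  # one value per column
print(row_min)
print(col_min)
```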

Apply a function to every row of a matrix or a data frame

You simply use the apply() function:

R> M <- matrix(1:6, nrow=3, byrow=TRUE)
R> M
     [,1] [,2]
[1,]    1    2
[2,]    3    4
[3,]    5    6
R> apply(M, 1, function(x) 2*x[1]+x[2])
[1]  4 10 16

This takes a matrix and applies a (silly) function to each row. You pass extra arguments to the function as fourth, fifth, ... arguments to apply().
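
As a small illustration of passing extra arguments, the sketch below turns the two coefficients into parameters. The function weighted and its arguments a and b are names invented for this example.

```r
M <- matrix(1:6, nrow = 3, byrow = TRUE)

# weighted is a hypothetical function; a and b are its extra arguments
weighted <- function(x, a, b) a * x[1] + b * x[2]

# everything after the function is forwarded to it on every call
res <- apply(M, 1, weighted, a = 2, b = 1)
print(res)
```

Naming the extra arguments, as done here, avoids any confusion with apply's own parameters.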

Apply function to a row in a data.frame using dplyr

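We just need to pass the data as ., because a data.frame is already a list with the columns as its elements. If we wrap it as list(.), it becomes a nested list. Note that pmap_int comes from purrr, so that package has to be loaded as well.

library(dplyr)
library(purrr)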
d %>%
mutate(u = pmap_int(., ~ which.max(c(...))))
#  a b c u
#1 1 4 2 2
#2 2 3 3 2
#3 3 2 4 3
#4 4 1 5 3

Or we can use cur_data() (deprecated since dplyr 1.1.0 in favor of pick())

d %>%
mutate(u = pmap_int(cur_data(), ~ which.max(c(...))))

Or, if we want to use everything(), place it inside select. list(everything()) does not work because it does not say which data everything() should select from.

d %>% 
mutate(u = pmap_int(select(., everything()), ~ which.max(c(...))))

Or using rowwise

d %>%
rowwise %>%
mutate(u = which.max(cur_data())) %>%
ungroup
# A tibble: 4 x 4
#      a     b     c     u
#  <int> <int> <int> <int>
#1     1     4     2     2
#2     2     3     3     2
#3     3     2     4     3
#4     4     1     5     3

Or, more efficiently, with max.col

max.col(d, 'first')
#[1] 2 2 3 3

Or with collapse

library(collapse)
dapply(d, which.max, MARGIN = 1)
#[1] 2 2 3 3

max.col can also be used inside a dplyr pipeline:

d %>% 
mutate(u = max.col(cur_data(), 'first'))
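
The data frame d is never defined in this answer. Assuming it matches the printed output (a = 1:4, b = 4:1, c = 2:5), the base R sketch below checks that the row-wise apply and max.col agree, including on the tie in row 2.

```r
# d is reconstructed from the printed output; this is an assumption
d <- data.frame(a = 1:4, b = 4:1, c = 2:5)

# row by row: position of the first maximum in each row
by_apply <- unname(apply(d, 1, which.max))

# vectorised equivalent; "first" breaks ties in favour of the earliest column
by_maxcol <- max.col(as.matrix(d), ties.method = "first")

print(by_apply)
print(by_maxcol)
```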

Apply a function returning a data frame to each row in a data frame

You haven't shown what f contains, but based on the comments it is written for data frames, so this should work:

lapply(split(d, seq_len(nrow(d))), f)

split breaks d into a list of one-row data frames, and lapply then applies f to each of them.

You can also use by:

by(d, seq_len(nrow(d)), f)
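
Since neither d nor f is shown, here is a minimal sketch with made-up stand-ins for both. It also binds the per-row results back into a single data frame with do.call(rbind, ...), which is one common follow-up step and not something the original answer prescribes.

```r
# d and f are hypothetical stand-ins for the ones in the question
d <- data.frame(id = 1:3, value = c(10, 20, 30))
f <- function(row) data.frame(id = row$id, doubled = row$value * 2)

# one one-row data frame per row of d, each passed through f
pieces <- lapply(split(d, seq_len(nrow(d))), f)

# stack the per-row results into one data frame
res <- do.call(rbind, pieces)
print(res)
```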

Call a custom function on each row of dataframe with multiple arguments from each row

In R, and especially in your case, you can make use of vectorised functions. They work on the complete vector, so you don't have to apply the function separately for every row, but can directly supply the complete columns:

df <- data.frame(Name=c('John','Tom','Sarah'), Quantity=c(3,4,5), Price=c(5,6,7))

my_vectorised_fun <- function(name, quantity, price) {
  sales <- quantity * price

  # set sales to NA for every name other than John or Tom
  index_names <- !name %in% c("John", "Tom")
  sales[index_names] <- NA

  sales
}

library(dplyr)
df %>%
mutate(Sales = my_vectorised_fun(Name, Quantity, Price))
#>    Name Quantity Price Sales
#> 1  John        3     5    15
#> 2   Tom        4     6    24
#> 3 Sarah        5     7    NA

Created on 2021-02-19 by the reprex package (v0.3.0)
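
For comparison, the same result can be sketched in base R without dplyr. This assumes the same df and the same rule (only John and Tom get a sales figure) and uses ifelse, which is itself vectorised.

```r
df <- data.frame(Name = c("John", "Tom", "Sarah"),
                 Quantity = c(3, 4, 5),
                 Price = c(5, 6, 7))

# ifelse evaluates the condition for all rows at once
df$Sales <- ifelse(df$Name %in% c("John", "Tom"), df$Quantity * df$Price, NA)
print(df)
```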



Edit

Here is a version where you pass the complete .data pronoun to the function and refer to the column names only inside the function:

df <- data.frame(Name=c('John','Tom','Sarah'), Quantity=c(3,4,5), Price=c(5,6,7))

my_vectorised_fun <- function(all_data) {
  sales <- all_data[["Quantity"]] * all_data[["Price"]]

  # set sales to NA for every name other than John or Tom
  index_names <- !all_data[["Name"]] %in% c("John", "Tom")
  sales[index_names] <- NA

  sales
}

library(dplyr)
df %>%
mutate(Sales = my_vectorised_fun(.data))
#>    Name Quantity Price Sales
#> 1  John        3     5    15
#> 2   Tom        4     6    24
#> 3 Sarah        5     7    NA



