Call apply-like function on each row of dataframe with multiple arguments from each row
You can call apply on a subset of the original data:
dat <- data.frame(x=c(1,2), y=c(3,4), z=c(5,6))
apply(dat[,c('x','z')], 1, function(x) sum(x) )
Or, if your function is just sum, use the vectorized version:
rowSums(dat[,c('x','z')])
[1] 6 8
If you want to use testFunc
testFunc <- function(a, b) a + b
apply(dat[,c('x','z')], 1, function(x) testFunc(x[1],x[2]))
EDIT: To access columns by name rather than by index, you can do something like this:
testFunc <- function(a, b) a + b
apply(dat[,c('x','z')], 1, function(y) testFunc(y['z'],y['x']))
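As a sketch of an alternative that the answer above does not mention, base R's mapply can pass the columns to testFunc element by element, which avoids converting the rows to a matrix first. The data and testFunc are the ones defined above.

```r
dat <- data.frame(x = c(1, 2), y = c(3, 4), z = c(5, 6))
testFunc <- function(a, b) a + b

# mapply walks dat$x and dat$z in parallel, calling testFunc(x[i], z[i])
res <- mapply(testFunc, dat$x, dat$z)
print(res)
```

Because testFunc here is plain addition, testFunc(dat$x, dat$z) would give the same values directly; mapply is only needed when the function is not vectorised.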
How to apply function to each row of dataframe in R?
You can use apply on the rows (MARGIN = 1).
apply(df, MARGIN = 1, function(x) concord(sourcevar = x[3], origin = x[1], destination = "HS4", dest.digit = x[2], all = F))
However, this does not work because there is no conversion dictionary between "HS4" and "HS4", so call apply only on the rows that are not HS4:
df$New <- df$Code
df[df$Model != "HS4", ]$New <- apply(df[df$Model != "HS4", ], 1, \(x) concord(sourcevar = x[colnames(df) == "Code"],
origin = x[colnames(df) == "Model"], destination = "HS4",
dest.digit = x[colnames(df) == "Length"], all = F))
Model Length Code New
1 HS5 6 030299 030289
2 HS5 6 010121 010121
3 HS5 6 030448 030449
4 HS4 6 030324 030324
Using the apply function on rows of a data.frame in R
The issue is that the string columns are of class factor: in R versions before 4.0.0, data.frame defaults to stringsAsFactors = TRUE, and a factor is coerced to its integer codes when we paste across columns. To avoid this behavior, use
df <- data.frame(x=c(1,2,3),y=c('b','a','c'), stringsAsFactors = FALSE)
paste(df[1,],collapse=":")
#[1] "1:b"
With apply, the data frame is first converted to a matrix. A matrix can hold only a single type, so the numeric column is converted to character whenever any column is character, following the usual coercion precedence.
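The coercion described above can be observed directly. This is a minimal sketch using the same df as above; the variable names m, row_classes and pasted are only for illustration.

```r
df <- data.frame(x = c(1, 2, 3), y = c("b", "a", "c"), stringsAsFactors = FALSE)

# apply() works on as.matrix(df), which has a single type for all cells
m <- as.matrix(df)
print(typeof(m))

# Every row reaches the function as a character vector, even the numeric column
row_classes <- apply(df, 1, function(row) class(row))
print(row_classes)

# Pasting across columns row by row therefore operates on strings
pasted <- apply(df, 1, paste, collapse = ":")
print(pasted)
```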
R apply() custom function to every row in data frame
Another approach is modifying your existing function such that it is vectorised.
t.test2 <- function(m1,m2,s1,s2,n1,n2,m0=0,equal.variance=FALSE)
{
if(!equal.variance)
{
se <- sqrt( (s1^2/n1) + (s2^2/n2) )
# welch-satterthwaite df
df <- ( (s1^2/n1 + s2^2/n2)^2 )/( (s1^2/n1)^2/(n1-1) + (s2^2/n2)^2/(n2-1) )
} else
{
# pooled standard deviation, scaled by the sample sizes
se <- sqrt( (1/n1 + 1/n2) * ((n1-1)*s1^2 + (n2-1)*s2^2)/(n1+n2-2) )
df <- n1+n2-2
}
t <- (m1-m2-m0)/se
dat <- vapply(seq_len(length(m1)),
function(x){c(m1[x]-m2[x], se[x], t[x], 2*pt(-abs(t[x]),df[x]))},
numeric(4)) # 2*pt(-abs(t), df) is the two-tailed p-value
dat <- t(dat)
dat <- as.data.frame(dat)
names(dat) <- c("Difference of means", "Std Error", "t", "p-value")
return(dat)
}
This approach allows you to pass in vectors for your various inputs, and it returns a data frame with one row per input element. It uses the vapply function to return a vector of length 4 for each element provided.
Under this approach, you can simply call
t.test2(MPAmeans$reference_mean, MPAmeans$MPA_mean, MPAmeans$sd_reference, MPAmeans$sd_MPA, MPAmeans$n_reference, MPAmeans$n_MPA)
(or whatever you end up calling your variables)
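Since the arithmetic operators and pt are themselves vectorised, the same Welch calculation can be sketched directly on whole vectors. The input values below are made up for illustration, and this is a simplified sketch of the unequal-variance branch only, not a replacement for the function above.

```r
# Hypothetical summary statistics for two comparisons
m1 <- c(10, 12); m2 <- c(9, 15)
s1 <- c(2, 3);   s2 <- c(2.5, 2)
n1 <- c(30, 40); n2 <- c(25, 35)

# Standard error and Welch-Satterthwaite degrees of freedom, element-wise
se <- sqrt(s1^2 / n1 + s2^2 / n2)
df <- (s1^2 / n1 + s2^2 / n2)^2 /
  ((s1^2 / n1)^2 / (n1 - 1) + (s2^2 / n2)^2 / (n2 - 1))

t_stat <- (m1 - m2) / se
p <- 2 * pt(-abs(t_stat), df)  # two-tailed p-value for each comparison

res <- data.frame(diff = m1 - m2, se = se, t = t_stat, p = p)
print(res)
```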
Apply a function to each row in a data frame in R
You want apply (see the docs for it). apply(var,1,fun) will apply to rows, apply(var,2,fun) will apply to columns.
> apply(a,1,min)
[1] 1 0 3
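The difference between the two margins can be sketched on a small matrix. The matrix m below is made up for illustration and is not the a from the answer above.

```r
# 2 x 3 matrix filled column by column: rows are (1, 3, 5) and (2, 4, 6)
m <- matrix(1:6, nrow = 2)

row_min <- apply(m, 1, min)  # one value per row
col_min <- apply(m, 2, min)  # one value per column

print(row_min)
print(col_min)
```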
Apply a function to every row of a matrix or a data frame
You simply use the apply() function:
R> M <- matrix(1:6, nrow=3, byrow=TRUE)
R> M
[,1] [,2]
[1,] 1 2
[2,] 3 4
[3,] 5 6
R> apply(M, 1, function(x) 2*x[1]+x[2])
[1] 4 10 16
This takes a matrix and applies a (silly) function to each row. You pass extra arguments to the function as fourth, fifth, ... arguments to apply().
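Passing extra arguments through apply() might look like the following sketch. The function weighted and its parameters w1 and w2 are hypothetical names introduced for illustration.

```r
M <- matrix(1:6, nrow = 3, byrow = TRUE)

# Hypothetical row function with two extra parameters
weighted <- function(x, w1, w2) w1 * x[1] + w2 * x[2]

# Everything after FUN is forwarded to weighted() on every call
res <- apply(M, 1, weighted, w1 = 2, w2 = 1)
print(res)
```

With w1 = 2 and w2 = 1 this reproduces the 2*x[1]+x[2] calculation from the session above.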
Apply function to a row in a data.frame using dplyr
We just need the data to be specified as ., because a data.frame is a list with the columns as list elements. If we wrap it as list(.), it becomes a nested list.
library(dplyr)
library(purrr) # provides pmap_int
d %>%
mutate(u = pmap_int(., ~ which.max(c(...))))
# a b c u
#1 1 4 2 2
#2 2 3 3 2
#3 3 2 4 3
#4 4 1 5 3
Or use cur_data() (deprecated since dplyr 1.1.0 in favour of pick()):
d %>%
mutate(u = pmap_int(cur_data(), ~ which.max(c(...))))
Or, if we want to use everything(), place it inside select, because list(everything()) doesn't specify the data from which everything should be selected:
d %>%
mutate(u = pmap_int(select(., everything()), ~ which.max(c(...))))
Or using rowwise
d %>%
rowwise %>%
mutate(u = which.max(cur_data())) %>%
ungroup
# A tibble: 4 x 4
# a b c u
# <int> <int> <int> <int>
#1 1 4 2 2
#2 2 3 3 2
#3 3 2 4 3
#4 4 1 5 3
Or this is more efficient with max.col
max.col(d, 'first')
#[1] 2 2 3 3
Or with collapse
library(collapse)
dapply(d, which.max, MARGIN = 1)
#[1] 2 2 3 3
The max.col version can also be used inside a dplyr pipeline:
d %>%
mutate(u = max.col(cur_data(), 'first'))
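For comparison, here is a base R sketch of the same row-wise "which column is largest" calculation. The data frame d is not shown in the answer, so the one below is reconstructed from the printed output and should be treated as an assumption.

```r
# Assumed input, inferred from the printed results above
d <- data.frame(a = 1:4, b = 4:1, c = 2:5)

# Row-wise apply: index of the first maximum in each row
u_apply <- unname(apply(d, 1, which.max))

# Vectorised equivalent; 'first' breaks ties by taking the first column
u_maxcol <- max.col(as.matrix(d), ties.method = "first")

d$u <- u_maxcol
print(d)
```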
Apply a function returning a data frame to each row in a data frame
You haven't shown what you have in f, but based on the comments it is written for data frames, so this should work:
lapply(split(d, seq_len(nrow(d))), f)
split divides d into one-row data frames, and lapply then applies the function f to each of them.
You can also use by:
by(d, seq_len(nrow(d)), f)
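Since f is not shown, the following sketch uses a hypothetical f and a made-up d to illustrate the split/lapply pattern, and adds one common follow-up step: binding the per-row results back into a single data frame.

```r
d <- data.frame(id = 1:3, value = c(10, 20, 30))

# Hypothetical f: takes a one-row data frame and returns a data frame
f <- function(row_df) {
  data.frame(id = row_df$id, doubled = row_df$value * 2)
}

# One list element per row of d, each processed by f
pieces <- lapply(split(d, seq_len(nrow(d))), f)

# Stack the per-row results into a single data frame
result <- do.call(rbind, pieces)
print(result)
```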
Call a custom function on each row of dataframe with multiple arguments from each row
In R, and especially in your case, you can make use of vectorised functions. They work on the complete vector, so you don't have to apply the function separately for every row, but can directly supply the complete columns:
df <- data.frame(Name=c('John','Tom','Sarah'), Quantity=c(3,4,5), Price=c(5,6,7))
my_vectorised_fun <- function(name, quantity, price) {
sales <- quantity * price
# check for which the name doesn't fit
index_names <- !name %in% c("John", "Tom")
sales[index_names] <- NA
sales
}
library(dplyr)
df %>%
mutate(Sales = my_vectorised_fun(Name, Quantity, Price))
#> Name Quantity Price Sales
#> 1 John 3 5 15
#> 2 Tom 4 6 24
#> 3 Sarah 5 7 NA
Created on 2021-02-19 by the reprex package (v0.3.0)
Edit
Here is a version where you pass the complete .data pronoun to the function and only have to specify the names in the function:
df <- data.frame(Name=c('John','Tom','Sarah'), Quantity=c(3,4,5), Price=c(5,6,7))
my_vectorised_fun <- function(all_data) {
sales <- all_data[["Quantity"]] * all_data[["Price"]]
# check for which the name doesn't fit
index_names <- !all_data[["Name"]] %in% c("John", "Tom")
sales[index_names] <- NA
sales
}
library(dplyr)
df %>%
mutate(Sales = my_vectorised_fun(.data))
#> Name Quantity Price Sales
#> 1 John 3 5 15
#> 2 Tom 4 6 24
#> 3 Sarah 5 7 NA
Created on 2021-02-19 by the reprex package (v0.3.0)
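The first version of my_vectorised_fun does not depend on dplyr. As a minimal base R sketch, the same column can be added using with(), which is a substitution for the mutate() call rather than the answer's own method.

```r
df <- data.frame(Name = c("John", "Tom", "Sarah"),
                 Quantity = c(3, 4, 5),
                 Price = c(5, 6, 7))

my_vectorised_fun <- function(name, quantity, price) {
  sales <- quantity * price
  # blank out sales for names outside the allowed set
  sales[!name %in% c("John", "Tom")] <- NA
  sales
}

# with() evaluates the call using df's columns as variables
df$Sales <- with(df, my_vectorised_fun(Name, Quantity, Price))
print(df)
```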