Counting number of instances of a condition per row R
You can use rowSums.
df$no_calls <- rowSums(df == "nc")
df
# rsID sample1 sample2 sample3 sample1304 no_calls
#1 abcd aa bb nc nc 2
#2 efgh nc nc nc nc 4
#3 ijkl aa ab aa nc 1
Or, as pointed out by MrFlick, to exclude the first column from the row sums, you can slightly modify the approach to
df$no_calls <- rowSums(df[-1] == "nc")
Regarding the row names: they are not counted by rowSums, and a simple test demonstrates this:
rownames(df)[1] <- "nc" # name first row "nc"
rowSums(df == "nc") # compute the row sums
#nc 2 3
# 2 4 1 # still the same in first row
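One caveat the test above does not cover is missing data. The sketch below uses a small made-up data frame (the name calls and its values are hypothetical) to show that a comparison against NA yields NA, so rowSums needs na.rm = TRUE to still return a count for that row.

```r
# Hypothetical data: one genotype call is missing (NA)
calls <- data.frame(
  rsID    = c("abcd", "efgh"),
  sample1 = c("aa", NA),
  sample2 = c("nc", "nc"),
  stringsAsFactors = FALSE
)

# NA == "nc" is NA, so the second row sum becomes NA
with_na <- rowSums(calls[-1] == "nc")

# na.rm = TRUE drops the NA comparisons before summing
without_na <- rowSums(calls[-1] == "nc", na.rm = TRUE)
```

Whether dropping the NA comparisons is appropriate depends on what a missing call should mean in your data.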
Count occurrences of value in a set of variables in R (per row)
Try
apply(df,MARGIN=1,table)
where df is your data.frame. This returns a list with one element per row of the data.frame, in the same order. Each element is a table whose names are the distinct values in that row and whose entries are their numbers of occurrences. Note that if every row produces a table of the same length, apply simplifies the result to a matrix instead of a list.
For instance:
df=data.frame(V1=c(10,20,10,20),V2=c(20,30,20,30),V3=c(20,10,20,10))
#create a data.frame containing some data
df #show the data.frame
V1 V2 V3
1 10 20 20
2 20 30 10
3 10 20 20
4 20 30 10
apply(df,MARGIN=1,table) #apply the function table on each row (MARGIN=1)
[[1]]
10 20
1 2
[[2]]
10 20 30
1 1 1
[[3]]
10 20
1 2
[[4]]
10 20 30
1 1 1
#desired result
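If a rectangular result is easier to work with than a list, one possible variation (a sketch, not part of the original answer) is to tabulate every row against the same set of levels. The names lv and counts are made up for this example.

```r
df <- data.frame(V1 = c(10, 20, 10, 20),
                 V2 = c(20, 30, 20, 30),
                 V3 = c(20, 10, 20, 10))

# All values that should get a count column, even when absent from a row
lv <- sort(unique(unlist(df)))

# Each row is tabulated against the same levels, so every table has
# length(lv) entries; apply returns them as columns, hence the t()
counts <- t(apply(df, 1, function(r) table(factor(r, levels = lv))))
```

Each row of counts then holds the number of occurrences of every level, with zeros for values that do not appear in that row.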
Counting number of rows if certain conditions are met
Try this:
library(dplyr)
df_count <- df %>% summarise(con1 = sum(B < 0 & C < 0),
con2 = sum(B > 0 & C > 0),
con3 = sum(B < 0 & C > 0),
con4 = sum(B > 0 & C < 0))
df_count
con1 con2 con3 con4
2 2 0 2
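The same counts can be obtained without dplyr. The sketch below uses made-up data, since the question's df is not shown here, so its numbers are not meant to match the output above.

```r
# Hypothetical data; the original question's df is not shown
df <- data.frame(B = c(-1, 2, -3, 4, 5, -6),
                 C = c(-2, 3, 1, -4, 6, -1))

# Summing a logical vector counts its TRUE values
df_count <- with(df, c(con1 = sum(B < 0 & C < 0),
                       con2 = sum(B > 0 & C > 0),
                       con3 = sum(B < 0 & C > 0),
                       con4 = sum(B > 0 & C < 0)))
```

Rows where B or C is exactly zero fall into none of the four conditions, so the counts need not add up to nrow(df) in general.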
count the number of columns for each row by condition on character and missing
You could use rowSums to count the number of NAs or empty values in each row and then subtract it from the number of columns in the data frame.
test$num <- ncol(test) - rowSums(is.na(test) | test == "")
test
# a b c d num
#1 aa aa aa 3
#2 bb <NA> bb 2
#3 cc aa <NA> 2
#4 dd <NA> <NA> 1
#5 cc cc 2
#6 <NA> dd dd dd 3
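An equivalent way to think about it is to count the filled cells directly. The sketch below uses a small made-up data frame (its values are hypothetical) and checks that both formulations agree.

```r
# Hypothetical data mixing NA and empty strings
test <- data.frame(a = c("aa", NA, "cc"),
                   b = c("", "bb", NA),
                   c = c("aa", "bb", ""),
                   stringsAsFactors = FALSE)

# Count the filled cells directly; FALSE & NA is FALSE, so NA cells
# never contribute to the sum
num_direct <- rowSums(!is.na(test) & test != "")

# The subtraction approach from the answer
num_subtract <- ncol(test) - rowSums(is.na(test) | test == "")
```

Both vectors are computed before any result column is added to test, so the new column is not itself counted.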
R function that counts rows where conditions are met
We can use rowSums: expand the vector c(1, 8, 4) so that it lines up with the 'Task' columns (one value per column), compare with ==, and take the rowSums of the result.
i1 <- startsWith(names(df1), 'Task')
df1$COUNT <- rowSums(df1[i1] == c(1, 8, 4)[col(df1[i1])])
df1$COUNT
#[1] 1 1 2 1 3
Or with sweep
rowSums(sweep(df1[i1], 2, c(1, 8, 4), `==`))
Or another option is apply
df1$COUNT <- apply(df1[i1], 1, function(x) sum(x == c(1, 8, 4)))
NOTE: None of the solutions require any external package
data
df1 <- data.frame(Participant = 1:5, Task1 = c(4, 3, 1, 5, 1),
Task2 = c(8, 8, 3, 6, 8), Task3 = c(1, 7, 4, 4, 4))
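The reason the vector has to be expanded is that R recycles a short vector down the columns of a matrix, not across its rows. A further variant, sketched here with the hypothetical names target, m and count_t, transposes the data so that this recycling does the right thing.

```r
df1 <- data.frame(Participant = 1:5, Task1 = c(4, 3, 1, 5, 1),
                  Task2 = c(8, 8, 3, 6, 8), Task3 = c(1, 7, 4, 4, 4))
target <- c(1, 8, 4)  # expected value for Task1, Task2, Task3

m <- as.matrix(df1[startsWith(names(df1), "Task")])

# t(m) has one column per participant; target is recycled down each
# column, so every participant is compared with the same three values
count_t <- colSums(t(m) == target)
```

This assumes the length of target equals the number of 'Task' columns.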
Count number of rows that fulfill multiple conditions in R
It depends on what your measure of efficiency is, but this works:
sum(df$Ethnicity== 'Asian' & df$Set == 3)
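A small sketch with made-up data (the values below are hypothetical; only the column names come from the answer) shows how this behaves and what happens when a value is missing.

```r
# Hypothetical data; column names follow the answer above
df <- data.frame(Ethnicity = c("Asian", "Asian", "Other", NA, "Asian"),
                 Set       = c(3, 1, 3, 3, 3),
                 stringsAsFactors = FALSE)

# NA == "Asian" is NA and NA & TRUE is NA, so the plain sum is NA here
n_plain <- sum(df$Ethnicity == "Asian" & df$Set == 3)

# na.rm = TRUE ignores rows whose condition cannot be evaluated
n_asian_set3 <- sum(df$Ethnicity == "Asian" & df$Set == 3, na.rm = TRUE)
```

If the columns contain no missing values, the two calls return the same number.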
R: count times per column a condition is met and row names appear in a list
We may do this with rowwise
library(dplyr)
df2 %>%
rowwise %>%
mutate(x = +(sum(df1[[rownames]][df1$rownames %in% x]) >= 5),
y = +(sum(df1[[rownames]][df1$rownames %in% y]) >= 5)) %>%
ungroup
-output
# A tibble: 3 × 5
rownames batch totalcount x y
<chr> <chr> <int> <int> <int>
1 sample1 a 10 1 0
2 sample2 b 15 1 1
3 sample3 a 6 0 1
Or based on the data, a base R option would be
out <- aggregate(. ~ grp, FUN = sum,
transform(df1, grp = c('x', 'y')[1 + (rownames %in% y)] )[-1])
df2[out$grp] <- +(t(out[-1]) >= 5)
-output
> df2
rownames batch totalcount x y
1 sample1 a 10 1 0
2 sample2 b 15 1 1
3 sample3 a 6 0 1
data
df1 <- structure(list(rownames = c("m1", "m2", "m3", "m4"), sample1 = c(0L,
1L, 6L, 3L), sample2 = c(5L, 7L, 2L, 1L), sample3 = c(1L, 5L,
0L, 0L)), class = "data.frame", row.names = c(NA, -4L))
df2 <- structure(list(rownames = c("sample1", "sample2", "sample3"),
batch = c("a", "b", "a"), totalcount = c(10L, 15L, 6L)),
class = "data.frame", row.names = c(NA,
-3L))
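The vectors x and y used in both solutions are not defined in the data above. As a sketch only, assuming they are character vectors naming rows of df1 (the values below are an assumption chosen for illustration, and flag is a hypothetical helper), the same flags can be computed with sapply:

```r
df1 <- data.frame(rownames = c("m1", "m2", "m3", "m4"),
                  sample1 = c(0L, 1L, 6L, 3L),
                  sample2 = c(5L, 7L, 2L, 1L),
                  sample3 = c(1L, 5L, 0L, 0L),
                  stringsAsFactors = FALSE)
df2 <- data.frame(rownames = c("sample1", "sample2", "sample3"),
                  batch = c("a", "b", "a"),
                  totalcount = c(10L, 15L, 6L),
                  stringsAsFactors = FALSE)

# Assumed group membership; not given in the original answer
x <- c("m1", "m3")
y <- c("m2", "m4")

# For one sample column, flag whether the rows in `members` sum to >= 5
flag <- function(s, members) +(sum(df1[[s]][df1$rownames %in% members]) >= 5)

df2$x <- unname(sapply(df2$rownames, flag, members = x))
df2$y <- unname(sapply(df2$rownames, flag, members = y))
```

With a different definition of x and y the flags will of course differ.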
Count occurrence of string values per row in dataframe in R (dplyr)
You can use across with rowSums -
library(dplyr)
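# a lambda is used because passing extra arguments through across() is deprecated since dplyr 1.1.0
df %>% mutate(d9 = rowSums(across(all_of(cols), ~ .x %in% bcde)))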
# d1 d2 d3 d4 d5 d6 d7 d8 d9
# <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
#1 b a a a a a a a 0
#2 a a a a c a a a 1
#3 a b a a a a a a 1
#4 a a c a a b a a 2
#5 a a a a a a a a 0
#6 a a b a a a a a 1
#7 a a a a a d a a 1
#8 a a a d a a a a 1
This can also be written in base R -
df$d9 <- rowSums(sapply(df[cols], `%in%`, bcde))
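Neither cols nor bcde is defined in the answer. The following self-contained sketch fills them in with made-up values (the data, cols and bcde are all assumptions) to show what the base R version computes.

```r
# Hypothetical data, columns, and lookup values
df <- data.frame(d1 = c("b", "a", "a"),
                 d2 = c("a", "c", "b"),
                 d3 = c("a", "d", "a"),
                 stringsAsFactors = FALSE)
cols <- c("d2", "d3")
bcde <- c("b", "c", "d", "e")

# sapply builds a logical matrix with one column per entry of cols
d9 <- rowSums(sapply(df[cols], `%in%`, bcde))

# Variant that adds the logical vectors column by column and therefore
# also works when df has a single row (sapply would return a vector)
d9_reduce <- Reduce(`+`, lapply(df[cols], `%in%`, bcde))
```

Here d1 is deliberately left out of cols, so the "b" in its first row is not counted.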
Count number of columns by a condition (>) for each row
This will give you the vector you are looking for:
rowSums(data > 30)
It will work whether data is a matrix or a data.frame. It also uses vectorized functions, which makes it preferable to apply, which is little more than a (slow) for loop.
If data is a data.frame, you can add the result as a column by doing:
data$yr.above <- rowSums(data > 30)
or if data is a matrix:
data <- cbind(data, yr.above = rowSums(data > 30))
You can also create a whole new data.frame:
data.frame(yr.above = rowSums(data > 30))
or a whole new matrix:
cbind(yr.above = rowSums(data > 30))
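As a quick check on made-up numbers (the data frame below is hypothetical), the vectorized call and the apply version return the same counts:

```r
# Hypothetical yearly values, one row per site
data <- data.frame(y2001 = c(25, 31, 40),
                   y2002 = c(35, 29, 45),
                   y2003 = c(10, 33, 50))

# Vectorized count of values above 30 in each row
above <- rowSums(data > 30)

# The apply version gives the same counts, one function call per row
above_apply <- apply(data > 30, 1, sum)
```

If data may contain NA, passing na.rm = TRUE to rowSums keeps those rows from turning into NA.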