R: How Many Elements Satisfy a Condition

R: how many elements satisfy a condition?

If z consists only of TRUE or FALSE values, then simply:

length(which(z))
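
For example, with a hypothetical logical vector z:

z <- c(TRUE, FALSE, TRUE, TRUE, NA)
length(which(z))   # 3 -- which() ignores the NA and counts only the TRUE positions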

How to count how many elements satisfy a condition in an idiomatic way?

Suggestion 1: A slightly more idiomatic way would be to replace

length(data[data <= myoffsets[i]])

with

sum(data <= myoffsets[i])

This way you don't end up taking a subset of data for each value in myoffsets, only to compute its length and discard it.
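
For example, with a made-up vector x standing in for data, both expressions give the same count, but sum() skips building the intermediate subset:

x <- c(3, 7, 12, 18, 25)
length(x[x <= 12])   # 3
sum(x <= 12)         # 3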

Suggestion 2: The c() in the for loop is redundant. The following does exactly the same with fewer keystrokes: for(i in 1:length(myoffsets)).

Lastly, if you prefer to get rid of the explicit loop, something like this might be to your taste:

myres$x <- sapply(myoffsets, function(o) sum(data <= o))
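
Putting it together, here is a minimal self-contained sketch; data, myoffsets and myres are made-up stand-ins for the objects in the original question:

set.seed(42)
data <- runif(1000)                       # made-up data
myoffsets <- c(0.25, 0.5, 0.75)           # made-up offsets
myres <- data.frame(offset = myoffsets, x = NA_real_)

# loop version, counting with sum() on the logical vector
for (i in 1:length(myoffsets)) {
  myres$x[i] <- sum(data <= myoffsets[i])
}

# equivalent vectorised version without the explicit loop
myres$x <- sapply(myoffsets, function(o) sum(data <= o))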

Summarize count with a condition

Just change the last line:

df %>%
  group_by(year) %>%
  summarize(number_quadrats = n(),                              # total number of rows per group
            average_count = mean(count_numeric, na.rm = TRUE),  # average value
            number_p = sum(count == "p"))                       # rows where count equals "p"

By summing a logical (boolean) vector, you are essentially counting the number of times the condition is met: each TRUE counts as 1 and each FALSE as 0.
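
For example, with made-up values:

count <- c("p", "3", "p", "1")
sum(count == "p")   # 2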

What's the fastest way to find the number of elements that satisfy a condition?

Firstly, pre-compute values you use more than once, for example, r**2. With r=1000, this made it about 30% faster for me (~1.1s → ~.85s total run time).

r_sq = r**2

Next, to save memory, don't build a list from a filter when all you need to know is its length. Instead, sum over a map or, better yet, a generator expression:

q = sum(x**2 + y**2 <= r_sq for x, y in quardrant)
return (4*q - 4*r + 4) / r_sq

This saves a bit of time by not constructing a list, but as a bonus, using unpacking instead of indexing also saves a surprising amount of time -- about 7% for me (~.74s → ~.69s total run time).


Next, coming back to the first point, if you think about it, you're getting x and y from a product, which means you're calculating the square of each number 0..r, 2*r times. It'd be faster to calculate the squares ahead of time.

from itertools import product

quardrant_sq = product((x**2 for x in range(r+1)), repeat=2)
q = sum(a + b <= r_sq for a, b in quardrant_sq)

This gives a massive improvement, about 250% faster! (~.66s → ~.19s total run time).


Lastly, since you're dealing with only numbers, you could look into using NumPy to further optimize your code.

Remove elements from a list by condition

We need to extract the column within the loop. LDF is a list of data.frames/tibbles, so LDF$Value doesn't exist:

i1 <- sapply(LDF, function(x) sum(x$Value)) > 0
LDF[i1]

-output

[[1]]
# A tibble: 18 x 2
   Date           Value
   <date>         <dbl>
 1 2021-05-18   120000
 2 2021-05-20    40000
 3 2021-05-31    55000
 4 2021-05-31      -11.4
 5 2021-06-01  -115092.
 6 2021-06-09    30000
 7 2021-06-17    98400
 8 2021-07-01     1720
 9 2021-07-01    50000
10 2021-07-01   -50063.
11 2021-07-12    -2503.
12 2021-07-13   -20022.
13 2021-08-09    28619.
14 2021-08-25    45781.
15 2021-09-01    14954.
16 2021-09-10    -6017.
17 2021-09-15    -3311.
18 2021-09-16  -140373.

To check the elements that are deleted, negate (!) the logical vector:

which(!i1)

gives the positions, and

LDF[!i1]

returns the dropped elements.

Or use Filter as well:

Filter(\(x) sum(x$Value) > 0, LDF)

Or with keep from purrr

library(purrr)
keep(LDF, ~ sum(.x$Value) > 0)

Or the opposite, discard, returns the elements that would be dropped:

discard(LDF, ~ sum(.x$Value) > 0)
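
As a self-contained illustration, here is a minimal sketch with a made-up LDF of two small tibbles (values invented, not the data from the question):

library(tibble)
LDF <- list(
  tibble(Date = as.Date(c("2021-05-18", "2021-06-01")), Value = c(100, -20)),
  tibble(Date = as.Date(c("2021-07-01", "2021-07-02")), Value = c(-50, 10))
)

i1 <- sapply(LDF, function(x) sum(x$Value)) > 0
LDF[i1]       # keeps only the first tibble (sum = 80)
which(!i1)    # 2 -- the position of the element that is dropped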

Return table with count of elements that match a condition

Subset the data for the "PASS" value and then use table:

temp <- subset(df, Outcome == 'PASS')
table(temp$Participant, temp$Trial)

#       T01 T02 T03
#   P01   2   0   1
#   P02   1   2   1
#   P03   0   2   1
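
As a self-contained sketch, assuming a data frame shaped like the one in the question (values made up, so the counts differ from the table above):

df <- data.frame(
  Participant = c("P01", "P01", "P02", "P02", "P03"),
  Trial       = c("T01", "T02", "T01", "T02", "T01"),
  Outcome     = c("PASS", "FAIL", "PASS", "PASS", "FAIL")
)

temp <- subset(df, Outcome == 'PASS')
table(temp$Participant, temp$Trial)

#       T01 T02
#   P01   1   0
#   P02   1   1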

