Create Counter of Consecutive Runs of a Certain Value

Create counter within consecutive runs of certain values

Here's a way, building on Joshua's rle approach: (EDITED to use seq_len and lapply as per Marek's suggestion)

> (!x) * unlist(lapply(rle(x)$lengths, seq_len))
 [1] 0 1 0 1 2 3 0 0 1 2
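
The input x is not shown above. As a minimal sketch, the vector below is an assumption, chosen only because it reproduces the printed output; the intermediate names (runs, pos_in_run, counter) are likewise illustrative:

```r
# Hypothetical input: 1 marks the values to ignore, 0 the values to count.
x <- c(1, 0, 1, 0, 0, 0, 1, 1, 0, 0)

# Run lengths of identical consecutive values.
runs <- rle(x)$lengths

# Position of every element within its own run (1, 2, 3, ...).
pos_in_run <- unlist(lapply(runs, seq_len))

# !x is TRUE where x is 0, so multiplying zeroes out the runs of 1s.
counter <- (!x) * pos_in_run
counter
```

The multiplication works because a logical vector is coerced to 0/1 in arithmetic, so only the positions inside runs of zeros keep their counter.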

UPDATE. Just for kicks, here's another way to do it, around 5 times faster:

cumul_zeros <- function(x)  {
  x <- !x
  rl <- rle(x)
  len <- rl$lengths
  v <- rl$values
  cumLen <- cumsum(len)
  z <- x
  # replace the 0 at the end of each zero-block in z by the 
  # negative of the length of the preceding 1-block....
  iDrops <- c(0, diff(v)) < 0
  z[ cumLen[ iDrops ] ] <- -len[ c(iDrops[-1],FALSE) ]
  # ... to ensure that the cumsum below does the right thing.
  # We zap the cumsum with x so only the cumsums for the 1-blocks survive:
  x*cumsum(z)
}

Try an example:

> cumul_zeros(c(1,1,1,0,0,0,0,0,1,1,1,0,0,1,1))
 [1] 0 0 0 1 2 3 4 5 0 0 0 1 2 0 0

Now compare times on a million-length vector:

> x <- sample(0:1, 1000000, replace = TRUE)
> system.time( z <- cumul_zeros(x))
   user  system elapsed 
   0.15    0.00    0.14 
> system.time( z <- (!x) * unlist( lapply( rle(x)$lengths, seq_len)))
   user  system elapsed 
   0.75    0.00    0.75

Moral of the story: one-liners are nicer and easier to understand, but not always the fastest!

Create counter within consecutive runs of values

You need to use sequence and rle:

> sequence(rle(as.character(dataset$input))$lengths)
 [1] 1 1 2 1 2 1 1 2 3 4 1 1

Create counter of consecutive runs of a certain value

SOG <- c(4,4,0,0,0,3,4,5,0,0,1,2,0,0,0)
#run length encoding:
tmp <- rle(SOG)
#turn values into logicals
tmp$values <- tmp$values == 0
#cumulative sum of TRUE values
tmp$values[tmp$values] <- cumsum(tmp$values[tmp$values])
#inverse the run length encoding
inverse.rle(tmp)
#[1] 0 0 1 1 1 0 0 0 2 2 0 0 3 3 3

Count and Assign Consecutive Occurrences of Variable

You can repeat each run length as many times as the run is long, using rle and rep:

with(rle(dataset$input), rep(lengths, lengths))
#[1] 1 2 2 2 2 1 4 4 4 4 1 1

Using dplyr, we can use lag to create groups and then count the number of rows in each group.

library(dplyr)

dataset %>%
  group_by(gr = cumsum(input != lag(input, default = first(input)))) %>%
  mutate(count = n())

And with data.table:

library(data.table)
setDT(dataset)[, count:= .N, rleid(input)]

data

Make sure the input column is character and not factor.

dataset <- data.frame(input = c("a","b","b","a","a","c","a","a","a","a","b","c"),
           stringsAsFactors = FALSE)
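
To tie the base R pieces together, here is a self-contained sketch that runs both rle-based approaches on this data: sequence() gives the position within each run, and rep() gives the length of the run each row belongs to. The variable names within_run and run_length are only illustrative.

```r
dataset <- data.frame(input = c("a","b","b","a","a","c","a","a","a","a","b","c"),
                      stringsAsFactors = FALSE)

r <- rle(dataset$input)

# Counter that restarts at 1 whenever the value changes.
within_run <- sequence(r$lengths)

# Length of the run that each element belongs to.
run_length <- rep(r$lengths, r$lengths)

within_run
run_length
```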

Count consecutive occurrences of a specific value in every row of a data frame in R

You've identified the two forms the longest run can take: (1) somewhere in the middle of a row, or (2) split between the end and the beginning of a row. Hence you want to calculate each case and take the max, like so:

df<-cbind(
Winter=c(0,0,3),
Spring=c(0,2,4),
Summer=c(0,2,7),
Autumn=c(3,0,4))

#>      Winter Spring Summer Autumn
#> [1,]      0      0      0      3
#> [2,]      0      2      2      0
#> [3,]      3      4      7      4


# calculate the number of consecutive zeros at the start and end
startZeros  <-  apply(df,1,function(x)which.min(x==0)-1)
#> [1] 3 1 0
endZeros  <-  apply(df,1,function(x)which.min(rev(x==0))-1)
#> [1] 0 1 0

# calculate the longest run of zeros
longestRun  <-  apply(df,1,function(x){
                y = rle(x);
                max(y$lengths[y$values==0],0)})
#> [1] 3 1 0

# take the max of the two values
pmax(longestRun,startZeros +endZeros  )
#> [1] 3 2 0

Of course an even easier solution is:

longestRun  <-  apply(cbind(df,df),# tricky way to wrap the zeros from the start to the end
                      1,# the margin over which to apply the summary function
                      function(x){# the summary function
                          y = rle(x);
                          max(y$lengths[y$values==0],
                              0)#include zero in case there are no zeros in y$values
                      })

Note that the above solution works because my df does not include the location field (column).
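
One edge case worth checking with the cbind(df, df) trick: a row made up entirely of zeros is duplicated into a run twice as long as the row itself. A minimal sketch that caps the result at the number of columns follows; the function name longest_wrapped_run and the extra all-zero fourth row are assumptions added for illustration, not part of the original data.

```r
# Same data as above, plus a hypothetical all-zero fourth row.
df <- cbind(Winter = c(0, 0, 3, 0),
            Spring = c(0, 2, 4, 0),
            Summer = c(0, 2, 7, 0),
            Autumn = c(3, 0, 4, 0))

longest_wrapped_run <- function(m, value = 0) {
  apply(cbind(m, m), 1, function(x) {
    y <- rle(x)
    # Longest run of `value` in the doubled row, or 0 if there is none ...
    longest <- max(y$lengths[y$values == value], 0)
    # ... capped at the row length so an all-`value` row is not counted twice.
    min(longest, ncol(m))
  })
}

longest_wrapped_run(df)
```

Without the cap, the fourth row would report 8 rather than 4.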

R: count consecutive occurrences of values in a single column and by group

Use rleid (from the data.table package) to get a grouping variable and then use ave to apply seq_along within common values of that grouping:

library(data.table)
transform(dataset, Counter = ave(YesNO, rleid(ID, YesNO), FUN = seq_along))

giving:

   ID YesNO Counter
1   a     1       1
2   a     1       2
3   a     0       1
4   a     0       2
5   a     0       3
6   a     1       1
7   a     1       2
8   b     1       1
9   b     1       2
10  b     1       3
11  b     0       1
12  b     0       2
13  b     0       3
14  b     0       4
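
If you would rather avoid the data.table dependency, the run id can be sketched in base R as well. The input data is not shown above, so the data frame below is reconstructed from the printed result and should be treated as an assumption; key and run_id are illustrative names.

```r
# Reconstructed from the printed output above (assumption).
dataset <- data.frame(ID    = rep(c("a", "b"), each = 7),
                      YesNO = c(1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0))

# A new run starts whenever the (ID, YesNO) pair differs from the previous row.
key    <- paste(dataset$ID, dataset$YesNO)
run_id <- cumsum(c(TRUE, key[-1] != key[-length(key)]))

# Count up within each run.
dataset$Counter <- ave(dataset$YesNO, run_id, FUN = seq_along)
dataset
```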

Count consecutive occurrences of an element in string

To put the pieces together: here's a combination of my comment on your previous question and (parts of) my answer here: Count consecutive TRUE values within each block separately. The convenience functions rleid and rowid from the data.table package are used.

Toy data with two strings of different length:

s <- c("a > a > b > b > b > a > b > b", "c > c > b > b > b > c > c")

library(data.table)
lapply(strsplit(s, " > "), function(x) paste0(x, rowid(rleid(x)), collapse = " > "))
# [[1]]
# [1] "a1 > a2 > b1 > b2 > b3 > a1 > b1 > b2"
# 
# [[2]]
# [1] "c1 > c2 > b1 > b2 > b3 > c1 > c2"
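
For comparison, a base R sketch of the same idea: sequence(rle(x)$lengths) yields the position within each run, which is what rowid(rleid(x)) computes here. The helper name label_runs is hypothetical.

```r
s <- c("a > a > b > b > b > a > b > b", "c > c > b > b > b > c > c")

# Append to each element its position within its run of identical values.
label_runs <- function(x) {
  paste0(x, sequence(rle(x)$lengths), collapse = " > ")
}

res <- lapply(strsplit(s, " > ", fixed = TRUE), label_runs)
res
```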

Calculate maximum length of consecutive values in row over a set number

# condition is that x should be larger or equal to 3
condition <- function(x) x >= 3

# example row
row = c(2,4,3,3,4,5,1,0,5,1)

# we can use condition on row:
condition(row)

# and we can employ rle on that:
rle(condition(row))

# we need to filter those rle results for TRUE:
r <- rle(condition(row))
r$lengths[r$values == TRUE]

# The answer is the max of the latter
max(r$lengths[r$values])

Or, for your data frame example:

# condition is that x should be larger or equal to 3
condition <- \(x) x >= 3

 
number <- function(row, condition){
  r <- row |>
         condition() |>
         rle()
  max(r$lengths[r$values])
}

df <- replicate(10, sample(0:5, 10, replace = TRUE))
apply(df, 1, number, condition)
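
One thing to watch for: if no value in a row meets the condition, the subset is empty and max() returns -Inf with a warning. A minimal sketch that returns 0 in that case instead; the function name longest_run is hypothetical, and returning 0 is a design choice rather than the only sensible answer.

```r
condition <- function(x) x >= 3

longest_run <- function(row, condition) {
  r <- rle(condition(row))
  # The extra 0 gives max() something to return when no run is TRUE.
  max(r$lengths[r$values], 0)
}

longest_run(c(2, 4, 3, 3, 4, 5, 1, 0, 5, 1), condition)  # 5
longest_run(c(0, 1, 2), condition)                        # 0
```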

Count the rows in a data table where a condition has been met consecutively

Counting consecutive occurrences (i.e. run length) of `b` for each `ID` through specified `update_date`

DT[order(ID, update_date), occurence := 1:.N, by = list(ID, rleid(b))]
DT
#>     update_date   ID  b occurence
#>  1:  2022-01-01 aapl U1         1
#>  2:  2022-01-02 aapl U1         2
#>  3:  2022-01-03 aapl U1         3
#>  4:  2022-01-04 aapl U2         1
#>  5:  2022-01-05 aapl U2         2
#>  6:  2022-01-06 aapl U2         3
#>  7:  2022-01-01  ibm D1         1
#>  8:  2022-01-02  ibm D2         1
#>  9:  2022-01-03  ibm D1         1
#> 10:  2022-01-04  ibm D3         1
#> 11:  2022-01-05  ibm D2         1
#> 12:  2022-01-06  ibm D3         1

Counting occurrences of `b` for each `ID` through specified `update_date`

This includes occurrences that are non-consecutive.

#  Count of occurrences through present row
DT[order(ID, b, update_date), occurence := 1:.N, by = list(ID, b)]
DT
#>     update_date   ID  b occurence
#>  1:  2022-01-01 aapl U1         1
#>  2:  2022-01-02 aapl U1         2
#>  3:  2022-01-03 aapl U1         3
#>  4:  2022-01-04 aapl U2         1
#>  5:  2022-01-05 aapl U2         2
#>  6:  2022-01-06 aapl U2         3
#>  7:  2022-01-01  ibm D1         1
#>  8:  2022-01-02  ibm D2         1
#>  9:  2022-01-03  ibm D1         2
#> 10:  2022-01-04  ibm D3         1
#> 11:  2022-01-05  ibm D2         2
#> 12:  2022-01-06  ibm D3         2
