Cumulative Minimum Value by Group

Cumulative minimum value by group

We could use cummin function by group

data$output <- with(data, ave(Target, Group, FUN = cummin))

data
#  Group Target output
#1     A      1      1
#2     A      0      0
#3     A      5      0
#4     A      0      0
#5     B      3      3
#6     B      5      3
#7     B      1      1
#8     B      3      1

whose dplyr and data.table equivalents are

library(dplyr)
data %>%
  group_by(Group) %>%
  mutate(output = cummin(Target))

library(data.table)
setDT(data)[, output := cummin(Target), by = (Group)]

Grouping by sequence and then finding minimum value in a column

We create the grouping variable by taking the cumulative sum where sequence is 1, then filter only the groups with 9 elements, and slice the rows where the 'low' is minimum after arrangeing the 'date' in descending order to take care of cases where there are ties for the 'low'est value

df %>% 
   group_by(group = cumsum(sequence == 1)) %>% 
   filter(n() == 9) %>% 
   select(date, low) %>%
   arrange(desc(date)) %>%
   slice(which.min(low)) %>%
   ungroup %>%
   select(-group)
# A tibble: 3 x 2
#  date         low
#  <date>     <dbl>
#1 2019-01-04    18
#2 2019-01-14     1
#3 2019-02-03     9

Or similar option with data.table

library(data.table)
setDT(df)[, .SD[.N == 9], .(group = cumsum(sequence == 1))
          ][order(-date), .SD[which.min(low)], group]

Find the lowest value and its index until certain row within group

You can use cummin to get minimum value until that row and use match to get the index.

library(dplyr)
a %>%
  group_by(subject) %>%
  mutate(min = cummin(value), 
         visit_min = match(min, unique(value)))

#   subject visit value   min visit_min
#     <dbl> <dbl> <dbl> <dbl>     <int>
# 1       1     1   100   100         1
# 2       1     2    97    97         2
# 3       1     3   120    97         2
# 4       1     4    84    84         4
# 5       1     5    60    60         5
# 6       1     6   150    60         5
# 7       1     7   150    60         5
# 8       2     1   160   160         1
# 9       2     2   100   100         2
#10       2     3    70    70         3
#11       2     4    40    40         4
#12       2     5   120    40         4

How Do I Find the Minimum Value of a Running Total by Group In SQL?

Syntactically, your first query cannot have been compilable in MS Access as the RunningTotal correlated aggregate expressoin should be included in GROUP BY clause. Practically though, correlated subqueries may not be allowed in GROUP BY and you also do not want to group by this rank calculation during aggregation. Also, in MS Access running complex operations on an underlying UNION query does yield performance issues.

Consider the following setup:

Convert underlying union query to temp table:
```
SELECT * INTO mytemptable FROM myUnionQuery
```

Calculate running total (quasi aggregation).

SELECT t.[ItemCode], t.[WarehouseCode], t.[Date], 
       (SELECT SUM(sub.[Qty]) FROM myTempTable sub
        WHERE sub.[ItemCode] = t.[ItemCode] 
          AND sub.[WarehouseCode] = t.[WarehouseCode]
          AND sub.[Date] <= t.[Date]) AS RunningQtyTotal
FROM myTempTable t;

Aggregate at [ItemCode] and [WarehouseCode] levels:

SELECT q.[ItemCode], q.[WarehouseCode], MIN(q.[RunningQtyTotal]) AS MinRunQtyTotal
FROM mySelectQuery q
GROUP BY q.[ItemCode], q.[WarehouseCode]

How to find the minimum value in between two values using R

You can use cummax to calculate cumulative maximum until that row and create a new group whenever the current row's value is greater than previous cummax value. Within each group you can return the minimum value.

library(dplyr)

df %>%
  group_by(group = cumsum(V1 > lag(cummax(V1), default = first(V1)))) %>%
  summarise(min_value = min(V1))

#  group min_value
#  <int>     <int>
#1     0      -444
#2     1        -6
#3     2       352
#4     3       472

This considers the last part as another group hence also returns minimum value in that part. You can remove the last row if it is not needed.

To apply for multiple columns, we can write a function and call it with lapply :

apply_fun <- function(data, col) {

col1 <- sym(col)
df %>%
  group_by(group = cumsum(!!col1 > lag(cummax(!!col1), 
           default = first(!!col1)))) %>%
  summarise(min_value = min(!!col1))
}

result <- lapply(names(df), apply_fun, data = df)

data

df <- structure(list(V1 = c(0L, -80L, -160L, -160L, -160L, -160L, -160L, 
-347L, -351L, -351L, -444L, -444L, -444L, 43L, 43L, 10L, 10L, 
-6L, 20L, 352L, 352L, 352L, 528L, 528L, 511L, 511L, 518L, 472L
)), class = "data.frame", row.names = c(NA, -28L))

Cumulative minimum and maximum by group

[Original answer: dplyr]

You can solve this problem by using the dplyr package:

library(dplyr)
d %>% 
  group_by(Key) %>% 
  mutate(Pmin=cummin(PriceEUR),Pmax=cummax(PriceEUR))

#   Key DaysToEvent PriceEUR Pmin Pmax
# 1 AAA         120       50   50   50
# 2 AAA         110       40   40   50
# 3 AAA         100       60   40   60
# 4 BBB         100       50   50   50

where d is supposed to be your data set:

d <- data.frame(Key=c('AAA','AAA','AAA','BBB'),DaysToEvent = c(120,110,100,100),PriceEUR = c(50,40,60,50), Pmin = c(50,40,40,30), Pmax = c(50,50,60,70))

[Update: data.table]

Another approach is to use data.table, which has a quite spectacular performance:

library(data.table)
DT <- setDT(d)
DT[,c("Pmin","Pmax") := list(cummin(PriceEUR),cummax(PriceEUR)),by=Key]

DT
#    Key DaysToEvent PriceEUR Pmin Pmax
# 1: AAA         120       50   50   50
# 2: AAA         110       40   40   50
# 3: AAA         100       60   40   60
# 4: BBB         100       50   50   50

[Update 2: base R]

Here is another approach in the case that you'd like to use only base R for some reason:

d$Pmin <- unlist(lapply(split(d$PriceEUR,d$Key),cummin))
d$Pmax <- unlist(lapply(split(d$PriceEUR,d$Key),cummax))

Cumulative sum in R by group and start over when sum of values in group larger than maximum value

One purrr approach could be:

cumsum(c(FALSE, diff(accumulate(test, ~ ifelse(.x >= 10, .y, .x + .y))) <= 0))

[1] 0 0 1 1 1 2 2 2 3

Group according to cumulative sums

Two possible one-liners, with purrr::accumulate and with MESS::cumsumbinning:

purrr::accumulate

library(tidyverse)
group_by(input, grp = LETTERS[cumsum(value == accumulate(value, ~ ifelse(.x + .y <= 100, .x + .y, .y)))])

MESS::cumsumbinning

library(dplyr)
group_by(input, grp = LETTERS[MESS::cumsumbinning(value, 100)])

output

# A tibble: 6 x 3
# Groups:   grp [3]
     id value grp  
  <int> <dbl> <chr>
1     1    99 A    
2     2     1 A    
3     3    33 B    
4     4    33 B    
5     5    33 B    
6     6   150 C

Cumulative Minimum Value by Group