How can I keep track of the total transaction amount received by an account over the last 6 months?
We can use map2_dbl and, for each row, take the sum of the amount values received by that row's sender that lie in the 6-month (180-day) range ending at the row's date.
library(dplyr)
library(purrr)
data %>%
mutate(amt = map2_dbl(from, date,
~sum(amount[to == .x & between(date, .y - 180, .y)])))
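For readers working in pandas rather than R, the same row-wise idea can be sketched roughly as below. The column names (from, to, date, amount) follow the R snippet; the sample rows and the helper name add_received_180d are made up for illustration, and 180 days is the same approximation of six months used above.

```python
import pandas as pd

# Hypothetical sample data; column names mirror the R example.
data = pd.DataFrame({
    "from": [5370, 8605, 5370, 6390, 5370],
    "to": [9356, 5370, 5636, 5370, 8933],
    "date": pd.to_datetime(["2005-05-31", "2005-10-05", "2005-11-12",
                            "2005-11-26", "2005-11-30"]),
    "amount": [24.4, 5245.0, 2921.0, 8000.0, 169.2],
})

def add_received_180d(df, days=180):
    """Return a copy of df with 'amt': the amount received by each row's
    sender in the `days` days up to and including the row's date."""
    window = pd.Timedelta(days=days)

    def received(row):
        in_window = (
            (df["to"] == row["from"])
            & df["date"].between(row["date"] - window, row["date"])
        )
        return df.loc[in_window, "amount"].sum()

    return df.assign(amt=df.apply(received, axis=1))

result = add_received_180d(data)
print(result)
```

Like the map2_dbl version, this scans the whole table once per row, so it is simple but quadratic in the number of rows.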
Sum amount over the last 6 months prior to the date of transaction
This is simply a non-equi join in data.table. You can create a helper column equal to date - 180 and restrict the join to rows whose date falls between that helper column and the current date. This should be fairly quick.
library(data.table)
setDT(dt)[, date_minus_180 := date - 180]
dt[, amnt_6_m := .SD[dt, sum(amount, na.rm = TRUE),
on = .(to = from, date <= date, date >= date_minus_180), by = .EACHI]$V1]
head(dt, 10)
# id from to date amount date_minus_180 amnt_6_m
# 1: 18529 5370 9356 2005-05-31 24.4 2004-12-02 0.0
# 2: 13742 5370 5605 2005-08-05 7618.0 2005-02-06 0.0
# 3: 9913 5370 8567 2005-09-12 21971.0 2005-03-16 0.0
# 4: 956 8605 5370 2005-10-05 5245.0 2005-04-08 0.0
# 5: 2557 5370 5636 2005-11-12 2921.0 2005-05-16 5245.0
# 6: 1602 6390 5370 2005-11-26 8000.0 2005-05-30 0.0
# 7: 18669 5370 8933 2005-11-30 169.2 2005-06-03 13245.0
# 8: 35900 5370 8483 2006-01-31 71.5 2005-08-04 13245.0
# 9: 48667 8934 5370 2006-03-31 14.6 2005-10-02 0.0
# 10: 51341 5370 7626 2006-04-11 4214.0 2005-10-13 8014.6
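pandas has no direct non-equi join, but the join-then-restrict idea can be approximated with an ordinary merge followed by a date filter. The sketch below is only an illustration under assumptions: it reuses five rows from the output above, assumes id is unique per transaction, and builds every sender/receiver pairing before filtering, so it needs far more memory than a true non-equi join on large tables.

```python
import pandas as pd

# Five rows taken from the sample output; 'id' is assumed to be unique.
dt = pd.DataFrame({
    "id": [18529, 956, 2557, 1602, 18669],
    "from": [5370, 8605, 5370, 6390, 5370],
    "to": [9356, 5370, 5636, 5370, 8933],
    "date": pd.to_datetime(["2005-05-31", "2005-10-05", "2005-11-12",
                            "2005-11-26", "2005-11-30"]),
    "amount": [24.4, 5245.0, 2921.0, 8000.0, 169.2],
})

# Pair each transaction with every transaction received by its sender.
pairs = dt.merge(
    dt[["to", "date", "amount"]],
    left_on="from", right_on="to",
    suffixes=("", "_recv"),
)

# Keep only pairings whose receipt date lies in the 180-day window.
in_window = (
    (pairs["date_recv"] <= pairs["date"])
    & (pairs["date_recv"] >= pairs["date"] - pd.Timedelta(days=180))
)
sums = pairs[in_window].groupby("id")["amount_recv"].sum()

# Transactions with nothing received in the window get 0.
dt["amnt_6_m"] = dt["id"].map(sums).fillna(0.0)
print(dt)
```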
Finding the last transaction for each id
This is how I might approach the problem with rle:
L1 <- lapply(split(df, df[, "id"]), function(dat){
dat[, "last"] <- as.Date(NA)
x <- rle(as.character(dat[, "period"]))
z <- cumsum(x[["lengths"]])
dat$last[z[x[["values"]] == "calib"]] <- dat[z[x[["values"]] == "calib"] ,
"date2"]
dat
})
data.frame(do.call(rbind, L1), row.names = NULL)
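The run-length idea translates to pandas by numbering consecutive runs and flagging the last row of each run. The following is a sketch on invented data: only the column names id, period and date2 and the value "calib" come from the R code; everything else is assumed.

```python
import pandas as pd

# Invented sample; column names and the "calib" label follow the R code.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 2, 2, 2],
    "period": ["calib", "calib", "valid", "calib", "valid", "calib", "calib"],
    "date2": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03",
                             "2020-01-04", "2020-02-01", "2020-02-02",
                             "2020-02-03"]),
})

# A new run starts whenever the id or the period differs from the previous row.
starts_run = (df["period"] != df["period"].shift()) | (df["id"] != df["id"].shift())
run_id = starts_run.cumsum()

# A row ends its run when the next row belongs to a different run.
ends_run = run_id != run_id.shift(-1)

# Copy date2 onto the last row of every "calib" run, NaT elsewhere.
df["last"] = df["date2"].where(ends_run & (df["period"] == "calib"))
print(df)
```

Rows 1, 3 and 6 receive a date here, which corresponds to the positions the R code picks with cumsum(x[["lengths"]]) for runs whose value is "calib".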
How to select last 6 months from news table using MySQL
Use DATE_SUB
.... where yourdate_column > DATE_SUB(now(), INTERVAL 6 MONTH)
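To try the six-month cutoff without a MySQL server, the sketch below uses Python's built-in sqlite3 module. SQLite has no DATE_SUB, so its date(..., '-6 months') modifier stands in for it here. The column names and rows are invented, and a fixed reference date replaces now() so the result is reproducible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema for the news table; dates stored as ISO-8601 text.
conn.execute("CREATE TABLE news (id INTEGER PRIMARY KEY, title TEXT, published TEXT)")
conn.executemany(
    "INSERT INTO news (title, published) VALUES (?, ?)",
    [
        ("old story", "2023-01-15"),
        ("recent story", "2023-09-01"),
        ("latest story", "2023-11-20"),
    ],
)

# Fixed stand-in for "now"; in MySQL the cutoff would be
# DATE_SUB(NOW(), INTERVAL 6 MONTH).
reference_date = "2023-12-01"
rows = conn.execute(
    "SELECT title FROM news "
    "WHERE published > date(?, '-6 months') "
    "ORDER BY published",
    (reference_date,),
).fetchall()

recent_titles = [title for (title,) in rows]
print(recent_titles)
conn.close()
```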
Extract first and last transaction date in R
We can use first and last from dplyr after grouping by 'ID'
library(dplyr)
df1 %>%
group_by(ID) %>%
summarise(FIRST_PURCHASE_DATE = first(purchase_date),
LAST_PURCHASE_DATE = last(purchase_date))
The above assumes that 'purchase_date' is already sorted within each 'ID'. If it is not, arrange after converting to Date class and then take the first and last
library(lubridate)
df1 %>%
arrange(ID, mdy(purchase_date)) %>%
group_by(ID) %>%
summarise(FIRST_PURCHASE_DATE = first(purchase_date),
LAST_PURCHASE_DATE = last(purchase_date))
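A rough pandas counterpart is sketched below. It parses the dates and takes the minimum and maximum per ID rather than sorting and taking the first and last, which gives the same dates without depending on row order. The sample rows and the month/day/year format are assumptions; the column names follow the dplyr example.

```python
import pandas as pd

# Invented sample rows, deliberately unsorted; dates assumed to be month/day/year.
df1 = pd.DataFrame({
    "ID": [2, 1, 1, 2, 1],
    "purchase_date": ["03/15/2021", "01/10/2021", "12/05/2020",
                      "02/01/2021", "06/30/2021"],
})
df1["purchase_date"] = pd.to_datetime(df1["purchase_date"], format="%m/%d/%Y")

# Earliest and latest purchase per ID via named aggregation.
summary = (
    df1.groupby("ID")["purchase_date"]
    .agg(FIRST_PURCHASE_DATE="min", LAST_PURCHASE_DATE="max")
    .reset_index()
)
print(summary)
```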
I have a data frame and I want to calculate the transaction count and sum over the last three months for each group Id
Example
import pandas as pd

data = [['A', '2022-01-02', 10], ['A', '2022-01-02', 20], ['A', '2022-02-04', 30],
['A', '2022-02-05', 20], ['A', '2022-04-08', 300], ['A', '2022-04-11', 100],
['A', '2022-05-13', 200], ['A', '2022-06-12', 20], ['A', '2022-06-15', 300],
['A', '2022-08-16', 100], ['B', '2022-01-02', 10], ['B', '2022-01-02', 20],
['B', '2022-02-04', 30], ['B', '2022-02-05', 20], ['B', '2022-04-08', 300],
['B', '2022-04-11', 100], ['B', '2022-05-13', 200], ['B', '2022-06-12', 20],
['B', '2022-06-15', 300], ['B', '2022-08-16', 100]]
df1 = pd.DataFrame(data, columns=['Id', 'Date', 'TransAmt'])
df1
Id Date TransAmt
0 A 2022-01-02 10
1 A 2022-01-02 20
2 A 2022-02-04 30
3 A 2022-02-05 20
4 A 2022-04-08 300
5 A 2022-04-11 100
6 A 2022-05-13 200
7 A 2022-06-12 20
8 A 2022-06-15 300
9 A 2022-08-16 100
10 B 2022-01-02 10
11 B 2022-01-02 20
12 B 2022-02-04 30
13 B 2022-02-05 20
14 B 2022-04-08 300
15 B 2022-04-11 100
16 B 2022-05-13 200
17 B 2022-06-12 20
18 B 2022-06-15 300
19 B 2022-08-16 100
Code
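s = df1['Date']  # keep the original date strings to restore at the end
df1['Date'] = df1['Date'].astype('Period[M]')  # collapse dates to monthly periods
# count and sum of transactions per Id and month
df2 = df1.groupby(['Id', 'Date'])['TransAmt'].agg(['count', 'sum'])
# full Id x month grid, so months without transactions can be filled with 0
idx1 = pd.period_range(df1['Date'].min(), df1['Date'].max(), freq='M')
idx2 = pd.MultiIndex.from_product([df1['Id'].unique(), idx1])
cols = ['Id', 'Date', 'CountThreeMonth', 'AmountofThreeMonth']
n = 3  # rolling window length in months
df3 = (df2.reindex(idx2, fill_value=0)
          .groupby(level=0).rolling(n, min_periods=1).sum()
          .droplevel(0).reset_index().set_axis(cols, axis=1))
df1.merge(df3, how='left').assign(Date=s)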
Result of df1.merge(df3, how='left').assign(Date=s)
Id Date TransAmt CountThreeMonth AmountofThreeMonth
0 A 2022-01-02 10 2.0 30.0
1 A 2022-01-02 20 2.0 30.0
2 A 2022-02-04 30 4.0 80.0
3 A 2022-02-05 20 4.0 80.0
4 A 2022-04-08 300 4.0 450.0
5 A 2022-04-11 100 4.0 450.0
6 A 2022-05-13 200 3.0 600.0
7 A 2022-06-12 20 5.0 920.0
8 A 2022-06-15 300 5.0 920.0
9 A 2022-08-16 100 3.0 420.0
10 B 2022-01-02 10 2.0 30.0
11 B 2022-01-02 20 2.0 30.0
12 B 2022-02-04 30 4.0 80.0
13 B 2022-02-05 20 4.0 80.0
14 B 2022-04-08 300 4.0 450.0
15 B 2022-04-11 100 4.0 450.0
16 B 2022-05-13 200 3.0 600.0
17 B 2022-06-12 20 5.0 920.0
18 B 2022-06-15 300 5.0 920.0
19 B 2022-08-16 100 3.0 420.0
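I'm sorry, it's hard to explain. In short: the dates are converted to monthly periods, count and sum are aggregated per Id and month, months without transactions are filled with 0, and a rolling 3-month sum per Id is merged back onto the original rows.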
Is there an API to get bank transaction and bank balance?
Just a helpful hint: there is a company called Yodlee.com that provides this data. They do charge for the API. Companies like Mint.com use this API to gather bank and financial account data.
Also, check out https://plaid.com/; they are a company similar to Yodlee.com and provide both an authentication API for several banks and REST-based transaction-fetching endpoints.
Algorithm to share/settle expenses among a group
You have described it already. Sum all the expenses (1500 in your case), divide by number of people sharing the expense (500). For each individual, deduct the contributions that person made from the individual share (for person A, deduct 400 from 500). The result is the net that person "owes" to the central pool. If the number is negative for any person, the central pool "owes" the person.
Because you have already described the solution, I don't know what you are asking.
Maybe you are trying to resolve the problem without the central pool, the "bank"?
I also don't know what you mean by "start with the least spent amount and work forward."
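The central-pool calculation described above can be written out as a short sketch. Only the total of 1500, the equal share of 500 and person A's contribution of 400 come from the question; the amounts for B and C and the function name net_owed are made up for illustration.

```python
def net_owed(contributions):
    """Return what each person owes the central pool.

    A positive value means the person owes the pool;
    a negative value means the pool owes the person.
    """
    total = sum(contributions.values())
    share = total / len(contributions)  # equal share per person
    return {person: share - paid for person, paid in contributions.items()}

# A's 400 is from the question; B and C are hypothetical and chosen
# so that the total is 1500.
contributions = {"A": 400, "B": 1000, "C": 100}
balances = net_owed(contributions)
print(balances)
```

Here A owes the pool 100, C owes 400, and the pool owes B 500. The balances always add up to zero, which is a quick sanity check on the inputs.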