Generate Rows Between Two Dates into a Data Frame in R

Generate records in an R data frame between two dates

Another option using tidyverse

Input %>%
mutate(Frequency = case_when(Frequency == "Weekly" ~ 7L,
Frequency == "Fortnightly" ~ 14L,
Frequency == "Monthly" ~ 30L,
TRUE ~ 0L)) %>%
group_by(CusID) %>%
group_modify(~ {PaymentDate <- seq.Date(from = .x$FromDate, to = .x$ToDate, by = .x$Frequency)
crossing(.x[,1], PaymentDate)})

# A tibble: 7 x 3
# Groups: CusID [3]
CusID PaymentDate Amount
<dbl> <date> <dbl>
1 1 2019-01-01 5
2 1 2019-01-08 5
3 1 2019-01-15 5
4 2 2019-01-04 10
5 3 2019-02-02 12
6 3 2019-03-04 12
7 3 2019-04-03 12

Payment dates are a little different from your expected output because seq.Date adds 30 days taking into account the different number of days in those months.

UPDATE:

Here is a more verbatim solution

Input %>% 
mutate(PaymentDate = FromDate,
RFrequency = case_when(Frequency == "Weekly" ~ '1 week',
Frequency == "Fortnightly" ~ '2 weeks',
Frequency == "Monthly" ~ '1 month')) %>%
group_by(CusID, Amount) %>%
expand(PaymentDate = seq.Date(FromDate,ToDate, by = RFrequency))

# A tibble: 7 x 3
# Groups: CusID, Amount [3]
CusID Amount PaymentDate
<dbl> <dbl> <date>
1 1 5 2019-01-01
2 1 5 2019-01-08
3 1 5 2019-01-15
4 2 10 2019-01-04
5 3 12 2019-02-02
6 3 12 2019-03-02
7 3 12 2019-04-02

Generate dates between two dates in a dataframe

Assuming that the 'min_date/max_date' columns are Date class, we use Map to get the sequence of each 'min_date' with the corresponding 'max_date' in a list, replicate the sequence of rows of 'df1' with the number of rows of the list elements, create a data.frame by expanding the dataset based on 'i1' and get create 'dates' by concatenating the 'lst' elements.

lst <- Map(function(x, y) seq(x,y, by = "1 day"), df1$min_date, df1$max_date)
i1 <- rep(1:nrow(df1), lengths(lst))
data.frame(df1[i1,-3], dates = do.call("c", lst))

Or if we are using dplyr

library(dplyr)
df1 %>%
rowwise() %>%
do(data.frame(.[1:2], date = seq(.$min_date, .$max_date, by = "1 day")))

Or using data.table, we can do this in a single line of code

library(data.table) 
setDT(df1)[,.(date = seq(min_date, max_date, by = "1 day")) ,.(id1, id2)]

Generate rows between two dates into a data frame in R

Since you don't state otherwise, this answer assumes the stDte and edDte columns are both of "Date" class.

In base R you could use Map() to create the sequence of dates, then data.frame to bring the new data frame together after creating the new Name column with rep.int().

M <- Map(seq, df$stDte, df$edDte, by = "month")
df2 <- data.frame(
Name = rep.int(df$Name, vapply(M, length, 1L)),
Dte = do.call(c, M)
)
str(df2)
# 'data.frame': 65 obs. of 2 variables:
# $ Name: Factor w/ 2 levels "A","B": 1 1 1 1 1 1 1 1 1 1 ...
# $ Dte : Date, format: "2010-05-01" "2010-06-01" ...
head(df2, 3)
# Name Dte
# 1 A 2010-05-01
# 2 A 2010-06-01
# 3 A 2010-07-01
tail(df2, 3)
# Name Dte
# 63 B 2013-12-01
# 64 B 2014-01-01
# 65 B 2014-02-01

Or you can use the data.table package and do

library(data.table)
setDT(df)[, .(Dte = seq(stDte, edDte, by = "month")), by = Name]

How to fill dates between two dates

Here, we may need to use by as sequence of rows

library(data.table)
setDT(df)[, .(date = seq(as.Date(start_date), as.Date(end_date),
by = 'day')), .(rn = seq_len(nrow(df)), name, value)][, rn := NULL][]

Or create a list column by looping over corresponding elements of 'start_date', 'end_date' to create a sequence of dates in Map and then unnest the list

library(tidyr)
library(magrittr)
setDT(df)[, .(name, date = Map(seq, MoreArgs = list(by = '1 day'),
as.Date(start_date), as.Date(end_date)), value)] %>%
unnest(date)
# A tibble: 731 x 3
# name date value
# <chr> <date> <dbl>
# 1 A 2020-01-23 8.1
# 2 A 2020-01-24 8.1
# 3 A 2020-01-25 8.1
# 4 A 2020-01-26 8.1
# 5 A 2020-01-27 8.1
# 6 A 2020-01-28 8.1
# 7 A 2020-01-29 8.1
# 8 A 2020-01-30 8.1
# 9 A 2020-01-31 8.1
#10 A 2020-02-01 8.1
# … with 721 more rows

R: Insert rows into dataframe by date

Here is something you can try - let me know if this is helpful.

First, make sure your dates are Date format (if not already):

df$date <- as.Date(df$date, format = "%d-%m-%Y")
df$death <- as.Date(df$death, format = "%d-%m-%Y")

You can use tidyverse and lubridate for most functions needed. Also note that padr can be useful for padding dates.

In this case, first group_by id. You can use complete to fill in missing dates. The sequence of dates used will be from the first date to 89 days later (total 90 days). The fill will make the score zero for these additional dates added.

Another fill can be used to include the date of death for each subsequent row added (since it appears to be the structure used already).

Finally, you can mutate the score - if death is not NA and the date for a row comes after death, then make the value 20 (otherwise just retain the current score).

library(tidyverse)
library(lubridate)

df %>%
group_by(id) %>%
complete(date = seq(date[1], date[1] + days(89), by = "1 day"), fill = list(score = 0)) %>%
fill(death, .direction = "down") %>%
mutate(score = ifelse(!is.na(death) & date > death, 20, score)) %>%
print(n=180)

Output

# A tibble: 180 x 4
# Groups: id [2]
id date death score
<int> <date> <date> <dbl>
1 1 2001-01-01 2001-02-04 14
2 1 2001-01-02 2001-02-04 16
3 1 2001-01-03 2001-02-04 0
4 1 2001-01-04 2001-02-04 0
5 1 2001-01-05 2001-02-04 0
6 1 2001-01-06 2001-02-04 0
7 1 2001-01-07 2001-02-04 0
8 1 2001-01-08 2001-02-04 0
9 1 2001-01-09 2001-02-04 0
10 1 2001-01-10 2001-02-04 0
11 1 2001-01-11 2001-02-04 0
12 1 2001-01-12 2001-02-04 0
13 1 2001-01-13 2001-02-04 0
14 1 2001-01-14 2001-02-04 0
15 1 2001-01-15 2001-02-04 0
16 1 2001-01-16 2001-02-04 0
17 1 2001-01-17 2001-02-04 0
18 1 2001-01-18 2001-02-04 0
19 1 2001-01-19 2001-02-04 0
20 1 2001-01-20 2001-02-04 0
21 1 2001-01-21 2001-02-04 0
22 1 2001-01-22 2001-02-04 0
23 1 2001-01-23 2001-02-04 0
24 1 2001-01-24 2001-02-04 0
25 1 2001-01-25 2001-02-04 0
26 1 2001-01-26 2001-02-04 0
27 1 2001-01-27 2001-02-04 0
28 1 2001-01-28 2001-02-04 0
29 1 2001-01-29 2001-02-04 0
30 1 2001-01-30 2001-02-04 0
31 1 2001-01-31 2001-02-04 15
32 1 2001-02-01 2001-02-04 16
33 1 2001-02-02 2001-02-04 12
34 1 2001-02-03 2001-02-04 0
35 1 2001-02-04 2001-02-04 0
36 1 2001-02-05 2001-02-04 20
37 1 2001-02-06 2001-02-04 20
38 1 2001-02-07 2001-02-04 20
39 1 2001-02-08 2001-02-04 20
40 1 2001-02-09 2001-02-04 20
41 1 2001-02-10 2001-02-04 20
42 1 2001-02-11 2001-02-04 20
43 1 2001-02-12 2001-02-04 20
44 1 2001-02-13 2001-02-04 20
45 1 2001-02-14 2001-02-04 20
46 1 2001-02-15 2001-02-04 20
47 1 2001-02-16 2001-02-04 20
48 1 2001-02-17 2001-02-04 20
49 1 2001-02-18 2001-02-04 20
50 1 2001-02-19 2001-02-04 20
51 1 2001-02-20 2001-02-04 20
52 1 2001-02-21 2001-02-04 20
53 1 2001-02-22 2001-02-04 20
54 1 2001-02-23 2001-02-04 20
55 1 2001-02-24 2001-02-04 20
56 1 2001-02-25 2001-02-04 20
57 1 2001-02-26 2001-02-04 20
58 1 2001-02-27 2001-02-04 20
59 1 2001-02-28 2001-02-04 20
60 1 2001-03-01 2001-02-04 20
61 1 2001-03-02 2001-02-04 20
62 1 2001-03-03 2001-02-04 20
63 1 2001-03-04 2001-02-04 20
64 1 2001-03-05 2001-02-04 20
65 1 2001-03-06 2001-02-04 20
66 1 2001-03-07 2001-02-04 20
67 1 2001-03-08 2001-02-04 20
68 1 2001-03-09 2001-02-04 20
69 1 2001-03-10 2001-02-04 20
70 1 2001-03-11 2001-02-04 20
71 1 2001-03-12 2001-02-04 20
72 1 2001-03-13 2001-02-04 20
73 1 2001-03-14 2001-02-04 20
74 1 2001-03-15 2001-02-04 20
75 1 2001-03-16 2001-02-04 20
76 1 2001-03-17 2001-02-04 20
77 1 2001-03-18 2001-02-04 20
78 1 2001-03-19 2001-02-04 20
79 1 2001-03-20 2001-02-04 20
80 1 2001-03-21 2001-02-04 20
81 1 2001-03-22 2001-02-04 20
82 1 2001-03-23 2001-02-04 20
83 1 2001-03-24 2001-02-04 20
84 1 2001-03-25 2001-02-04 20
85 1 2001-03-26 2001-02-04 20
86 1 2001-03-27 2001-02-04 20
87 1 2001-03-28 2001-02-04 20
88 1 2001-03-29 2001-02-04 20
89 1 2001-03-30 2001-02-04 20
90 1 2001-03-31 2001-02-04 20
91 2 2001-04-01 NA 5
92 2 2001-04-02 NA 0
93 2 2001-04-03 NA 6
94 2 2001-04-04 NA 0
95 2 2001-04-05 NA 0
96 2 2001-04-06 NA 0
97 2 2001-04-07 NA 0
98 2 2001-04-08 NA 0
99 2 2001-04-09 NA 0
100 2 2001-04-10 NA 0
101 2 2001-04-11 NA 0
102 2 2001-04-12 NA 0
103 2 2001-04-13 NA 0
104 2 2001-04-14 NA 0
105 2 2001-04-15 NA 0
106 2 2001-04-16 NA 0
107 2 2001-04-17 NA 0
108 2 2001-04-18 NA 0
109 2 2001-04-19 NA 0
110 2 2001-04-20 NA 0
111 2 2001-04-21 NA 0
112 2 2001-04-22 NA 0
113 2 2001-04-23 NA 0
114 2 2001-04-24 NA 0
115 2 2001-04-25 NA 0
116 2 2001-04-26 NA 0
117 2 2001-04-27 NA 0
118 2 2001-04-28 NA 0
119 2 2001-04-29 NA 0
120 2 2001-04-30 NA 0
121 2 2001-05-01 NA 0
122 2 2001-05-02 NA 0
123 2 2001-05-03 NA 0
124 2 2001-05-04 NA 0
125 2 2001-05-05 NA 9
126 2 2001-05-06 NA 2
127 2 2001-05-07 NA 10
128 2 2001-05-08 NA 0
129 2 2001-05-09 NA 0
130 2 2001-05-10 NA 0
131 2 2001-05-11 NA 0
132 2 2001-05-12 NA 0
133 2 2001-05-13 NA 0
134 2 2001-05-14 NA 0
135 2 2001-05-15 NA 0
136 2 2001-05-16 NA 0
137 2 2001-05-17 NA 0
138 2 2001-05-18 NA 0
139 2 2001-05-19 NA 0
140 2 2001-05-20 NA 0
141 2 2001-05-21 NA 0
142 2 2001-05-22 NA 0
143 2 2001-05-23 NA 0
144 2 2001-05-24 NA 0
145 2 2001-05-25 NA 0
146 2 2001-05-26 NA 0
147 2 2001-05-27 NA 0
148 2 2001-05-28 NA 0
149 2 2001-05-29 NA 0
150 2 2001-05-30 NA 0
151 2 2001-05-31 NA 0
152 2 2001-06-01 NA 0
153 2 2001-06-02 NA 0
154 2 2001-06-03 NA 0
155 2 2001-06-04 NA 0
156 2 2001-06-05 NA 0
157 2 2001-06-06 NA 0
158 2 2001-06-07 NA 0
159 2 2001-06-08 NA 0
160 2 2001-06-09 NA 0
161 2 2001-06-10 NA 0
162 2 2001-06-11 NA 0
163 2 2001-06-12 NA 0
164 2 2001-06-13 NA 0
165 2 2001-06-14 NA 0
166 2 2001-06-15 NA 0
167 2 2001-06-16 NA 0
168 2 2001-06-17 NA 0
169 2 2001-06-18 NA 0
170 2 2001-06-19 NA 0
171 2 2001-06-20 NA 0
172 2 2001-06-21 NA 0
173 2 2001-06-22 NA 0
174 2 2001-06-23 NA 0
175 2 2001-06-24 NA 0
176 2 2001-06-25 NA 0
177 2 2001-06-26 NA 0
178 2 2001-06-27 NA 0
179 2 2001-06-28 NA 0
180 2 2001-06-29 NA 0

R Create new rows in data frame from given dates between two time points

So I modified the initial code slightly :

library(lubridate)

library(data.table)

ID1 <- c("AUT","AUT","AUT","BEL","BEL","BEL")

start_date <- ymd("2008-12-02", "2013-12-16", "2016-05-17", "2007-06-10", "2007-12-21", "2008-03-20")

end_date <- ymd("2013-12-15", "2016-05-16", "2017-11-30", "2007-12-20", "2008-03-19", "2008-12-29")

ID2 <- 1:6

df <- data.table(ID1, start_date, end_date, ID2)

df[,yearDiff:=year(end_date)-year(start_date)]

df<-df[,cbind(.SD,year=(year(start_date)+1):year(end_date)),by="ID2"]

df[,dateInterval:=interval(df$start_date,df$end_date)]

df[,IsYearWithinDate:=(ymd(paste0(year,"01","01",sep="-"))%within% dateInterval)]

df[,.(ID1,start_date,end_date,ID2,year,IsYearWithinDate)]

resulting in (via df[,.(ID1,start_date,end_date,ID2,year):

   ID1 start_date   end_date ID2 year
1: AUT 2008-12-02 2013-12-15 1 2009
2: AUT 2008-12-02 2013-12-15 1 2010
3: AUT 2008-12-02 2013-12-15 1 2011
4: AUT 2008-12-02 2013-12-15 1 2012
5: AUT 2008-12-02 2013-12-15 1 2013
6: AUT 2013-12-16 2016-05-16 2 2014
7: AUT 2013-12-16 2016-05-16 2 2015
8: AUT 2013-12-16 2016-05-16 2 2016
9: AUT 2016-05-17 2017-11-30 3 2017
10: BEL 2007-06-10 2007-12-20 4 2008
11: BEL 2007-06-10 2007-12-20 4 2007
12: BEL 2007-12-21 2008-03-19 5 2008
13: BEL 2008-03-20 2008-12-29 6 2009
14: BEL 2008-03-20 2008-12-29 6 2008

insert rows between dates by group

By using dplyr and tidyr :)(one line solution )

library(dplyr)
library(tidyr)
dt %>% group_by(user) %>% complete(date=full_seq(date,1),fill=list(dummy=0))
# A tibble: 9 x 3
# Groups: user [2]
user date dummy
<fctr> <date> <dbl>
1 A 2017-01-03 1
2 A 2017-01-04 0
3 A 2017-01-05 0
4 A 2017-01-06 1
5 B 2016-05-01 1
6 B 2016-05-02 0
7 B 2016-05-03 1
8 B 2016-05-04 0
9 B 2016-05-05 1


Related Topics



Leave a reply



Submit