Return data subset time frames within other time frames?
You can use the .index* family of functions to get certain months or certain days of the month. See ?index for the full list of functions. For example:
library(quantmod)
getSymbols("SPY")
SPY[.indexmon(SPY)==0] # January for all years (note zero-based indexing!)
SPY[.indexmday(SPY)==1] # The first of every month
SPY[.indexwday(SPY)==1] # All Mondays
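The same accessors can be combined with logical operators. The following is a minimal sketch on a synthetic daily series (the object `x` and its dates are made up for illustration, so no download is needed):

```r
library(xts)

# Hypothetical daily series covering all of 2020 (illustration only)
dates <- seq(as.Date("2020-01-01"), as.Date("2020-12-31"), by = "day")
x <- xts(seq_along(dates), order.by = dates)

january         <- x[.indexmon(x) == 0]   # months are numbered 0-11
first_of_month  <- x[.indexmday(x) == 1]  # day of month is 1-based
# Combine conditions: Mondays (0 = Sunday) that fall in January
january_mondays <- x[.indexmon(x) == 0 & .indexwday(x) == 1]

index(january_mondays)
```

Because each `.index*` function returns a plain integer vector, any vectorised comparison (`%in%`, `&`, `|`) can be used inside the brackets.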
Subset list of xts objects
You can use lapply to loop over all the elements in your list, and use an anonymous function to subset them.
lapply(xts_list, function(x) x["2011/"])
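As a small self-contained sketch of this pattern, the list below is built from two made-up series (the helper `make_series` and the list contents are assumptions for illustration, not part of the original question):

```r
library(xts)

# Hypothetical helper: n daily observations starting on 2010-12-30
make_series <- function(n) {
  xts(seq_len(n),
      order.by = seq(as.Date("2010-12-30"), by = "day", length.out = n))
}

xts_list <- list(a = make_series(5), b = make_series(10))

# "2011/" is an ISO-8601 style range: everything from 2011 onwards
subsetted <- lapply(xts_list, function(x) x["2011/"])

sapply(subsetted, nrow)
```

The result is still a named list, so the element names `a` and `b` carry over to the subsetted objects.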
Subsetting data frames in R
The difference most likely comes from how the two methods treat NA values. If you remove the rows containing NA from the data frame, you should get the same results:
dat_clean = na.omit(dat)
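To illustrate the effect, here is a sketch that assumes the two methods being compared are bracket indexing and subset() (the answer does not name them, and the data frame below is invented for the example):

```r
# Hypothetical data with missing values in both columns
dat <- data.frame(a = c(1, NA, 3, 4),
                  b = c("x", "y", NA, "z"))

# `[` keeps a row of NAs wherever the condition evaluates to NA
by_bracket <- dat[dat$a > 1, ]

# subset() silently drops rows where the condition is NA
by_subset <- subset(dat, a > 1)

# After removing incomplete rows, both approaches select the same rows
dat_clean <- na.omit(dat)
clean_bracket <- dat_clean[dat_clean$a > 1, ]
clean_subset  <- subset(dat_clean, a > 1)

c(nrow(by_bracket), nrow(by_subset), nrow(clean_bracket), nrow(clean_subset))
```

Note that na.omit() drops a row if any column is NA, which may remove more rows than the condition alone would.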
Extract subset of multiple time series
We create a unique group_indices() by group and x, then drop groups that have fewer than 3 observations. For observations where x != 1, we keep only those whose row_number() is %in% the range n() (the group size) to n()-2, so that only the 3 observations immediately before a change in x remain.
library(dplyr)
df %>%
  mutate(g = group_indices_(., .dots = c("group", "x"))) %>%
  group_by(g) %>%
  mutate(condition = ifelse(x == 1, NA, row_number())) %>%
  filter(n() >= 3, ifelse(is.na(condition), TRUE, condition %in% n():(n()-2)))
Which gives:
#Source: local data frame [13 x 5]
#Groups: g [4]
#
#   group     x  time     g condition
#   <int> <int> <int> <int>     <int>
#1      1     0  1636     1         1
#2      1     0  1637     1         2
#3      1     0  1638     1         3
#4      1     1  1639     2        NA
#5      1     1  1640     2        NA
#6      1     1  1641     2        NA
#7      1     1  1642     2        NA
#8      2     0  1686     3         4
#9      2     0  1687     3         5
#10     2     0  1688     3         6
#11     2     1  1689     4        NA
#12     2     1  1690     4        NA
#13     2     1  1691     4        NA
You can optionally remove the g and condition columns by adding select(-(g:condition)) to the chain.
Data
df <- structure(list(group = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L), x = c(0L, 0L, 0L, 1L,
1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 0L, 1L, 1L),
time = c(1636L, 1637L, 1638L, 1639L, 1640L, 1641L, 1642L,
1683L, 1684L, 1685L, 1686L, 1687L, 1688L, 1689L, 1690L, 1691L,
1638L, 1639L, 1640L)), .Names = c("group", "x", "time"),
class = "data.frame", row.names = c(NA, -19L))
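Newer dplyr releases deprecate the underscore-suffixed verbs such as group_indices_(). The following is a sketch of the same filtering idea, assuming dplyr 1.0.0 or later, that groups directly by group and x and uses cur_group_id() for the group index. It is a variation on the answer above, not the original code, and it rebuilds the same data so that it runs on its own:

```r
library(dplyr)

# Same values as the Data block above, written more compactly
df <- data.frame(
  group = rep(1:3, times = c(7, 9, 3)),
  x     = c(0, 0, 0, 1, 1, 1, 1,  rep(0, 6), 1, 1, 1,  0, 1, 1),
  time  = c(1636:1642, 1683:1691, 1638:1640)
)

res <- df %>%
  group_by(group, x) %>%
  mutate(g = cur_group_id()) %>%          # index of the (group, x) combination
  filter(n() >= 3,                        # drop runs shorter than 3 rows
         x == 1 | row_number() > n() - 3) %>%  # for x == 0 keep the last 3 rows
  ungroup()

res
```

Writing the second condition as `row_number() > n() - 3` avoids the helper column condition, so no select() step is needed afterwards.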
Using apply to run functions over subset of time series
Put your data in long format using reshape2, then apply ddply from plyr for each region.
library(reshape2)
dat.m <- melt(dat,id.vars=c('date','province'))
library(plyr)
ddply(dat.m, .(province), function(ts) {
  ## each ts looks like this (here for alpha)
  ## you can process it
  #         date province variable      value
  # 1 2014-09-21  region1    alpha  0.3981059
  # 2 2015-01-06  region1    alpha -0.6120264
})
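As a concrete sketch of what the processing step might look like, the example below computes a mean per province and variable. The data frame, its values, and the choice of mean() are assumptions made for illustration:

```r
library(reshape2)
library(plyr)

# Hypothetical wide data: one row per date and province
dat <- data.frame(
  date     = as.Date(c("2014-09-21", "2015-01-06", "2014-09-21", "2015-01-06")),
  province = c("region1", "region1", "region2", "region2"),
  alpha    = c(0.4, -0.6, 1.2, 0.8),
  beta     = c(1, 2, 3, 4)
)

# Long format: one row per date, province and variable
dat.m <- melt(dat, id.vars = c("date", "province"))

# Apply a function to each province/variable series
res <- ddply(dat.m, .(province, variable), summarise,
             mean_value = mean(value))
res
```

Any function that takes the per-group data frame and returns a data frame can be used in place of summarise here.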
Dplyr grouped percentages in different timeframes
We can create a function to do the calculation:
library(dplyr)
library(purrr)
library(lubridate) # provides today(), used below
f1 <- function(data) {
  data %>%
    filter(ELIGIBLE == 1) %>%
    group_by(GROUP) %>%
    transmute(count_Eligible = sum(ELIGIBLE == 1),
              count_events = sum(EVENT == 1),
              Percentage = round(100 * count_events / count_Eligible, 2))
}
Then, loop over the 'lookback' periods, subset the data based on the 'DATE' column, and apply the function:
map2_dfr(list(three_month_lookback, six_month_lookback, one_year_lookback),
         list(today(), three_month_lookback, today()),
         ~ data %>%
           mutate(DATE = as.Date(DATE)) %>%
           filter(DATE >= .x, DATE <= .y) %>%
           f1(.), .id = 'grp'
)
If we need to combine by columns:
map2(list(three_month_lookback, six_month_lookback, one_year_lookback),
     list(today(), three_month_lookback, today()),
     ~ data %>%
       mutate(DATE = as.Date(DATE)) %>%
       filter(DATE >= .x, DATE <= .y) %>%
       f1(.)
) %>%
  reduce(full_join, by = "GROUP")
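For a reproducible illustration of the windowing idea, the sketch below uses a fixed reference date instead of today() and a made-up data set. It also uses summarise() so that each group yields a single row, which is a deliberate departure from the transmute() in f1. The names `ref_date`, `starts` and `pct_by_group` are hypothetical, and dplyr 1.0.0 or later is assumed for the `.groups` argument:

```r
library(dplyr)
library(purrr)

# Hypothetical data; column names follow the answer above
data <- data.frame(
  DATE     = as.Date(c("2021-01-15", "2021-05-10", "2021-06-01", "2021-06-20")),
  GROUP    = c("A", "A", "B", "B"),
  ELIGIBLE = c(1, 1, 1, 1),
  EVENT    = c(1, 0, 1, 1)
)

ref_date <- as.Date("2021-06-30")   # fixed stand-in for today()
starts   <- list(`3m` = ref_date - 90, `1y` = ref_date - 365)

# One row per GROUP: share of eligible rows that had an event
pct_by_group <- function(d) {
  d %>%
    filter(ELIGIBLE == 1) %>%
    group_by(GROUP) %>%
    summarise(Percentage = round(100 * sum(EVENT == 1) / n(), 2),
              .groups = "drop")
}

# Stack the per-window results; list names become the 'window' column
res <- map_dfr(starts,
               ~ data %>%
                 filter(DATE >= .x, DATE <= ref_date) %>%
                 pct_by_group(),
               .id = "window")
res
```

With one row per group in each window, a later full_join by GROUP stays one row per group as well.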