Subsetting a Dataframe for a Specified Month and Year

Subset specific dates (year and month) from data.frame

Here are some solutions. They (i) work with any dates, not just ones that fall on the first of the month, (ii) preserve the order of df2 in the output, and (iii) are compact, i.e. each is a single line and mentions df2 only once.

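1) substr This uses no packages. It relies on dates holding year-month strings of the form "1980-02", which is what substr(Date, 1, 7) produces.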

subset(df2, substr(Date, 1, 7) %in% dates)

giving:

   ID       Date
2   2 1980-02-01
3   9 1980-02-01
5   4 1990-07-01
6  12 1990-07-01
7  16 1990-07-01
10  7 1993-09-01
11 67 1993-09-01

2) zoo::as.yearmon Another possibility is to convert both Date and dates to "yearmon" class giving the same result. This code is a bit nicer but does need a package.

library(zoo)
subset(df2, as.yearmon(Date) %in% as.yearmon(dates))

R - How to subset a dataframe by month?

To keep things simple, try saving the month as a new column and using that:

BG.data$month <- factor(format(data.new$timestamp, "%B"),
                        levels = month.name)

Then you can use this in for loops:

for (month in unique(BG.data$month)) {
  # get the subset: the condition goes before the comma so it selects rows, not columns
  BG.subset <- BG.data[BG.data$month == month, ]
  # now do something with that subset
}

You can also use it in aggregate:

aggregate(something ~ month,
          data = BG.data,
          FUN = function(x) {
            # custom function
          })

and so on.

Subset data frame based on condition of multiple months and years in R?

If you don't mind installing the tidyverse package, you can use this simple filter:

library(tidyverse)
library(lubridate) # should come with tidyverse, no need to install it separately

# filter July and September data in 2005 and 2006
output <- df %>%
  filter(year(Date) %in% c(2005, 2006) & month(Date) %in% c(7, 9))

If you want to use base R, this should work as well:

output <- subset(df, format(Date, "%m") %in% c("07", "09") & format(Date, "%Y") %in% c("2005", "2006"))

This assumes that the df$Date column has class "Date".

Subset data between specific months over multiple years

Something like this could help:

library(lubridate)
library(dplyr)

# keep April through June of every year
df %>% filter(month(ymd(issue_date)) %in% 4:6)

Subset dataframe in r for a specific month and date

The problem is that you didn't put quotation marks around 2017-01-01. Written bare, 2017-01-01 is evaluated as a subtraction (2017 - 1 - 1, i.e. 2015), so you end up comparing a string to a number. Comparing string to string does work for dates written as YYYY-MM-DD: strings are compared character by character, and "2" still sorts after "1". BTW, you don't need to write df$ inside filter; in the tidyverse you can refer to columns directly by their unquoted names.

Pandas - Select month and year

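Given that you have a Date column, I would suggest converting it to datetime once up front, rather than converting it twice inside the filter. You cannot use .dt.month on the Series (whole column) while it still holds strings.
Once it is converted, just extract the month and year from the Series.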

import datetime as dt
import pandas as pd

# current_month and current_year are assumed to be defined already
data['Date'] = pd.to_datetime(data['Date'], dayfirst=True)
df = data[(data['Date'].apply(lambda x: x.month) == current_month) &
          (data['Date'].apply(lambda y: y.year) == current_year)]
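
Once the column really has been converted with pd.to_datetime, pandas also provides the vectorized .dt accessor, which can stand in for the element-wise apply calls. The following is only a minimal sketch of that variant: the sample rows and the values of current_month and current_year are made up for illustration and do not come from the original question.

```python
import pandas as pd

# Hypothetical sample data; dates are day-first strings, as in the snippet above.
data = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Date": ["15/03/2023", "02/03/2023", "20/04/2023", "15/03/2022"],
})

# Convert the column once, up front.
data["Date"] = pd.to_datetime(data["Date"], dayfirst=True)

# Assumed target month and year (illustrative values only).
current_month, current_year = 3, 2023

# With a datetime column, .dt.month and .dt.year work on the whole Series.
mask = (data["Date"].dt.month == current_month) & (data["Date"].dt.year == current_year)
df = data[mask]

print(df["ID"].tolist())
```

Here only the two March 2023 rows (IDs 1 and 2) are kept. On a column that still holds strings, the .dt accessor raises an AttributeError, which is why the conversion has to come first.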

