How to Do Date Math That Ignores the Year

How do you do date math that ignores the year?

TL/DR: use the "Black magic version" below.


All queries presented in other answers so far operate with conditions that are not sargable: they cannot use an index and have to compute an expression for every single row in the base table to find matching rows. Doesn't matter much with small tables. Matters a lot with big tables.

Given the following simple table:

CREATE TABLE event (
event_id serial PRIMARY KEY
, event_date date
);

Query

Version 1. and 2. below can use a simple index of the form:

CREATE INDEX event_event_date_idx ON event(event_date);

But all of the following solutions are even faster without index.

1. Simple version

SELECT *
FROM (
SELECT ((current_date + d) - interval '1 year' * y)::date AS event_date
FROM generate_series( 0, 14) d
CROSS JOIN generate_series(13, 113) y
) x
JOIN event USING (event_date);

Subquery x computes all possible dates over a given range of years from a CROSS JOIN of two generate_series() calls. The selection is done with the final simple join.

2. Advanced version

WITH val AS (
SELECT extract(year FROM age(current_date + 14, min(event_date)))::int AS max_y
, extract(year FROM age(current_date, max(event_date)))::int AS min_y
FROM event
)
SELECT e.*
FROM (
SELECT ((current_date + d.d) - interval '1 year' * y.y)::date AS event_date
FROM generate_series(0, 14) d
,(SELECT generate_series(min_y, max_y) AS y FROM val) y
) x
JOIN event e USING (event_date);

Range of years is deduced from the table automatically - thereby minimizing generated years.

You could go one step further and distill a list of existing years if there are gaps.

Effectiveness co-depends on the distribution of dates. It's better for few years with many rows each.

Simple db<>fiddle to play with here

Old sqlfiddle

3. Black magic version

Create a simple SQL function to calculate an integer from the pattern 'MMDD':

CREATE FUNCTION f_mmdd(date) RETURNS int LANGUAGE sql IMMUTABLE PARALLEL SAFE AS
'SELECT (EXTRACT(month FROM $1) * 100 + EXTRACT(day FROM $1))::int';

I had to_char(time, 'MMDD') at first, but switched to the above expression which proved fastest in new tests on Postgres 9.6 and 10:

db<>fiddle here

It allows function inlining because EXTRACT(xyz FROM date) is implemented with the IMMUTABLE function date_part(text, date) internally. And it has to be IMMUTABLE to allow its use in the following essential multicolumn expression index:

CREATE INDEX event_mmdd_event_date_idx ON event(f_mmdd(event_date), event_date);

Multicolumn for a number of reasons:

Can help with ORDER BY or with selecting from given years. Read here. At almost no additional cost for the index. A date fits into the 4 bytes that would otherwise be lost to padding due to data alignment. Read here.

Also, since both index columns reference the same table column, no drawback with regard to H.O.T. updates. Read here.

Basic query:

SELECT *
FROM event e
WHERE f_mmdd(e.event_date) BETWEEN f_mmdd(current_date)
AND f_mmdd(current_date + 14);

One PL/pgSQL table function to rule them all

Fork to one of two queries to cover the turn of the year:

CREATE OR REPLACE FUNCTION f_anniversary(_the_date date = current_date, _days int = 14)
RETURNS SETOF event
LANGUAGE plpgsql AS
$func$
DECLARE
d int := f_mmdd($1);
d1 int := f_mmdd($1 + $2 - 1); -- fix off-by-1 from upper bound
BEGIN
IF d1 > d THEN
RETURN QUERY
SELECT *
FROM event e
WHERE f_mmdd(e.event_date) BETWEEN d AND d1
ORDER BY f_mmdd(e.event_date), e.event_date;

ELSE -- wrap around end of year
RETURN QUERY
SELECT *
FROM event e
WHERE f_mmdd(e.event_date) >= d OR
f_mmdd(e.event_date) <= d1
ORDER BY (f_mmdd(e.event_date) >= d) DESC, f_mmdd(e.event_date), event_date;
-- chronological across turn of the year
END IF;
END
$func$;

Call using defaults: 14 days beginning "today":

SELECT * FROM f_anniversary();

Call for 7 days beginning '2014-08-23':

SELECT * FROM f_anniversary(date '2014-08-23', 7);

db<>fiddle here - comparing EXPLAIN ANALYZE

"February 29"

When dealing with anniversaries or "birthdays", you need to define how to deal with the special case "February 29" in leap years.

When testing for ranges of dates, Feb 29 is usually included automatically, even if the current year is not a leap year. The range of days is extended by 1 retroactively when it covers this day.

On the other hand, if the current year is a leap year, and you want to look for 15 days, you may end up getting results for 14 days in leap years if your data is from non-leap years.

Say, Bob is born on the 29th of February:

My query 1. and 2. include February 29 only in leap years. Bob has birthday only every ~ 4 years.

My query 3. includes February 29 in the range. Bob has birthday every year.

There is no magical solution. You have to define what you want for every case.

Test

To substantiate my point I ran an extensive test with all the presented solutions. I adapted each of the queries to the given table and to yield identical results without ORDER BY.

The good news: all of them are correct and yield the same result - except for Gordon's query that had syntax errors, and @wildplasser's query that fails when the year wraps around (easy to fix).

Insert 108000 rows with random dates from the 20th century, which is similar to a table of living people (13 or older).

INSERT INTO  event (event_date)
SELECT '2000-1-1'::date - (random() * 36525)::int
FROM generate_series (1, 108000);

Delete ~ 8 % to create some dead tuples and make the table more "real life".

DELETE FROM event WHERE random() < 0.08;
ANALYZE event;

My test case had 99289 rows, 4012 hits.

C - Catcall

WITH anniversaries as (
SELECT event_id, event_date
,(event_date + (n || ' years')::interval)::date anniversary
FROM event, generate_series(13, 113) n
)
SELECT event_id, event_date -- count(*) --
FROM anniversaries
WHERE anniversary BETWEEN current_date AND current_date + interval '14' day;

C1 - Catcall's idea rewritten

Aside from minor optimizations, the major difference is to add only the exact amount of years date_trunc('year', age(current_date + 14, event_date)) to get this year's anniversary, which avoids the need for a CTE altogether:

SELECT event_id, event_date
FROM event
WHERE (event_date + date_trunc('year', age(current_date + 14, event_date)))::date
BETWEEN current_date AND current_date + 14;

D - Daniel

SELECT *   -- count(*)   -- 
FROM event
WHERE extract(month FROM age(current_date + 14, event_date)) = 0
AND extract(day FROM age(current_date + 14, event_date)) <= 14;

E1 - Erwin 1

See "1. Simple version" above.

E2 - Erwin 2

See "2. Advanced version" above.

E3 - Erwin 3

See "3. Black magic version" above.

G - Gordon

SELECT * -- count(*)   
FROM (SELECT *, to_char(event_date, 'MM-DD') AS mmdd FROM event) e
WHERE to_date(to_char(now(), 'YYYY') || '-'
|| (CASE WHEN mmdd = '02-29' THEN '02-28' ELSE mmdd END)
,'YYYY-MM-DD') BETWEEN date(now()) and date(now()) + 14;

H - a_horse_with_no_name

WITH upcoming as (
SELECT event_id, event_date
,CASE
WHEN date_trunc('year', age(event_date)) = age(event_date)
THEN current_date
ELSE cast(event_date + ((extract(year FROM age(event_date)) + 1)
* interval '1' year) AS date)
END AS next_event
FROM event
)
SELECT event_id, event_date
FROM upcoming
WHERE next_event - current_date <= 14;

W - wildplasser

CREATE OR REPLACE FUNCTION this_years_birthday(_dut date)
RETURNS date
LANGUAGE plpgsql AS
$func$
DECLARE
ret date;
BEGIN
ret := date_trunc('year' , current_timestamp)
+ (date_trunc('day' , _dut)
- date_trunc('year' , _dut));
RETURN ret;
END
$func$;

Simplified to return the same as all the others:

SELECT *
FROM event e
WHERE this_years_birthday( e.event_date::date )
BETWEEN current_date
AND current_date + '2weeks'::interval;

W1 - wildplasser's query rewritten

The above suffers from a number of inefficient details (beyond the scope of this already sizable post). The rewritten version is much faster:

CREATE OR REPLACE FUNCTION this_years_birthday(_dut INOUT date)
LANGUAGE sql AS
$func$
SELECT (date_trunc('year', now()) + ($1 - date_trunc('year', $1)))::date
$func$;

SELECT *
FROM event e
WHERE this_years_birthday(e.event_date) BETWEEN current_date
AND (current_date + 14);

Test results

I ran this test with a temporary table on PostgreSQL 9.1.7.
Results were gathered with EXPLAIN ANALYZE, best of 5.

Results


Without index
C: Total runtime: 76714.723 ms
C1: Total runtime: 307.987 ms -- !
D: Total runtime: 325.549 ms
E1: Total runtime: 253.671 ms -- !
E2: Total runtime: 484.698 ms -- min() & max() expensive without index
E3: Total runtime: 213.805 ms -- !
G: Total runtime: 984.788 ms
H: Total runtime: 977.297 ms
W: Total runtime: 2668.092 ms
W1: Total runtime: 596.849 ms -- !

With index
E1: Total runtime: 37.939 ms --!!
E2: Total runtime: 38.097 ms --!!

With index on expression
E3: Total runtime: 11.837 ms --!!

All other queries perform the same with or without index because they use non-sargable expressions.

Conclusion

  • So far, @Daniel's query was the fastest.

  • @wildplassers (rewritten) approach performs acceptably, too.

  • @Catcall's version is something like the reverse approach of mine. Performance gets out of hand quickly with bigger tables.

    The rewritten version performs pretty well, though. The expression I use is something like a simpler version of @wildplassser's this_years_birthday() function.

  • My "simple version" is faster even without index, because it needs fewer computations.

  • With index, the "advanced version" is about as fast as the "simple version", because min() and max() become very cheap with an index. Both are substantially faster than the rest which cannot use the index.

  • My "black magic version" is fastest with or without index. And it is very simple to call.

    The updated version (after the benchmark) is a bit faster, yet.

  • With a real life table an index will make even greater difference. More columns make the table bigger, and sequential scan more expensive, while the index size stays the same.

Calculate days between two dates ignoring year

So, the "basic" concept is, you want to take your list of dates and change the year to match this year. The "catch" is, if the resulting date is before today, you should increment the year by one, so that if it's December now, you will catch all the anniversaries which occur in January.

Maybe something like...

LocalDate now = LocalDate.now();
int year = now.getYear();
List<LocalDate> dates = ...;
List<LocalDate> adjusted = new ArrayList<>(10);
for (LocalDate date : dates) {
LocalDate warped = date.withYear(year);
if (warped.isBefore(now)) {
warped = warped.withYear(year + 1);
}
adjusted.add(warped);
}

Then you would simply check to see if the dates fall within your required range...

LocalDate limit = now.plusDays(30);
for (LocalDate date : adjusted) {
if ((date.isAfter(now) || date.isEqual(now)) && (date.isBefore(limit) || date.isEqual(limit))) {
System.out.println("~~ " + date);
}
}

So, with same pseudo, randomly generated date, I can get a result which looks something like...

Input date 2019-04-19
+---------------+---------------+--------------+
| Original Date | Adjusted Date | Within range |
+---------------+---------------+--------------+
| 1996-04-13 | 2020-04-13 | |
| 1986-04-24 | 2019-04-24 | X |
| 1989-04-23 | 2019-04-23 | X |
| 1960-05-11 | 2019-05-11 | X |
| 1986-05-18 | 2019-05-18 | X |
| 1984-04-06 | 2020-04-06 | |
| 1997-05-29 | 2019-05-29 | |
| 2008-03-31 | 2020-03-31 | |
| 2014-04-18 | 2020-04-18 | |
| 1982-04-23 | 2019-04-23 | X |
+---------------+---------------+--------------+

And if we change the anchor date to something like 2019-12-20, it could generate something like...

+---------------+---------------+--------------+
| Original Date | Adjusted Date | Within range |
+---------------+---------------+--------------+
| 2001-12-16 | 2020-12-16 | |
| 2005-12-28 | 2019-12-28 | X |
| 1988-12-31 | 2019-12-31 | X |
| 1989-11-13 | 2020-11-13 | |
| 1976-11-13 | 2020-11-13 | |
| 1991-01-09 | 2020-01-09 | X |
| 1963-11-04 | 2020-11-04 | |
| 2001-11-02 | 2020-11-02 | |
| 1980-01-11 | 2020-01-11 | X |
| 1979-11-17 | 2020-11-17 | |
+---------------+---------------+--------------+

So it's capturing the dates which land in next year.

nb: I randomly generate my test date to be within +/- one month of the anchor date so I would get better test data.

How to aggregate data from multiple years on MM-DD, ignoring year

Basically, to cut off the year, to_char(time, 'MMDD') like you already tried does the job. You just forgot to also apply it to the timestamps generated with generate_series()before joining. And some other minor details.

To simplify and for performance and convenience I suggest this simple function to calculate an integer from the pattern 'MMDD' of a given timestamp.

CREATE FUNCTION f_mmdd(date) RETURNS int LANGUAGE sql IMMUTABLE AS
'SELECT (EXTRACT(month FROM $1) * 100 + EXTRACT(day FROM $1))::int';

I used to_char(time, 'MMDD') at first, but switched to the above expression that turned out to be fastest in various tests.

db<>fiddle here

It can be used in expression indexes since it's defined IMMUTABLE. And it still allows function inlining because it only uses EXTRACT (xyz FROM date) - which is implemented with the IMMUTABLE function date_part(text, date) internally. (Note that datepart(text, timestamptz) is only STABLE).

Then this kind of query does the job:

SELECT d.mmdd, COALESCE(ct.ct, 0) AS total_count
FROM (
SELECT f_mmdd(d::date) AS mmdd -- ignoring the year
FROM generate_series(timestamp '2018-01-01' -- any dummy year
, timestamp '2018-12-31'
, interval '1 day') d
) d
LEFT JOIN (
SELECT f_mmdd(time::date) AS mmdd, count(*) AS ct
FROM counties c
JOIN ltg_data d ON ST_contains(c.the_geom, d.ltg_geom)
WHERE cwa = 'MFR'
GROUP BY 1
) ct USING (mmdd)
ORDER BY 1;

Since time (I would use a different column name) is data type timestamptz the cast time::date depends on the time zone setting of your current session. ("Days" are defined by the time zone you are in.) To get immutable (but slower) results use the AT TIME ZONE construct with a time zone name like:

SELECT f_mmdd((time AT TIME ZONE 'Europe/Vienna')::date) ...

Details:

  • Ignoring time zones altogether in Rails and PostgreSQL

Format mmdd any way you like for display.

The cast to integer is optional for the purpose of this particular query. But since you plan to do all kinds of queries, you'll end up wanting an index on the expression:

CREATE INDEX ltg_data_mmdd_idx ON event(f_mmdd(time));

(Not needed for this query.)

integer is a bit faster for this purpose.
And you need the (otherwise optional) function wrapper for this since to_char() is only defined STABLE, but we need IMMUTABLE for the index. The updated expression (EXTRACT(month FROM $1) * 100 + EXTRACT(day FROM $1))::int is IMMUTABLE, but the function wrapper is still convenient.

Related:

  • How do you do date math that ignores the year?
  • Generating time series between two dates in PostgreSQL

Find out if date is between two dates, ignoring year

The problem in your example is that (in the same year) the upper bounding date is before the lower bound. In that case, Any date less than the upper bound (Jan 1 - Mar 14) or greater than the lower bound (Nov 23 - Dec 31) falls between the two.

<?php
$upperBound = new DateTime("Mar 15");
$lowerBound = new DateTime("Nov 22");
$checkDate = new DateTime("Feb 1");

if ($lowerBound < $upperBound) {
$between = $lowerBound < $checkDate && $checkDate < $upperBound;
} else {
$between = $checkDate < $upperBound || $checkDate > $lowerBound;
}
var_dump($between);
?>

Displays:
boolean true

Edit

If the date you want to check is "Feb 29" and the current year is not a leap year, then DateTime interprets it as "Mar 1".

To check if a date falls between two dates, inclusively, use:

if ($lowerBound < $upperBound) {
$between = $lowerBound <= $checkDate && $checkDate <= $upperBound;
} else {
$between = $checkDate <= $upperBound || $checkDate >= $lowerBound;
}

Calculate days between dates without extra dates in leap years

You are wrong with with this statement:

So, for example between 1st Jan. 2012 and 1st Jan. 2013 should result 365 instead of 366.

JodaTime returns 366 in this situation, tested.

Solution without JodaTime

GregorianCalendar start = new GregorianCalendar(2012, 0, 1);
GregorianCalendar end = new GregorianCalendar(2013, 0, 1);
int days = (end.get(Calendar.YEAR) - start.get(Calendar.YEAR)) * 365 +
(end.get(Calendar.DAY_OF_YEAR) - start.get(Calendar.DAY_OF_YEAR));
System.out.println(days);

returns 365.

Date algorithm - Add days without considering leap years

Here is an extension method that works. Will also work if you're adding or subtracting enough days to span multiple leap years.

    public static DateTime AddDaysWithoutLeapYear(this DateTime input, int days)
{
var output = input;

if (days != 0)
{
var increment = days > 0 ? 1 : -1; //this will be used to increment or decrement the date.
var daysAbs = Math.Abs(days); //get the absolute value of days to add
var daysAdded = 0; // save the number of days added here
while (daysAdded < daysAbs)
{
output = output.AddDays(increment);
if (!(output.Month == 2 && output.Day == 29)) //don't increment the days added if it is a leap year day
{
daysAdded++;
}
}
}
return output;
}


Related Topics



Leave a reply



Submit