How to Create a Dates Table in Redshift

How do I create a dates table in Redshift?

In asking the question, I figured it out. Oops.

I started with a "facts" schema.

CREATE SCHEMA facts;

Run the following to start a numbers table:

create table facts.numbers
(
number int PRIMARY KEY
)
;

Use this to generate your number list. I used a million to get started

SELECT ',(' || generate_series(0,1000000,1) || ')'
;

Then copy-paste the numbers from your results in the query below, after VALUES:

INSERT INTO facts.numbers
VALUES
(0)
,(1)
,(2)
,(3)
,(4)
,(5)
,(6)
,(7)
,(8)
,(9)
-- etc

^ Make sure to remove the leading comma from the copy-pasted list of numbers

Once you have a numbers table, then you can generate a dates table (again, stealing code from elliot land http://elliot.land/post/building-a-date-dimension-table-in-redshift ) :

CREATE TABLE facts.dates (
"date_id" INTEGER NOT NULL PRIMARY KEY,

-- DATE
"full_date" DATE NOT NULL,

-- YEAR
"year_number" SMALLINT NOT NULL,
"year_week_number" SMALLINT NOT NULL,
"year_day_number" SMALLINT NOT NULL,

-- QUARTER
"qtr_number" SMALLINT NOT NULL,

-- MONTH
"month_number" SMALLINT NOT NULL,
"month_name" CHAR(9) NOT NULL,
"month_day_number" SMALLINT NOT NULL,

-- WEEK
"week_day_number" SMALLINT NOT NULL,

-- DAY
"day_name" CHAR(9) NOT NULL,
"day_is_weekday" SMALLINT NOT NULL,
"day_is_last_of_month" SMALLINT NOT NULL
) DISTSTYLE ALL SORTKEY (date_id)
;

INSERT INTO facts.dates
(
"date_id"
,"full_date"
,"year_number"
,"year_week_number"
,"year_day_number"

-- QUARTER
,"qtr_number"

-- MONTH
,"month_number"
,"month_name"
,"month_day_number"

-- WEEK
,"week_day_number"

-- DAY
,"day_name"
,"day_is_weekday"
,"day_is_last_of_month"
)
SELECT
cast(seq + 1 AS INTEGER) AS date_id,

-- DATE
datum AS full_date,

-- YEAR
cast(extract(YEAR FROM datum) AS SMALLINT) AS year_number,
cast(extract(WEEK FROM datum) AS SMALLINT) AS year_week_number,
cast(extract(DOY FROM datum) AS SMALLINT) AS year_day_number,

-- QUARTER
cast(to_char(datum, 'Q') AS SMALLINT) AS qtr_number,

-- MONTH
cast(extract(MONTH FROM datum) AS SMALLINT) AS month_number,
to_char(datum, 'Month') AS month_name,
cast(extract(DAY FROM datum) AS SMALLINT) AS month_day_number,

-- WEEK
cast(to_char(datum, 'D') AS SMALLINT) AS week_day_number,

-- DAY
to_char(datum, 'Day') AS day_name,
CASE WHEN to_char(datum, 'D') IN ('1', '7')
THEN 0
ELSE 1 END AS day_is_weekday,
CASE WHEN
extract(DAY FROM (datum + (1 - extract(DAY FROM datum)) :: INTEGER +
INTERVAL '1' MONTH) :: DATE -
INTERVAL '1' DAY) = extract(DAY FROM datum)
THEN 1
ELSE 0 END AS day_is_last_of_month
FROM
-- Generate days for 81 years starting from 2000.
(
SELECT
'2000-01-01' :: DATE + number AS datum,
number AS seq
FROM facts.numbers
WHERE number between 0 and 81 * 365 + 20
) DQ
ORDER BY 1;

^ Be sure to set the numbers at the end for the date range you need

How can you generate a date list from a range in Amazon Redshift?

a hack, but works:

use a table with many many rows, and a window function to generate the series

this works as long as you are generating a series that is smaller than the number of rows in the table you're using to generate the series

WITH x(dt) AS (SELECT '2016-01-01'::date)
SELECT
dateadd(
day,
COUNT(*) over(rows between unbounded preceding and current row) - 1,
dt)
FROM users, x
LIMIT 100

the initial date 2016-01-01 controls the start date, and the limit controls the number of days in the generated series.

Update: * Will only run on the leader node

Redshift has partial support for the generate_series function but unfortunately does not mention it in their documentation.

This will work and is the shortest & most legible way of generating a series of dates as of this date (2018-01-29):

SELECT ('2016-01-01'::date + x)::date 
FROM generate_series(1, 100, 1) x

create table with dates - sql

demo:db<>fiddle

WITH dates AS (
SELECT
date_trunc('month', CURRENT_DATE) AS first_day_of_month,
date_trunc('month', CURRENT_DATE) + interval '1 month -1 day' AS last_day_of_month
)
SELECT
generate_series(first_day_of_month, last_day_of_month, interval '1 day')::date
FROM dates
  • date_trunc() truncates a type date (or timestamp) to a certain date part. date_trunc('month', ...) removes all parts but year and month. All other parts are set to their lowest possible values. So, the day part is set to 1. That's why you get the first day of month with this.
  • adding a month returns the first of the next month, subtracting a day from this results in the last day of the current month.
  • Finally you can generate a date series with start and end date using the generate_series() function

Edit: Redshift does not support generate_series() with type date and timestamp but with integer. So, we need to create an integer series instead and adding the results to the first of the month:

db<>fiddle

WITH dates AS (
SELECT
date_trunc('month', CURRENT_DATE) AS first_day_of_month,
date_trunc('month', CURRENT_DATE) + interval '1 month -1 day' AS last_day_of_month
)
SELECT
first_day_of_month::date + gs
FROM
dates,
generate_series(
date_part('day', first_day_of_month)::int - 1,
date_part('day', last_day_of_month)::int - 1
) as gs

Date format in Redshift create table

The internal data format does not distinguish formatting, but you can use TO_DATE and TO_CHAR specify your preferred output format during INSERT or SELECT, respectively.

To modify the date format during SELECT:

SELECT TO_CHAR(DATES, 'YYYY/MM/DD'), * FROM data

To use the modified date format during INSERT:

INSERT INTO date (DATES /*... other columns here*/)
VALUES (TO_DATE('2016/12/01', 'YYYY/MM/DD' /* ... other values here*/)

Amazon Redshift: How to create a table containing time series

Unfortunately, generate_series isn't supported in Redshift.

My team is using a CTE like this to get a series of consecutive dates starting from a particular date:

with dates as (
select
(trunc(getdate()) + row_number() over (order by 1))::date as date
from
large_enough_table
limit
150
)

Then you can use it as:

select date
from dates
order by date
limit 5

And get:

2018-12-13
2018-12-14
...


Related Topics



Leave a reply



Submit