Time Zone Storage in Data Type "Timestamp With Time Zone"

Time zone storage in data type timestamp with time zone

This is just a misunderstanding stemming from the somewhat misleading type name. The time zone itself is not stored at all. It just acts as offset to compute a UTC timestamp (input), which is actually stored. Or as decorator in the display of a timestamp according to the current or given time zone (output). That's all according to the SQL standard.

Just the point in time is stored, no zone information. That's why 64 bit of information is enough. The timestamp is displayed to the client according to the current time zone setting of the session.

Details:

  • Ignoring time zones altogether in Rails and PostgreSQL

Also, since Jon mentioned it, time with time zone is defined in the SQL standard and thus implemented in Postgres, but its use is discouraged:

time with time zone is defined by the SQL standard, but the definition
exhibits properties which lead to questionable usefulness.

It's an inherently ambiguous type that cannot deal with DST properly.

Difference between timestamps with/without time zone in PostgreSQL

The differences are covered at the PostgreSQL documentation for date/time types. Yes, the treatment of TIME or TIMESTAMP differs between one WITH TIME ZONE or WITHOUT TIME ZONE. It doesn't affect how the values are stored; it affects how they are interpreted.

The effects of time zones on these data types is covered specifically in the docs. The difference arises from what the system can reasonably know about the value:

  • With a time zone as part of the value, the value can be rendered as a local time in the client.

  • Without a time zone as part of the value, the obvious default time zone is UTC, so it is rendered for that time zone.

The behaviour differs depending on at least three factors:

  • The timezone setting in the client.
  • The data type (i.e. WITH TIME ZONE or WITHOUT TIME ZONE) of the value.
  • Whether the value is specified with a particular time zone.

Here are examples covering the combinations of those factors:

foo=> SET TIMEZONE TO 'Japan';
SET
foo=> SELECT '2011-01-01 00:00:00'::TIMESTAMP;
timestamp
---------------------
2011-01-01 00:00:00
(1 row)

foo=> SELECT '2011-01-01 00:00:00'::TIMESTAMP WITH TIME ZONE;
timestamptz
------------------------
2011-01-01 00:00:00+09
(1 row)

foo=> SELECT '2011-01-01 00:00:00+03'::TIMESTAMP;
timestamp
---------------------
2011-01-01 00:00:00
(1 row)

foo=> SELECT '2011-01-01 00:00:00+03'::TIMESTAMP WITH TIME ZONE;
timestamptz
------------------------
2011-01-01 06:00:00+09
(1 row)

foo=> SET TIMEZONE TO 'Australia/Melbourne';
SET
foo=> SELECT '2011-01-01 00:00:00'::TIMESTAMP;
timestamp
---------------------
2011-01-01 00:00:00
(1 row)

foo=> SELECT '2011-01-01 00:00:00'::TIMESTAMP WITH TIME ZONE;
timestamptz
------------------------
2011-01-01 00:00:00+11
(1 row)

foo=> SELECT '2011-01-01 00:00:00+03'::TIMESTAMP;
timestamp
---------------------
2011-01-01 00:00:00
(1 row)

foo=> SELECT '2011-01-01 00:00:00+03'::TIMESTAMP WITH TIME ZONE;
timestamptz
------------------------
2011-01-01 08:00:00+11
(1 row)

What timezone information does PostgreSQL store?

Both your assumptions are wrong:

PostgreSQL stores a timestamp with time zone as 8-byte integer that contains the offset from 2000-01-01 00:00:00 UTC in microseconds.

So it neither stores the time zone, nor is the precision 1 minute.

Upon conversion to a string, the timestamp is formatted according to the current setting of the timezone parameter.

So if you have to store the time zone separately if you need to remember it and use the AT TIME ZONE expression to convert the timestamp to the proper time zone.

You ask for documentation references. Part of that is here:

/*
* Timestamp represents absolute time.
[...]
* Timestamps, as well as the h/m/s fields of intervals, are stored as
* int64 values with units of microseconds. (Once upon a time they were
* double values with units of seconds.)

In the same file, you find

/* Julian-date equivalents of Day 0 in Unix and Postgres reckoning */
#define UNIX_EPOCH_JDATE 2440588 /* == date2j(1970, 1, 1) */
#define POSTGRES_EPOCH_JDATE 2451545 /* == date2j(2000, 1, 1) */

How does PostgreSQL store datetime types internally

Both timestamp with time zone and timestamp without time zone are stored as an 8 byte integer representing microseconds since midnight on Jan 1 2000.

The difference is that timestamp with time zone is interpreted as being in UTC and is converted according to the current setting of the timezone parameter on display.

The definitions are in src/include/datatype/timestamp.h:

typedef int64 Timestamp;
typedef int64 TimestampTz;

time without time zone is a 8 byte integer representing microseconds since midnight (a 4 byte integer wouldn't suffice). See src/include/utils/date.h:

typedef int64 TimeADT;

time with time zone has an additional 4 byte integer representing the time zone offset in seconds.

See src/include/utils/date.h:

typedef struct
{
TimeADT time; /* all time units other than months and years */
int32 zone; /* numeric time zone, in seconds */
} TimeTzADT;

Follow the documentation and avoid time with time zone:

The type time with time zone is defined by the SQL standard, but the definition exhibits properties which lead to questionable usefulness.

Convert timestamp without timezone into timestamp with timezone

You're looking for AT TIME ZONE, which is a little confusing but can do everything you need it to.

To convert a timestamp known to be UTC to a timestamptz:

SELECT '2021-06-24 11:00:00'::timestamp AT TIME ZONE 'UTC';
=> 2021-06-24 06:00:00-05

All timestamptz values are stored internally in Postgres in UTC, and do not retain the timezone they were inserted at, so they will then show up in whatever your connection timezone is. You can cast a timestamptz back to a timestamp in your desired zone by using AT TIME ZONE again (as Erwin pointed out below, AT TIME ZONE always switches between timestamp and timestamptz):

SELECT '2021-06-24 11:00:00-00'::timestamptz AT TIME ZONE 'UTC';
=> 2021-06-24 11:00:00

SELECT '2021-06-24 11:00:00-00'::timestamptz AT TIME ZONE 'America/Chicago';
=> 2021-06-24 06:00:00

Postgres timestamp and time zone for raw UTC values

Not a moment

As others said, TIMESTAMP WITHOUT TIME ZONE is the wrong data type.

That type holds only a date with time-of-day, for example noon on January 21st of 2021. But we cannot know if that means noon in Tokyo Japan, noon in Toulouse France, or noon in Toledo Ohio US — three very different moments, several hours apart from one another. So this type cannot represent a moment, is not a specific moment on the timeline.

The TIMESTAMP WITHOUT TIME ZONE type is good for three kinds of use-cases:

  • Representing multiple moments all known the same in their locality. For example, Acme Corp orders the manager at each factory in Delhi, Düsseldorf, and Detroit, to make an announcement in two days at their local time of noon.
  • Representing a date and time-of-day where the intended time zone is unknown. I consider this faulty data that should be rejected. But if you insist on writing it to the database, this type would be appropriate.
  • Booking future appointments where we want to keep the time-of-day even if those pesky politicians change the offset of the time zone(s) in their jurisdiction. These political changes happen surprisingly often. So book an appointment using two columns: TIMESTAMP WITHOUT TIME ZONE in one, and the name of the intended time zone in another. Time zones are named with Continent/Region format such as Africa/Tunis. At runtime, when you need a moment for calendaring, apply the time zone to the date and time to dynamically determine a moment according to the now-current time zone rules. In Noda Time, you would retrieve a LocalDateTime and time zone, to produce a ZonedDateTime for calendaring.

Moments

When you care about moments, specific points on the timeline, use TIMESTAMP WITH TIME ZONE.

In Postgres and many other databases, this type has a bit of a misnomer. The time zone is not actually saved with the date and time. Instead, upon submission to the database, any indicator of time zone or offset-from-UTC is used to adjust to UTC (an offset of zero hours-minutes-seconds). That UTC value is what Postgres writes to the database. And UTC is what Postgres always retrieves from the database.

Beware: Some middleware and tooling has the well-intentioned but very confusing anti-feature of dynamically applying a default time zone to the retrieved value, adjusting from UTC to that time zone. This creates the illusion of that time zone having been saved with the data. But, no, not so. A TIMESTAMP WITH TIME ZONE column stores only UTC values.

Use case examples:

  • Tracking the moment when a database record was created, modified, or deleted.
  • Logging for debugging or sysadmin work.
  • Tracking when a critical contract is signed or comes into effect.

Best practices with saving datetime & timezone info in database when data is dependant on datetime

Hugo's answer is mostly correct, but I'll add a few key points:

  • When you're storing the customer's time zone, do NOT store a numerical offset. As others have pointed out, the offset from UTC is only for a single point in time, and can easily change for DST and for other reasons. Instead, you should store a time zone identifier, preferably an IANA time zone identifier as a string, such as "America/Los_Angeles". Read more in the timezone tag wiki.

  • Your OrderDateTime field should absolutely represent the time in UTC. However, depending on your database platform, you have several choices for how to store this.

    • For example, if using Microsoft SQL Server, a good approach is to store the local time in a datetimeoffset column, which preserves the offset from UTC. Note that any index you create on that column will be based on the UTC equivalent, so you will get good query performance when doing your range query.

    • If using other database platforms, you may instead wish to store the UTC value in a timestamp field. Some databases also have timestamp with time zone, but understand that it doesn't mean it stores the time zone or offset, it just means that it can do conversions for you implicitly as you store and retrieve values. If you intend to always represent UTC, then often timestamp (without time zone) or just datetime is more appropriate.

  • Since either of the above methods will store a UTC time, you'll also need to consider how to perform operations that need an index of local time values. For example, you might need to create a daily report, based on the day of the user's time zone. For that, you'd need to group by the local date. If you try to compute that at query time from your UTC value, you'll end up scanning the entire table.

    A good approach to deal with this is to create a separate column for the local date (or perhaps even the local datetime depending on your needs, but not a datetimeoffset or timestamp). This could be a completely isolated column that you populate separately, or it could be a computed/calculated column based on your other column. Use this column in an index so you can filter or group by local date.

  • If you go for the computed-column approach, you'll need to know how to convert between time zones in the database. Some databases have a convert_tz function built-in that understands IANA time zone identifiers.

    If you're using Microsoft SQL Server, you can use the new AT TIME ZONE function in SQL 2016 and Azure SQL DB, but that only works with Microsoft time zone identifiers. To use IANA time zone identifiers, you'll need a third party solution, such as my SQL Server Time Zone Support project.

  • At query time, avoid using the BETWEEN statement. It is fully inclusive. It works ok for whole dates, but when you have time involved you're better off doing a half-open range query, such as:

    ... WHERE OrderDateTime >= @t1 AND OrderDateTime < @t2

    For example, if @t1 were the start of today, @t2 would be the start of tomorrow.

Regarding the scenario discussed in comments where the user's time zone has changed:

  • If you choose to calculate the local date in the database, the only scenario you need to worry about is if a location or business switches time zones without a "zone split" occurring. A zone split is when a new time zone identifier is introduced which covers the area that changed, including their old and new rules.

    For example, the latest zone added to the IANA tzdb at the time of writing this is America/Punta_Arenas, which was a zone split when the southern part of Chile decided to stay at UTC-3 when the rest of Chile (America/Santiago) went back to UTC-4 at the end of DST.

    However, if a minor locality on the border of two time zones decides to change which side they follow, and a zone split wasn't warranted, then you'd potentially be using the rules of their new time zone against their old data.

  • If you store the local date separately (computed in the application, not the DB), then you'll have no problems. The user changes their time zone to the new one, all old data is still intact, and new data is stored with the new time zone.

Should I use the datetime or timestamp data type in MySQL?

Timestamps in MySQL are generally used to track changes to records, and are often updated every time the record is changed. If you want to store a specific value you should use a datetime field.

If you meant that you want to decide between using a UNIX timestamp or a native MySQL datetime field, go with the native DATETIME format. You can do calculations within MySQL that way
("SELECT DATE_ADD(my_datetime, INTERVAL 1 DAY)") and it is simple to change the format of the value to a UNIX timestamp ("SELECT UNIX_TIMESTAMP(my_datetime)") when you query the record if you want to operate on it with PHP.

Easiest way to convert YYYYMMDD to the midnight with specific timezone on PostgreSQL?

Strictly speaking, your request is an oxymoron:

I want to get 2021-01-01 00:00:00.000000 in my TZ.

You show a timestamp (timestamp without time zone) literal, which is completely orthogonal to (and ignorant of) the concept of time zones.

But you want it "in my TZ", which would imply to a timestamptz (timestamp with time zone) value, where the corresponding literal includes a time offset like: 2021-01-01 00:00:00.000000+01.

Since the format YYYYMMDD is unambiguous ISO format, you can cast to date or timestamp directly, safely. A cast to timestamp assumes the time component 00:00 automatically. Produces your desired timestamp '2021-01-01 00:00'.

SELECT '20211203'::timestamp;

If you want the result type timestamp, we are done here.

If you want the result type timestamptz, there is a quick-and-dirty shortcut:

SELECT '20211203'::timestamptz;

The current time zone setting is assumed for the type cast. But this introduces a dependency on a runtime settings. Notoriously unreliable, only advisable for situations where you can be certain of the current setting ...

The sure and generally advisable way is to define the target time zone with the AT TIME ZONE construct explicitly. Say, your timezone is 'Europe/Vienna':

SELECT '20211203'::timestamp AT TIME ZONE 'Europe/Vienna';

Use a time zone name. Time zone abbreviations are treacherous for input conversion and may fail for daylight saving time (DST) or other bureaucratic nonsense. 'CET' (Central European Time) is appropriate for timestamps during "standard time". During DST periods, you'd have to use 'CEST' (Central European Summer Time).

DST is utter nonsense, but some countries, including the EU, still haven't managed to get rid of it.

db<>fiddle here - note that dbfiddle runs with time zone GB by default.

Now you have the timestamptz value representing the start of the day (00:00) in your given time zone. Don't be fooled by the display of timestamptz values. That's always adjusted to the current time zone setting, but it always represents that unique point in time, just with different ways to display it.

You do understand that the time zone itself is never stored in a timestamptz value, right? Even though timestamp with time zone sounds like it might. See:

  • Time zone storage in data type "timestamp with time zone"

To force a certain display use to_char() or some other functions to generate the desired string. There is a dedicated page Data Type Formatting Functions in the manual.

Related:

  • https://www.postgresql.org/docs/current/view-pg-timezone-names.html
  • Ignoring time zones altogether in Rails and PostgreSQL


Related Topics



Leave a reply



Submit