#16 correct timezone usage in Postgres
Closed: Fixed None Opened 9 years ago by jsedlak.

There are several problems with timezone usage in resultsdb and PostgreSQL that cause bad results on datetime-based queries (e. g. since). Problem is that Python function datetime.datetime.utcnow() returns UTC datetime, but it is "timezone-unaware" (it doesn't have timezone set). PostgreSQL and psycopg2 handles timezone in datetime objects so that when it is missing timezone info, it thinks that it's in localtime. resultsdb uses timestamp without time zone ([[ http://docs.sqlalchemy.org/en/rel_0_9/core/types.html#sqlalchemy.types.DateTime | see documentation ]]) type for saving datetimes. When you make query, it is parsed by ISO 8601 parser that mades "timezone-aware" datetime object. Psycopg2 (or Postgres) then converts it into localtime using timezone info and when it makes query, it compares it with timestamp without time zone (that is, in fact, in UTC).

Best solution for this problem would be to implement our own function utcnow() that returns timezone-aware datetime object && use timestamp with time zone in Postgres, but for now, stripping all timezone info from every since query (in parse_since) would be sufficient.


All of our machines should have localtime in UTC (if they don't let me know and I can fix that), does that help mitigate the issue some in practice (even if it's not the best long term solution)?

If I'm understanding you correctly, when doing datetime-based queries against resultsdb, part of the process will strip off any timezone data and mis-adjust the incoming datetime object, potentially skewing the datetime object by whatever the utc offset of the resultsdb server is?

My understanding of your proposed solution is:
change the resultsdb schema to use timezones for datetime columns
write a timezone-aware datetime.datetime.utcnow() wrapper in resultsdb and use that instead of plan utcnow() when handling datetime-related queries

Does that match what you were getting at?

If all the machines are set to run in UTC, then the issue is mitigated.

The problem is not that the process would be stripping timezones on queries, but quite the opposite - the database is now not storing timezone info in the respective columns, so all the data is considered to be localtime (note, that UTC the value stored in database is UTC time, since it is the product of utcnow). Rresultsdb query calls are then filled with datetime objects containing UTC tzinfo, and either the engine or the database itself then converts the provided datetime objects from UTC to localtime.

Stripping the tzinfo from the datetime object passed to queries is a workaround (and quite probable not that bad solution eihter) - since the data in the database are in UTC already, and we accept UTC time for the since parameter, stripping the tzinfo results in the anticipated behaviour.

Other solution, that might be "cleaner", is what you said - tzinfo aware utcnow + timestamp with timezone in the database.

Login to comment on this ticket.

Metadata