Proposal for date literals of form 1999_01_01

I work with a codebase with a lot of date fields similar to:

    orderDetails = {
       "deliveryDate":  datetime.date(2023, 12, 31),
       "paymentDate" :  datetime.date(2024, 1, 5),
    }

This is very verbose.
One could make it less verbose by switching to “dt.date(2023, 12, 31)”, but this still depends on presence of “import datetime as dt” above.

Instead, I propose to allow date literals to be declared using pattern “\d\d\d\d_\d\d_\d\d”, for example 1999_01_01:

   deliveryDate = 2023_12_31  # proposed. Would evaluate to datetime.date(2023,12,31)
   deliveryDate = 2023_12_32  # proposed - would throw "SyntaxError: day is out of range for month"

P.S. This new syntax would override PEP515 for this exact pattern.

References:

  1. ISO 8601 (ISO - ISO 8601 — Date and time format).
  2. PEP515 allows"underscored literals"
  3. KDB vector database allows date literals to be declared using “dot syntax” - // link: Data types | Basics | kdb+ and q documentation - Kdb+ and q documentation
    deliveryDate = 2023-12-31   # ISO-like syntax unfortunately evaluates to 1980
    deliveryDate = 2023_12_31  # PEP515 syntax, evaluates to integer 20231231
    deliveryDate: 2023.12.31     // KDB date literal syntax  

Such a backwards incompatibility is a non-starter.

5 Likes

Sounds like a great idea for a short helper function!

def date(n):
    return datetime.date(n % 10000, (n // 100) % 100, n % 100)

Obviously this won’t know where you placed your underscores, but as long as you stick to the strict format you described, this will work.

1 Like

Even less verbose, and works now.

import datetime.date as D
date = D(2023, 12, 31)
2 Likes

If one insists, e.g. from datetime import date as d makes it just d(2023, 12, 31). However, it is part of Python philosophy that “explicit is better than implicit”.

More to the point: large scale, real-world programs don’t hard-code large amounts of data. Choose or determine an appropriate file format, read data from disk, and create data structures that way. Or use a database system.

Not happening. This kind of backward compatibility break is an incredibly large cost, and new features “start at -100” anyway.

2 Likes

But they can and SHOULD hard-code what is appropriate.

Source code literals are vital.

1 Like

Thanks Karl. My apps actually communicate via JSON.
Sometimes megabytes of JSON, full of dates.

People have discussed how to encode dates in JSON, for example here: javascript - What is the "right" JSON date format? - Stack Overflow - the consensus is to encode dates as string, and convert those strings to dates later.

I wanted to see if other people feel that the current “dates as string” or “dates as functions” encoding can be improved on, KDB style.

P.S. There is an upcoming format “RSON” - Rust object notation - GitHub - rson-rs/rson: Rust Object Notation.
Perhaps I can pitch my idea to those guys, and use RSON to encode data more cleanly.

2023-12-31 already has a meaning, as does 2023_12_31, which leaves 2023.12.31.

The question is whether there’s sufficient need for it.

Even if we decide that this is needed (I don’t really think so) it doesn’t account for the different ordering different people may use.

YYYY-MM-DD
MM-DD-YYYY
DD-MM-YYYY
...

I’d say just use a datetime

1 Like

One of those is an abomination that should never exist. The other two would be viable on their own, but only one of them matches the argument order of datetime.date, so there really isn’t much to choose from here.

1 Like

Which one is the abomination? I’ve seen all 3 multiple times.

Month-Day-Year, which lacks the logic of sequencing. Yes, we see it all the time, but only in America. The rest of the world uses a sane ordering - often D-M-Y when humans are involved, but Y-M-D has a few advantages over it in a computing context (inherent sortability, the ease of extending it to timestamps by adding -H-M-S and optionally fractional seconds, etc). I exaggerate a little by calling it an abomination, but only a little.

With nearly all systems of hierarchical identification, there’s some sort of well-defined order. Consider a street address. Here’s an Australian one:

Fred Nurk
42 Nowhere Place
SUBURBIA NSW 2000
AUSTRALIA

Aside from the postcode (which is its own field and goes in four separate boxes over the right if the address is handwritten), this follows a strict sequencing: Person, House, Street, Locality, State, Country. It’s strictly ascending. This is a universal standard - here’s Australia Post’s recommendations, and here’s Czech Post’s recommendations, and you can probably find this for any other country.

What order do we write numbers in? Most significant digit to least significant. What if it’s in words? Would you say “Six million, three hundred six, and forty-seven thousand”? Not unless you deliberately want to confuse people! I’m sure you COULD describe an old pre-decimal price as “7 shillings, 6 pounds, and fivepence”, but you wouldn’t.

So why is it that dates get written month-day-year?

But, that aside: the main point is that year-month-day is the argument order of the date() constructor, and is thus the only meaningful option here.

3 Likes

My theory:
Year-Month-Day is the best one as it matches what is used for time, descending order. But, for most non official/government documents, you don’t need the year so people just wrote month-day. On official documents, people wrote month-day then realized they needed to include the year and were forced to write it on the right because there is no longer room on the left.
At least mmddyyyy is closer to yyyymmdd than the backwards ddmmyyyy is.

That isn’t really an advantage though. “Closer to” isn’t beneficial. With dates close to the present, we can fairly readily recognize the year (20190308 is a handful of years ago, not the deep and distant past), but distinguishing the month and day is harder. Maintaining consistency of either dd/mm/yyyy or yyyy/mm/dd (with or without delimiters) ensures that it’s unambiguous.

So basically, you’re saying that month-day-year is a mess that came about for the same reasons as the Y2K problem: people can’t be bothered fixing their documents. Yep, that sounds like government business to me.

Which one is the abomination? I’ve seen all 3 multiple times.

Try emailing an Excel sheet from the US to a customer outside the US.

If the sheet contains dates, half the dates formatted using these schemes will break - turn into integers.
For good measure, alignment will break also - the few good dates might be left-aligned, and the broken dates right-aligned.

Time for the world to embrace the KDB date literal …

I knew it wouldn’t be long before a really good example showed up. In today’s Error’d, you can see exactly what happens when you use American dates naively. And yes, they COULD use American dates in the display while having something else as the sort key, but that wouldn’t be “naively”, and that’s still a good indication of the problem (if you can’t sort by your data but have to sort by an alternate version of it).

1 Like