Add the ability to declare an empty dict as `{:}`

In my opinion, the fact that {} declares an empty dict is a major design flaw. Empty curly braces should only be associated with the empty set.

If a newcomer learns that [] declares an empty list, their intuition would be to do the same thing for a set. Instead, they have to use set(). This is very counter-intuitive.

I propose adding a new syntax for declaring an empty dict: {:}. I can’t think of any scenario where this already means something to the language.
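For anyone who wants to check the current behaviour, here is a minimal sketch. The helper name `parses` is made up for illustration. It suggests that `{}` builds a dict today and that `{:}` is rejected by the parser, at least on the CPython versions available at the time of writing.

```python
def parses(source):
    """Return True if `source` compiles as a Python expression (hypothetical helper)."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


empty_braces = {}
empty_set = set()

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
print(parses("{}"))                 # True
print(parses("{:}"))                # False: not valid syntax today
```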

Furthermore, we could then use the actual symbolism for an empty set: {}. I know that making this change would break the code that relies on empty dict declarations, but wouldn’t it be a simple fix? Just find-and-replace every instance of {} with {:}. The empty set declarations using set() would remain unaffected.

I don’t believe breaking virtually every Python project on Earth is feasible at this point in Python’s lifetime.

This has been suggested before. If you are serious about pursuing the idea, you should search the archives both here and on the Python-Ideas mailing list.

There are three parts to this:

  1. Is it true that {} as an empty set is more intuitive than {} as an empty dict?
  2. Can we introduce {:} as syntax for an empty dict?
  3. Can we change {} so it creates a set instead of a dict?

Part 1 is plausible but not proven. It seems to me that dicts are far more important, and common, than sets.

Part 2 is certainly possible. We could introduce {:} as syntax for empty dicts.

But part 3 is where things get very hard. Python is a 30+ year old language with tens of thousands of users, tens of millions of lines of code, thousands of blog posts, Stack Overflow answers, tutorials, etc. We have to take backwards compatibility really seriously. Changing the meaning of {} is not an easy thing to do; it is much harder than “just find-and-replace every instance of {} with {:}”.

That would change the meaning of strings containing “{}”, including f-strings and format strings. It would make hundreds of books, tutorials, blog posts, etc. obsolete and wrong, causing confusion to anyone reading them.
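As a rough illustration of why a blind textual replace is risky, the sketch below runs `str.replace` over a made-up two-line source snippet. The snippet and the helper `is_valid_json` are assumptions for the example only. The replace rewrites the contents of the string literal as well as the dict display, and the rewritten string is no longer valid JSON.

```python
import json

# Made-up source text: one "{}" is JSON data inside a string, one is a dict display.
source = 'settings = json.loads("{}")\ncache = {}\n'

# A naive textual replace cannot tell code from string contents.
replaced = source.replace("{}", "{:}")
print(replaced)


def is_valid_json(text):
    """Return True if `text` parses as JSON (hypothetical helper)."""
    try:
        json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


print(is_valid_json("{}"))   # True
print(is_valid_json("{:}"))  # False
```

A tool that works on tokens or the parse tree rather than raw text would avoid this particular trap, but someone still has to run it over every code base.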

And who is going to do that find-and-replace? You? We are forcing thousands of developers to change their code, and I guarantee that some code will be missed.

Very young languages with only a few users can make radical changes at the drop of a hat. But for a mature language like Python, such a backwards-compatibility breaking change is a big deal.

We have a process for such changes, possibly involving a __future__ import. Even if we don’t do that, there would have to be a long deprecation period, probably three releases if not more, during which any attempt to use {} will give a DeprecationWarning.
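To give a feel for what such a transition period looks like from the user's side, here is a small sketch using the standard `warnings` module. Nothing like this exists in Python today: the function `make_empty_dict_old_style` and the warning message are invented for illustration, and a real change would have the compiler emit the warning rather than a helper function.

```python
import warnings


def make_empty_dict_old_style():
    """Hypothetical stand-in for code that still writes {} for an empty dict."""
    warnings.warn(
        "{} as an empty dict is deprecated; use {:} instead",  # invented message
        DeprecationWarning,
        stacklevel=2,
    )
    return {}


# Record warnings instead of printing them, so they can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = make_empty_dict_old_style()

print(result)
print(caught[0].category.__name__)
```

Every project that relies on {} would see warnings like this for several releases before the meaning actually changed.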

With that amount of pain and breakage involved in the change-over, the benefit would need to be correspondingly bigger. And frankly, the benefit is pretty small.

Sad to say, I think that this is a missed opportunity. We should have made this change as part of the Python 2 to 3 transition. But we didn’t, and now the benefit is probably too small for us to ever make that change worth the pain and effort.

This is brought up every now and again and I think virtually everybody agrees this is how it should have been designed. But, since sets were added long after dicts, this ship has sailed.

That isn’t how backward compatibility works.

While you’re right that, if set display syntax had existed from the beginning of Python’s history, {} would logically be a set and {:} could then be used for a dictionary, it’s way WAY too late to make that change now. Breaking every program that uses an empty dictionary is much much too big an issue for a simple nice-to-have.

Nope, even then, the benefit would not have justified the massive breakage. It would have been extremely difficult to write 2/3 spanning code.

Not really any more so than the other changes (print, Unicode, true division, relative imports, etc.), as most of the same strategies would have applied: __future__ import, 2to3, using set() and dict(), future/six, futurize/modernize, etc., and it would be simpler for tooling or humans to adapt to than any of these. Given all of this stuff had to be dealt with anyway for all the other changes large and small, the marginal cost of adding this additional change would have been very small, probably enough to justify the long-term benefit IMO.

But of course, that ship has sailed, and I can’t see it ever happening again, not even in a hypothetical Python 3 → 4. Python is simply too widely used for the cost to ever be worth the benefit, and at this point in the adoption curve the cost is only ever going to increase, and the total long-term benefit decrease.

If, one day, Python is stored as ASTs rather than text, then it would be possible to display the AST for dict-display however the user wants.

print? As long as you make sure you do your formatting first and then call print with parentheses, the exact same code works fine on all versions of Python.

Unicode? Yes, that was a significant hassle, but as of 3.3 (I think), the u"..." syntax was supported, and the b"..." syntax was supported at some point in 2.x (I’m not sure when exactly), making it easier to write code that behaves the same way on both. At least syntactically.

True division? Technically an issue for quite a few types, but in practice, only an actual problem when dividing int by int, and the double-slash floor division syntax was added in Python 2.2 (had to look that one up!). So you could specify floor division compatibly for the better part of two decades before Python 2 was shut down, and true division could be achieved by casting one operand to float before dividing. Syntactically, no major issues, although bugs could easily slip in.
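A short sketch of the spelling described above. The explicit forms give the same results whether `/` between two ints means floor division or true division, which is what made them usable in spanning code. The variable names are just for the example.

```python
# Floor division is explicit with //, regardless of what / means for ints.
floor_result = 7 // 2
negative_floor = -7 // 2   # floors toward negative infinity

# Converting one operand to float forces true division.
true_result = float(7) / 2

print(floor_result)    # 3
print(negative_floor)  # -4
print(true_result)     # 3.5
```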

Relative imports? I’m not sure of the details as I never ran into problems myself, so I’ll have to leave that for someone else to answer. To what extent did this make it difficult to write spanning code?

For the most part, syntactically, spanning code wasn’t too hard to write. (The lack of u"..." in Python 3.0-3.2 was a pain point that then got fixed.) That would very much NOT be the case with a change to the meaning of {}. Yes, you could write code that never uses empty dict literals OR empty set literals, but that has other consequences (requiring a name lookup on dict or set, thus leaving you open to the risk of shadowing).
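The shadowing risk can be shown with a small sketch. Both functions below rebind the name `dict` locally (the function names are made up for the example). The call through the name picks up the rebinding, while the literal does not.

```python
def build_with_name():
    dict = list   # local rebinding shadows the builtin
    return dict()


def build_with_literal():
    dict = list   # same shadowing, but the literal ignores it
    return {}


print(type(build_with_name()).__name__)     # list
print(type(build_with_literal()).__name__)  # dict
```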

Backward-incompatible changes demand justification to match. Adding {:} wouldn’t be a major problem, but it also wouldn’t bring much benefit on its own, so this proposal is basically useless without the massively breaking change.

That’s already possible, with some editor extensions rendering set() as ∅ and making other changes. It’s a cool trick but has never really become mainstream.

Right, but I’m talking about storing the AST, which would be much more than a cool trick. It would mean you wouldn’t need a style guide because every developer could see the code in their own style. You would see diffs in your own style, etc.

The problems with this probably outweigh the benefits today, but as minor inconveniences (like this one) pile up, it might one day be worth it.

So, basically, throw away all the information in the source code other than what’s executable, and save that? You can already do that on a per-project basis by just mandating the use of a code formatter. And quite frankly, I’d rather retain any clue as to a programmer’s intent, and that includes cases where formatting disagrees with functionality. Forcing the code to be reformatted just destroys information without actually achieving anything.
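Both sides of this exchange can be seen in a small sketch with the standard `ast` module (`ast.unparse` needs Python 3.9 or later). The two source strings are made up for the example. They parse to the same tree, which is what would let each reader pick their own rendering, but the round trip also drops the comment, which is the information loss being objected to.

```python
import ast

compact = "x={}"
spaced = "x = {  }  # start with no entries"

# Formatting differences and comments do not show up in the tree.
same_tree = ast.dump(ast.parse(compact)) == ast.dump(ast.parse(spaced))
print(same_tree)

# Turning the tree back into text yields one canonical spelling, minus the comment.
round_tripped = ast.unparse(ast.parse(spaced))
print(round_tripped)
```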

We’ve had automated code formatters for years. They have not solved any problems other than “how should we format our code?”.

It solves this issue, for one.

Kinda. Not really. It’s better solved by the aforementioned editor extensions that can also use empty set symbols and such.

I realise this isn’t what the discussion is about, but I like {*()} for the empty set. It somehow reminds me of ∅. Not that you should use that mess of symbols in code that others might read, when set() is perfectly readable.

It does have the very important benefit of NOT requiring a name lookup, thus ensuring that its meaning remains consistent even if the name set has been rebound. It could also be peephole-optimized; here’s the current disassembly:

>>> import dis
>>> dis.dis(lambda: {*()})
  1           0 RESUME                   0
              2 BUILD_SET                0
              4 LOAD_CONST               1 (())
              6 SET_UPDATE               1
              8 RETURN_VALUE

Since that’s loading a constant in order to do a SET_UPDATE, the optimizer could recognize that this will have no effect, and remove those two operations - thus leaving us with just BUILD_SET 0, equivalent to empty list or dict initialization. (Empty tuple, of course, is just LOAD_CONST.)
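The name-lookup point can be checked without reading bytecode, which varies between versions. In this sketch (the variable names are made up), the code object for `set()` records the global name it has to look up, while the one for `{*()}` records none.

```python
by_name = lambda: set()
by_unpacking = lambda: {*()}

# co_names lists the global and attribute names the code refers to.
print(by_name.__code__.co_names)       # includes 'set'
print(by_unpacking.__code__.co_names)  # empty: nothing to look up or rebind

# Both spellings produce an empty set.
print(by_unpacking() == set())
```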

So while I wouldn’t outright encourage this idiom, I also wouldn’t denigrate it.

What if we made {/} mean an empty set? Seems close enough to the mathematical notation and doesn’t interfere with any syntax that I can think of.

Not saying we need it, just thought it looked kinda nice.

Yeah, that was more of an afterthought. If you couldn’t tell, I don’t know anything about backwards compatibility.

I made a proposal with a modified grammar some years ago that would make {,} the empty set. If there was a real desire for this, a similar change could be made so that both {,} == set() and {:} == dict() such that {} could, eventually over a very long time, be removed.

One thing that actually changing the grammar unveiled was that once {,} becomes a legal literal, it becomes extremely easy to make (,) legal, which IMO is another wart on the language (parentheses don’t make tuples, commas do; oh, except for an empty tuple, then you must always use parentheses and no comma).
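A quick sketch of the tuple wart as it stands today. The helper `parses` is made up for the example, and the two rejected spellings reflect current syntax, not the modified grammar described above.

```python
def parses(source):
    """Return True if `source` compiles as a Python expression (hypothetical helper)."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True


no_parens = 1, 2     # the comma makes the tuple
just_parens = (1)    # parentheses alone do not
one_item = 1,        # a trailing comma is enough
empty = ()           # the exception: parentheses required, no comma

print(type(no_parens).__name__)    # tuple
print(type(just_parens).__name__)  # int
print(type(one_item).__name__)     # tuple
print(type(empty).__name__)        # tuple
print(parses("(,)"))               # False today
print(parses("{,}"))               # False today
```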

On that same “nice but unnecessary” note, seeing how / meaning “unpack nothing” somewhat mirrors * meaning “unpack something”, one could add it to all native collections (/), [/], {/} for consistency, throw in {:} for good measure, and keep the empty initialisers as they are.