PEP 802: Display Syntax for the Empty Set

For your consideration: PEP 802: Display Syntax for the Empty Set.

From the Abstract:

We propose a new notation, {/}, to construct and represent the empty set.
This is modelled after the corresponding mathematical symbol ‘∅’.

This complements the existing notation for empty tuples, lists, and
dictionaries, which use (), [], and {} respectively.
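
For reference, a minimal sketch of the status quo the abstract describes. The variable names are illustrative only; the point is that three of the four built-in collection types have an empty display and the set does not.

```python
# The three existing empty displays, plus the only current spelling for an empty set.
empty_tuple = ()
empty_list = []
empty_dict = {}      # braces alone give a dict, not a set
empty_set = set()    # no display syntax exists today, so a call is needed

kinds = [type(x).__name__ for x in (empty_tuple, empty_list, empty_dict, empty_set)]
print(kinds)            # ['tuple', 'list', 'dict', 'set']
print(repr(empty_set))  # set()
```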

A

28 Likes

Here’s the diff from when I added this feature to “Frozenset literals and comprehensions”:
https://github.com/nineteendo/cpython/compare/bf83dffab33c4c070bd3e0e4cd12a12e3942252c..0cc9de93a7ca5772afafcb525015f753ae5116b0

Sidenote: I didn’t think it was worth the churn to replace the 160 occurrences of set() with {/} across the standard library.

5 Likes

Every attempt to introduce a new syntax for empty set should show why it is better than {*()} (or {*[]}, or {*{}} if you prefer). The latter works in all maintained versions. It is not widely used because set() is good enough. There is no problem that the new syntax would solve, but {*()} would not.
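
A quick check of the spellings mentioned above, as a sketch rather than a recommendation. Each one unpacks an empty iterable into a set display; the variable names are made up for the example.

```python
# Star-unpacking an empty iterable inside a set display yields an empty set.
via_tuple = {*()}
via_list = {*[]}
via_dict = {*{}}

for candidate in (via_tuple, via_list, via_dict):
    assert type(candidate) is set and len(candidate) == 0

print(via_tuple == set())  # True
```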

22 Likes

I’m currently oscillating between ±0.

I like having a notation for the empty set that doesn’t require a name lookup. I assume most IDEs/linters would yell at me for shadowing set anyway, so I imagine that the real benefit here is performance (which I would expect to be a small change), and maybe avoiding some cognitive load when looking at code for a first time since I don’t have to worry about whether the name set is defined more locally or not.
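
To make the name-lookup point concrete, here is a small sketch. It uses {*()} as a stand-in for a display, since {/} does not compile on released interpreters, and the shadowing namespace is a hypothetical setup for illustration.

```python
# set() has to look up the name "set"; a display references no names at all.
set_call = compile("set()", "<example>", "eval")
display = compile("{*()}", "<example>", "eval")
print(set_call.co_names)  # ('set',)
print(display.co_names)   # ()

# Shadowing the name changes what set() means, but not what a display means.
namespace = {"set": list}  # hypothetical shadowing of the builtin
print(eval(set_call, namespace))  # []
print(eval(display, namespace))   # set()
```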

I’m skeptical of the claim that this is easier for beginners to learn. I think it has many of the same drawbacks as set(), in that {} remains the natural way for someone to want to write an empty set; so I think it has the same pitfalls as roughly any other syntax. For example, this wouldn’t allow us to remove “An empty set cannot be constructed with {}; this literal constructs an empty dictionary” from the language reference; we would just have to rephrase it.

I think {/} mirroring ∅ is a nice mnemonic (in addition to being kind of fun), but one way that it makes teaching this harder is that it’s possible to read into this syntax the idea that / implies emptiness in a more general sense, e.g., that [/] or (/) should be ways of making empty lists and tuples. This is a new kind of asymmetry that needs to be explained. Plus, (/) is even closer to ∅, so maybe that mnemonic has its problems as well.

13 Likes

I think one reason why {/} is better than {*()} is the number of characters. {*()} has the same number of keystrokes as set(), so it’s not necessarily better IMO. However, for me, typing {/} is as annoying as writing set(), so I would honestly be fine with keeping the status quo. In addition, I think {/} could be visually confusing depending on the background color and the font (the slash may not always be clearly visible). Similarly, as a mathematician myself, I feel that (/) expresses the empty set much better (visually speaking), so overall I’m rather -0.5. At least, I don’t think I’ll be using this new symbol.

11 Likes

I attempted to do so in the PEP; perhaps the section should be expanded. {*()} requires knowing several concepts at the same time, which can be less helpful for beginners – we should try and avoid saying “use this even if you don’t understand how it works or what it does”. Unpacking a tuple into a set literal was a side-effect of PEP 448, rather than an intentional and deliberate attempt to design a syntax that is easy to teach, understand, and explain.
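
As an illustration of the concepts involved, the sketch below separates them: a set display, star-unpacking inside it (PEP 448), and the empty tuple. The names items and nothing are placeholders.

```python
items = (1, 2)
print({*items, 3} == {1, 2, 3})  # True: * unpacks the tuple into the set display

nothing = ()
print({*nothing} == set())       # True: unpacking zero elements leaves an empty set
```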

I think it would be a misstep to promote {*()} as the syntax for an empty set for these reasons.

A

24 Likes

Thank you for the reasoned feedback.

I’ve carefully avoided discussing performance, because I don’t think it is a motivator for this change. Should we be inclined to do so, we could optimise {*()} or similar patterns into the BUILD_SET opcode today, without a PEP.
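
For anyone who wants to see what the compiler emits today, a small sketch using the dis module. The helper opnames is a name made up for this example, and the exact instruction names vary between CPython versions, so no output is shown.

```python
import dis

def opnames(source):
    """Return the bytecode instruction names for an expression."""
    code = compile(source, "<example>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]

# Compare the call form with the unpacking form on the running interpreter.
print(opnames("set()"))
print(opnames("{*()}"))
```

On the CPython versions I am aware of (3.9 and later), the second list contains BUILD_SET followed by a separate update step, which is the pattern that could in principle be collapsed.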

I think an improvement is that we would be able to make a statement in positive terms. We currently introduce set displays by saying “A set display is denoted by curly braces and distinguishable from dictionary displays by the lack of colons separating keys and values:”. We would be able to go on to say “An empty set is denoted by {/}, to distinguish it from the empty dictionary”, which follows on better in that section.

You make a reasonable point. Notably, this is why I rejected the {,} syntax. However, I think it’s surmountable – beginners need to learn other ‘special’ things, like that you write 2**3 instead of 2^3, or that ** means a different thing in function signatures. Beginners make mistakes all the time, so we could also add “did you mean” error messages for [/] and (/) to the parser to remind users that this is illegal syntax.
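
The two existing “special” cases mentioned above, as a short sketch. The function describe is hypothetical and exists only to show ** in a signature.

```python
print(2 ** 3)  # 8, exponentiation
print(2 ^ 3)   # 1, bitwise XOR, not a power

def describe(**options):  # here ** collects keyword arguments into a dict
    return sorted(options)

print(describe(b=1, a=2))  # ['a', 'b']
```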

A

7 Likes

I’m +1 mostly for the fun factor. I think {/} is just plain fun to use.

12 Likes

+1

And to be specific, I’m indeed +1 to the {/} syntax as proposed, and -1 to all the rejected syntaxes.

A few minor points of feedback:

I notice that the Rejected Ideas section doesn’t currently list (/) among its extensive list of what other syntaxes were rejected and why - should this be added? I’m sure I’ve seen some posts in support of (/) rather than {/} for the empty set because (/) looks “more like” … but the argument I’ve seen (and personally also agree with) for {/} rather than (/) is that non-empty sets are written with braces, not parentheses.

I also didn’t see any discussion of whitespace in the PEP - this post for example asks the question of whether it’s better to parse {/} as a single token (and therefore forbid variants with whitespace like { / }), or whether to parse it as three separate tokens (and therefore allow variants with whitespace). The PEP does give the grammar, from which it can be seen that the second option has been chosen (and, again, I agree with this decision) - but it’d be nice if it stated why this form was chosen (perhaps another Rejected Ideas section for “Parse {/} as a single token and forbid whitespace”?).
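
As a rough illustration of the three-token reading, the sketch below runs today’s tokenizer over both spellings. It only shows how the existing tokenize module splits the characters; it says nothing about how the reference implementation parses them. The helper op_tokens is a name invented for this example.

```python
import io
import tokenize

def op_tokens(source):
    """Return the operator tokens that the tokenizer finds in source."""
    readline = io.StringIO(source).readline
    return [tok.string for tok in tokenize.generate_tokens(readline)
            if tok.type == tokenize.OP]

print(op_tokens("{/}"))    # ['{', '/', '}']
print(op_tokens("{ / }"))  # ['{', '/', '}']
```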

As a final nitpick:

beginners may not know how to recover the set type if they have overridden the name: techniques to do so (e.g. type({1}-{1})) are not immediately obvious

Surely the example here could be simplified to type({1}), as that also produces set.
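
A minimal check of that simplification, with the shadowing done inside a hypothetical function so the builtin stays usable elsewhere:

```python
import builtins

def recover_set_type():
    set = "shadowed"  # hypothetical accident: the local name hides the builtin
    # Neither expression uses the name "set", so both still reach the type.
    return type({1} - {1}), type({1})

long_form, short_form = recover_set_type()
print(long_form is builtins.set and short_form is builtins.set)  # True
```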

3 Likes

I’m -1 on this. I would personally never use the new notation, as I think that set() is perfectly fine - it’s clear and explicit, and sets are not a common enough data structure that I feel the lack of a literal notation for an empty set is significant.

On the usability front, at least on my (UK) keyboard, set() is very easy to type (a word, followed by two (shifted) punctuation characters). Whereas {/} is an awkward combination of shifted, unshifted, then shifted punctuation characters.

My big fear is that if this PEP were accepted, there would be a well-meaning tendency to (socially, at least) discourage using set() to create an empty set. PRs for projects changing set() to {/}, new linter rules “helpfully” suggesting that uses of set() get changed to {/}, etc. This creates churn, especially for people who are happy with the current spelling, and who could end up having to fight to retain the status quo.

If the Python community had more of a “live and let live” attitude towards people not adopting new syntax, I’d probably just be -0 on this. But sadly, I don’t think the community is like that any more :slightly_frowning_face:

48 Likes

I think I might be the progenitor of the {/} syntax in the previous discussions (other people probably thought of it too at some point!) and it’s fun to see the idea actually be brought up for an eventual decision. While I’m fond of the syntax, I also understand the apprehension towards it, based not on the exact spelling (people will get used to it) but on whether it’s really necessary, since {*()} exists and could be, or maybe already is, special-cased in the compiler. Overall I don’t write enough sets in hot loops (or much Python at all recently) for this syntax to be relevant to me directly. Still, I think it would be good for the SC to make a ruling on this so even if it’s not accepted we can just point to the decision in future discussions.

5 Likes

To me, an empty set being written as {/} is even less consistent than there being no syntax for it at all. To anyone who doesn’t recognise ∅, the obvious extrapolation from {} vs {/} is that the / is what makes it a set instead of a dict, so a non-empty set should be written as {/1, 2, 3} or possibly {1, 2, 3/} or maybe {/, 1, 2, 3}…? It doesn’t actually solve the problem that a new user needs to be explicitly told how to write both an empty set and a non-empty set.

To help reinforce this, we will update the documentation to use {/} instead of set(), including the tutorial, standard library modules, and the Python Language Reference.

This sounds way too soon. If a newish user installs Python on Ubuntu via apt-get or, for whatever other reason [1], doesn’t take the latest version of Python then following the official tutorials will lead to nondescript syntax errors.

[1] IT-managed versions, some other distribution of Python, an offhand suggestion from a colleague

20 Likes

I’ve updated the PEP following various pieces of feedback, and have added a link to a draft reference implementation.

Added.

I’ve added discussion of this and a reference to @jb2170’s post.

Excellent point.

I think this is a reasonable position: a new proposal must always justify changing the status quo. I hope to have done that, but the SC would be entirely within their rights to reject the PEP solely on these grounds.

I find myself using sets quite a lot in throwaway scripts, where it is useful to know that a collection is unique, and they have the benefit of faster membership testing.
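
The kind of throwaway use described here might look like the sketch below; the word list is made-up sample data.

```python
words = ["spam", "eggs", "spam", "ham", "eggs"]  # hypothetical sample data
unique = set(words)                              # duplicates collapse automatically

print(len(unique))      # 3
print("ham" in unique)  # True, a hash lookup rather than a linear scan
```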

I agree this is a problem. For example, at one point Ruff changed all my isinstance() calls to use A | B syntax instead of tuples, and now I think it has changed back to prefer tuples. There’s not much I can do as a PEP author about the general social contract, but I have added a brief paragraph about this in the Backwards Compatibility section.

4 Likes

Hopefully users will read the documentation for the version of Python that they are using, either via the documentation provided by their package manager or by selecting the correct version on d.p.o.

This argument in general can be applied to any new syntax feature. I think it is more confusing to introduce a new feature and then entirely hide it in the documentation. I would, though, expect to have the appropriate versionadded or versionchanged directives so that the documentation does tell users that this is new. Users following the tutorial would then at least be aware that this has changed, and hopefully would have a pointer to what to do if they hit an error.

A

2 Likes

You could also remove the part of the proposal that says:

To help reinforce this, we will update the documentation to use {/} instead of set(), including the tutorial, standard library modules, and the Python Language Reference.

Continue using set(), and document that {/} is a valid alternative. I know you won’t like this suggestion, but it’s definitely something you could do to reduce the risk that set() is seen as “no longer acceptable”. You could also not change the repr of set() - again, that reinforces the idea of set() still being an acceptable spelling.

Of course, you could argue that this is just changing the PEP to not propose a change at all, but it seems to me that the key thing people want is a “literal” spelling for the empty set, so adding {/} as a new alternative, but not otherwise changing anything, is a valid way of achieving that goal.

6 Likes

And as we should all be very aware of, sentences beginning “Hopefully” can often have that word replaced with “It’s unlikely that” and still be true.

Even in my own personal usage, I don’t go and select a specific version on d.p.o unless I am going out of my way to find a difference. I just use the default version (the latest stable) and presume upon it being “mostly correct”, which it usually is. It helps that, by being highly active here in the other d.p.o, I have a good chance of having some familiarity with what’s changing (or at least what’s in a state of flux; anything to do with typing, I won’t assume any level of correctness, and will check versions) - but for someone who isn’t, are they going to check the version? I haven’t seen people doing it.

However, this is something that’s very straightforward to catch. If you use {/} in any current version of Python, you won’t get obscure incorrect behaviour - you’ll get a clear and immediate SyntaxError. So on that basis, it’s going to be less problematic for people to get it wrong.
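
A small sketch of how that failure can be detected up front. The function name supports_empty_set_display is invented for this example; on interpreters without the proposed syntax it should return False, so no fixed output is shown.

```python
def supports_empty_set_display():
    """Return True if this interpreter accepts {/} as an expression."""
    try:
        compile("{/}", "<example>", "eval")
    except SyntaxError:
        # Released interpreters reject the text at compile time, before running anything.
        return False
    return True

print(supports_empty_set_display())
```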

I’m still -0 on adding this; I very seldom create empty sets (setifying a list isn’t affected by this proposal), and for a number of reasons of ergonomics, I’m actually much more likely to use a dictionary than a set, even if all the values are just the integer 1. Maybe that could be improved in the future, but at the moment, dictionaries get a number of significant advantages over sets (including retaining insertion order, and being able to be mutated using subscript syntax rather than a method), so I have found myself stretching to come up with an excuse to have values associated with my keys rather than use a set.
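
The dictionary-as-set pattern described here, as a sketch with made-up input:

```python
seen = {}
for key in ["b", "a", "b", "c"]:  # hypothetical input
    seen[key] = 1                 # subscript assignment instead of set.add()

print(list(seen))   # ['b', 'a', 'c'], insertion order is kept
print("a" in seen)  # True, membership works the same way as with a set
```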

7 Likes

Oh, the FLUFL is totally bringing back the <> inequality operator.

14 Likes

You’ve inspired an entire section in the PEP!

A

9 Likes

Also, I know all relevant concepts and would still not think of this as a way to express an empty set. It’s not very obvious to me, but I’m also not Dutch.

5 Likes

You may use performance as another argument. Can you run a microbenchmark comparing {/} and set() performance?
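
A starting point for such a microbenchmark, as a sketch only. {/} cannot be timed on a released interpreter, so this compares set() with {*()} as the nearest available stand-in; the helper bench and its defaults are assumptions, and results depend heavily on the build and machine, so none are quoted.

```python
import timeit

def bench(stmt, number=200_000):
    """Return the best of five timings, in seconds, for `number` runs of stmt."""
    return min(timeit.repeat(stmt, number=number, repeat=5))

for stmt in ("set()", "{*()}"):
    print(f"{stmt:8} {bench(stmt):.4f} s")
```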

3 Likes