`frozenset` and `frozendict` comprehensions

Not a problem IMO. If you do that in multiple steps, you probably either (a) want the intermediate value, or (b) are deliberately defeating the optimization for testing purposes. Requiring it to be spelled in a particular way is no different from literals/displays anyway.

Sadly, + does not read as “immutable” at all. To me it even reads like the opposite :frowning:

The syntax of dict / set operators is even weirder: +{1, 2} | +{3, 4}

It would also be strange if -{} did nothing, since the mathematical duality of + and - suggests it should do something.

Moreover, it could break existing code: people might have custom dict / set subclasses that define unary + via __pos__. It also leaves a mental burden, because people would need to remember that +{} and d = {}; +d behave differently at runtime.
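
To make the __pos__ concern concrete, here is a minimal sketch in today's Python. The class name SortableSet and its behaviour are hypothetical, invented only for illustration; the point is that unary + is already dispatched at runtime for subclasses, while the builtin set does not support it.

```python
class SortableSet(set):
    """Hypothetical set subclass that already gives unary + a meaning."""

    def __pos__(self):
        # Arbitrary choice for the example: +s returns the sorted elements.
        return sorted(self)


plain = {3, 1, 2}
custom = SortableSet(plain)

# For the subclass, unary + is looked up at runtime and calls __pos__.
result = +custom

# The builtin set defines no __pos__, so unary + raises TypeError today.
try:
    +plain
    plain_supports_pos = True
except TypeError:
    plain_supports_pos = False

print(result, plain_supports_pos)
```

A +{...} display syntax would have to coexist with this runtime dispatch, which is where the "works differently" worry comes from.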

This is not a problem, I agree. But it is one more thing we would have to teach people (and LLMs) so that they write performant code.

We should try to keep the number of such things as low as possible :slight_smile:

I’ve never used frozenset so disregard this message if experience is essential but:

I think the prefix notation f{a for a in iterable} is way too syntactically close to our beloved f-string syntax. The alternative suggestions like a + prefix also convey nothing to me. What’s wrong with:

or even frozenset(a for a in iterable), which feeds the iterable right into the frozenset constructor? Maybe it’s just because I don’t use frozensets, so I don’t see them as ‘first class’ the way I see tuple, list, dict, and set. That is also why I am in favour of the (still debated) empty set literal syntax of PEP 802, {/}.
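
For reference, a small sketch of the constructor spellings that already work. The variable names are made up for the example; both forms produce equal frozensets, the only difference being whether a temporary set is built first.

```python
iterable = [3, 1, 3, 2, 1]

# Generator expression fed straight into the constructor (no temporary set).
from_generator = frozenset(a for a in iterable)

# Set comprehension first, then frozen (builds a temporary set).
from_set_comp = frozenset({a for a in iterable})

# Unlike a plain set, a frozenset is hashable, so it can be a dict key.
lookup = {from_generator: "seen"}

print(from_generator == from_set_comp, lookup[from_set_comp])
```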


Maybe *{x for x in my_gen if x < 42}* syntax looks nice, like :snowflake: snowflakes freezing the set on both sides…

That’s not clear, and I could see individuals, myself included, thinking we are creating a set just to throw it away.

Well for parity we can make - do the opposite, to thaw a frozenset/frozendict. :slight_smile:

But yeah I agree that this +{...} notation, like f{...}, is not something that can be intuitively understood and will have to be taught and learned.

If that is ruled out, I personally still prefer the previously suggested notation of {{...}}, which I find to be the most intuitively understandable syntax so far, with the caveat that nesting a frozenset literal/comprehension within a set/frozenset literal/comprehension, e.g. {{{...}}}, needs to be made invalid unless separated by a space, e.g. { {{...}} } to avoid visual ambiguity.

{{1}} also won’t really work. It will:

  • Conflict with existing syntax. Currently {{1}} is a syntactically valid expression that always raises an exception, so it can be used to raise an inline TypeError, for example in tests or in fuzzing. Breaking something is bad (even in such minor cases) when we can just create something new instead.
  • Require very ugly escaping in templates (probably?): '{{{{fdct}}}} is {0}'.format(fdct), because '{{fdct}}' is currently just the escaped version of the {fdct} template placeholder.
  • Add complexity: “needs to be made invalid unless separated by a space” is a very complex concept, and Python already has complex rules about significant whitespace. Let’s not add another one for no reason.
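
A quick sketch of the first point in current Python: {{1}} compiles fine and only fails when executed. The function name is hypothetical, and the exact error text may differ between Python versions, so the example only records the message instead of assuming it.

```python
def build_nested():
    # Valid syntax today: a set display containing a set display.
    # It fails only at runtime, because the inner set is unhashable.
    return {{1}}


try:
    build_nested()
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```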

I’ve actually tried very hard to find an example of such a fuzzing test on GitHub with many variations of regexes but couldn’t find any. Please point me to one if you do.

I think the actual impact is going to be rather insignificant.

I think you’re likely mixing up str.format with an f-string. The string template for str.format does not support arbitrary expressions.

I don’t think it’s that tricky, since tokens already carry position info: the parser can simply raise a syntax error if the token right after {{ is an adjacent { when it attempts to match a frozenset literal/comprehension grammar rule.
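
As a rough illustration of "tokens already carry position info", here is a sketch using the stdlib tokenize module. It is not how CPython's parser would implement the rule; the helper names are hypothetical and the check only handles single-line input.

```python
import io
import tokenize


def open_brace_columns(source):
    """Return the start column of every '{' token in a one-line source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.start[1] for tok in tokens
            if tok.type == tokenize.OP and tok.string == "{"]


def has_three_adjacent_braces(source):
    """True if three '{' tokens sit in consecutive columns, e.g. '{{{'."""
    cols = open_brace_columns(source)
    return any(b - a == 1 and c - b == 1
               for a, b, c in zip(cols, cols[1:], cols[2:]))


print(has_three_adjacent_braces("{{{1}}}"))    # braces touch
print(has_three_adjacent_braces("{ {{1}} }"))  # separated by a space
```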

The idea of a frozenset/frozendict literal is a compact, readable notation.

Non-goal:

  • Intuitive understandability. As with f-strings, you have to learn the syntax once. Unlike e.g. {}.freeze(), a compact notation will not let you infer the meaning from a general understanding of Python alone.

Goals:

  • readability: The notation should be easy to identify and to distinguish from other code constructs
  • memorability: The notation should connect to the meaning of “frozen”
  • consistency: The notation should fit into general language patterns

Therefore, a character prefix notation seems a good choice: prefixes are already used as modifiers for string literals, so using them as modifiers for set/dict literals is consistent. f{} would fundamentally be a good choice, but since “f” is so prominent for f-strings, maybe we would rather not reuse this letter for sets. Alternatives would be z{} (mnemonic: z is the second prominent sound in “frozen”) or i{} (mnemonic: immutable or iced).

Alternative: Two-char delimiters. Doubling braces is not intuitive. I would instead go for |{...}|. Mnemonic: | are the “walls” that protect the object from modification.

I would refrain from unary operators and in particular any use of * because it’s the operator for unpacking. {*{1, 3}, *{2, 4}} is already valid code so any notation with *{ will be confusing.
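
For anyone who has not used unpacking inside set displays, a short sketch of what already works today (variable names are made up for the example):

```python
odds = {1, 3}
evens = {2, 4}

# * inside a set display unpacks each iterable into the new set.
merged = {*odds, *evens}

# The same unpacking can be frozen with the existing constructor.
merged_frozen = frozenset({*odds, *evens})

print(sorted(merged), isinstance(merged_frozen, frozenset))
```

Since *{ already appears in code like this, giving *{...}* a second meaning would put two readings of the same characters side by side.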

I find doubling braces intuitive because the extra thickness conveys a hardened image to me, but your mnemonic for |{...}| works for me too.

| is a common operator for sets and dicts though so it may be challenging for some to read this quickly: |{1}| | |{2}| | |{3}|
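
For context, this is how | already behaves on frozensets and dicts, spelled with the existing constructor. The dict union needs Python 3.9 or later; the variable names are only for illustration.

```python
# Union of three one-element frozensets with the existing spelling.
union = frozenset({1}) | frozenset({2}) | frozenset({3})

# Mixing frozenset and set: the result takes the type of the left operand.
mixed = frozenset({1}) | {2}

# Dicts support | as a merge operator (Python 3.9+).
merged_dict = {"x": 1} | {"y": 2}

print(sorted(union), type(mixed).__name__, merged_dict)
```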

Of course it is possible: it was done here, using special code in the production rules, to provide syntax error suggestions for || and && (each of which is actually tokenized as two tokens). However, this is done only for invalid rules. And yes, using the production rules to decide whether a rule should be accepted is a hack.

If you wanted to do the same for valid rules, that would be possible in theory. However, it would mean that aspects of the valid syntax would not show up in the Full Grammar specification, defeating the purpose of a Full Grammar specification.

There was a discussion on that approach recently showing all the problems with the {{-approach.

That’s a fair point.

How about:

{|1|} | {|2|} | {|3|}

Much easier to pick out the |s between the frozensets for me, and the shapes of {| and |} look like earmuffs that provide good insulation.

I’ve actually tried very hard to find an example of such a fuzzing test on GitHub with many variations of regexes but couldn’t find any. I think the actual impact is going to be rather insignificant.

This is not how it works :slight_smile:

GitHub is not the only place people store code. If there’s a chance that someone might rely on it, someone is. Breaking stuff for no reason is not a good idea :slight_smile:

I think you’re likely mixing up str.format with an f-string. The string template for str.format does not support arbitrary expressions.

If you want to print {{0}} with a .format string, you would have to write:

>>> '{{0}} is {0}'.format({1, 2})
'{0} is {1, 2}'
>>> '{{{0}}} is {0}'.format({1, 2})
'{{1, 2}} is {1, 2}'
>>> '{{{{0}}}} is {0}'.format({1, 2})
'{{0}} is {1, 2}'

I don’t think it’s that tricky

It is tricky if there are corner cases that you have to explain and support. f{} is not tricky, because it has no corner cases: it works exactly like the familiar {}, but produces a frozendict instead of a dict.

Personally, I like the f{...} notation and I do not see the syntactical similarity as a problem, but to be honest there is another thing that some people (myself not included) might not like.

People say “f-string”. The name “formatted string literal” is seldom used.

I expect we will start to say “f-set”, but not stop using the original “frozenset”, so it will be known under two names.

To me the syntactic similarity is a strength, unless we expect to run into situations where we mistake f{x} for f"{x}".
I have had trouble in the past where an AI would systematically suggest "x, y, z" instead of "x", "y", "z". I had trouble spotting the difference because the two are so similar, and that similarity may also have played a role for the AI.
But I don’t believe f"{some_stuff(we, want=to_format)}" (without any recognisable string-only stuff, and only a single formatted group) is actually a likely pattern for people to want?

I don’t say “formatted string literal” because that is 7 syllables. Since “f-set” and “frozen set” are almost the same length, I expect “frozen set” would remain the common name.

Should this proposal include tuple comprehensions, since they can be viewed as frozen (immutable) lists? A consistent “freeze this” syntax may be desirable, with optimizations as possible.

If f{x} is too close to f"{x}", how about using the upper case version for better readability: F{x}.

F"{1}" is the same as f"{1}"
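
A two-line check of that, for anyone who has never seen the uppercase prefix (the variable names are only for the example):

```python
value = 1

lower = f"{value}"  # the usual spelling
upper = F"{value}"  # uppercase prefix, same meaning for strings

print(lower == upper)
```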

Yes, for f-strings it is the same, but F could be enforced for a new frozenset/frozendict literal. I do not think I have ever encountered F for f-strings in the wild.

This reminds me that lists are unhashable. What would they freeze to under an f[1, 2, 3] syntax, a tuple? And if there were a ‘freeze this’ syntax, what would a tuple ‘unfreeze’ (‘thaw’?) to, a tuple or a list?
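
A small sketch of the list/tuple situation as it stands today, using only existing builtins. "Freezing" and "thawing" here are just the ordinary tuple() and list() constructors, not any proposed syntax.

```python
values = [1, 2, 3]

# Lists are unhashable, so hash() raises TypeError.
try:
    hash(values)
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

# The closest thing to a "tuple comprehension" today: a generator
# expression passed to the tuple constructor.
frozen = tuple(v * 2 for v in values)

# "Thawing" back is a plain list() call.
thawed = list(frozen)

print(list_is_hashable, frozen, thawed)
```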

There’s a lot of desperate straw clutching in this thread debating syntax, f{, |{, {|, {{, *{, z{ etc. At least PEP 802’s {/} resembles existing mathematics notation for the empty set.

Not everything needs to have a syntax literal, like how Python doesn’t have the /my_regex/ syntax that e.g. JavaScript has. Instead one creates a ‘temporary’ string "my_regex" and feeds it to re.compile(). So similarly, what’s wrong with frozenset({a for a in iterable}) and frozendict({k: v for k, v in items if foo(k)})? Is this thread really just trying to micro-optimize out the creation of the ‘temporary’ set and dict?
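
To spell out the analogy in runnable form, a sketch under one loud assumption: since I can't assume a frozendict builtin is available, types.MappingProxyType (a read-only view, not a true frozendict and not hashable) stands in for it. The data and names are made up.

```python
import re
from types import MappingProxyType

# No regex literal: a plain string is handed to a constructor.
pattern = re.compile(r"my_regex")

items = [("a", 1), ("b", 2), ("c", 3)]

# Same pattern for frozenset: build a temporary set, then freeze it.
frozen_keys = frozenset({k for k, v in items if v > 1})

# ASSUMPTION: MappingProxyType is only a stand-in for frozendict here.
read_only = MappingProxyType({k: v for k, v in items if v > 1})

try:
    read_only["z"] = 0
    mutation_allowed = True
except TypeError:
    mutation_allowed = False

print(bool(pattern.search("xx my_regex yy")), sorted(frozen_keys), mutation_allowed)
```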

I’m -1 on the thread, not because I don’t use frozensets, but because I don’t like how it adds more syntax clutter to the entire language, just for one or two special containers.