Is it possible right now to make a user-defined class that is compatible with **d apart from by subclassing dict?
Generally I think it is better if these things are based on well-defined method-based protocols rather than nominal types. In the case of *d it is the iterator protocol. For **d it can be the iterator protocol as well except the expectation is to yield tuples for the pairs.
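On the opening question: as far as I know, `**` unpacking in CPython already works on any object that provides `keys()` and `__getitem__`, so subclassing dict is not required. A minimal sketch, where `PairBag` and `show` are hypothetical names made up for illustration:

```python
class PairBag:
    """Hypothetical class: not a dict subclass, yet usable with **."""

    def __init__(self, pairs):
        self._pairs = list(pairs)

    def keys(self):
        # ** unpacking asks for the keys first...
        return [k for k, _ in self._pairs]

    def __getitem__(self, key):
        # ...and then looks up each key individually.
        for k, v in self._pairs:
            if k == key:
                return v
        raise KeyError(key)


def show(**kwargs):
    return kwargs


bag = PairBag([("a", 1), ("b", 2)])
merged = {**bag}      # dict display unpacking
called = show(**bag)  # call-site unpacking
```

Note that this is a two-step protocol (keys, then lookups), which is different from simply iterating over pairs.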
True, but this is already the case for dict.__init__ and dict.update.
And this applies to pretty much anything where there is some sort of protocol and the machinery only determines whether the input is valid after receiving it.
Some of those cases are also similar in that the problem can become apparent only after a large amount of work has been done. A few examples besides dict.__init__ and dict.update:
operator.attrgetter('a', object()). Internally it will need to iterate through *args.
isinstance(arg, (str, object()))
I can see that this is best avoided if possible, but given that this is only an extension of an already existing protocol, and this has been done numerous times in other places, I don’t see how this can be a detrimental factor.
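To make the two examples above concrete, here is a small sketch of how the invalid element is only discovered once the arguments are actually walked. The variable names are mine, and the behavior is what I would expect from CPython:

```python
import operator

attrgetter_error = None
isinstance_error = None

# attrgetter walks its arguments and rejects the non-string one.
try:
    operator.attrgetter("a", object())
except TypeError as exc:
    attrgetter_error = exc

# isinstance checks the tuple entry by entry: a match on the first
# entry returns before the invalid second entry is ever inspected.
early_match = isinstance("text", (str, object()))

# With a value that does not match str, the invalid entry is reached.
try:
    isinstance(42, (str, object()))
except TypeError as exc:
    isinstance_error = exc
```

So the same malformed tuple passes silently for one input and fails for another, which is the "validity determined after input" pattern being described.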
While these methods do accept multiple types, I think they do so for historical reasons. I think a more “modern” interface would have had dict.from_items instead of having dict support iterables and mappings in the same spot. Similarly, update wouldn’t need to take a mapping since |= supplants that use.
Interfaces that support multiple types in the same spot are problematic because:
they limit the type errors that type checkers can catch, and
they result in code that is harder to read and reason about (the minor benefit to code writers is rarely worth making code harder to read).
I don’t think it’s good language design to try to make iterables of pairs work like mappings.
A single mapping, in my opinion. I realize many people would disagree with me. Just my opinion.
It’s also interesting that the usage dict(a=foo, b=bar, **c, **d) is a bug magnet whenever c or d don’t contain string keys. Some linters will ask you to use the dict display instead.
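A quick illustration of that bug magnet, using a made-up dict `c` with a non-string key. The dict display accepts it, while the dict(...) call form treats the unpacked keys as keyword names and rejects it:

```python
c = {1: "one"}  # non-string key

# Dict display: any hashable key is fine.
display = {"a": 0, **c}

# Call form: unpacked keys become keyword arguments, which must be strings.
call_error = None
try:
    dict(a=0, **c)
except TypeError as exc:
    call_error = exc
```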
An iterable of pairs.
That’s fair, but my argument wasn’t just about idealism. I am arguing that limited interfaces are better in principle, whereas you’re arguing for more permissive interfaces. I don’t think the existence of permissive interfaces is evidence that permissive interfaces are better. They’re still bad. And I think we should avoid creating more.
I don’t have a strong opinion on this. My approach is to digest as much information as possible, collect as many use cases as possible, and arrive at an optimal implementation. Sometimes more permissive is better; sometimes separating constructors/methods based on input type is better.
But I think I do agree with you that the latter, in practice, is much more often the case, at least when I look back it seems so.
Not really, I am concentrating on this specific case.
I don’t like hard rules like that too much. There are always exceptions. E.g. a case where theoretically (proven by science or whatever) only 2 types of input exist. But yes, I get what you are saying and agree to a large degree.
But again, I am more interested in the case at hand, given the situation as it is.
What I mean is, I don’t see any added value compared to simply creating a dictionary from key-value pairs. It doesn’t even improve performance. Unpacking does not cause rehashing (citation needed).
Also, the current behavior reflects a good separation of concerns. The ** operator does not need to handle creating dictionaries from key-value pairs.
class Key:
    def __init__(self, name):
        self.name = name

    def __hash__(self):
        print(f"Hashing {self.name}")
        return hash(self.name)

a = {Key('a'): 1}
print("Unpacking...")
b = {**a}
Result:
Hashing a
Unpacking...
Overall, this would only slow down the ** operator slightly and make it a more complex, ‘magic’ syntax.
Maybe not in the way that you think I meant, but it does avoid an intermediate dictionary when one has an iterable of key-value pairs.
For {**d1, **pairs, **d2}, this would be as performant as:
d = dict(d1)
d.update(pairs)
d.update(d2)
Although the former might be more convenient syntax for some.
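For anyone who wants to try the semantics today, the update chain above can be wrapped in a helper. This is only a sketch of the proposed behavior: `merge_sources` is a hypothetical name, not an existing function, and it leans on the fact that dict.update already accepts either a mapping or an iterable of pairs.

```python
def merge_sources(*sources):
    """Hypothetical helper: merge mappings and iterables of pairs, left to right."""
    result = {}
    for source in sources:
        # Later sources overwrite earlier ones, as with {**a, **b}.
        result.update(source)
    return result


d1 = {"a": 1}
pairs = [("b", 2), ("a", 10)]
d2 = {"c": 3}

merged = merge_sources(d1, pairs, d2)

# What has to be written today to get the same result with ** syntax.
today = {**d1, **dict(pairs), **d2}
```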
But some performance benefit would be there for func(**pairs) which is not available now.
Yes, I suppose hashing is a much larger factor. Well, maybe not that much larger for some optimized types such as int / str (not sure about this), but it doesn’t matter, as no rehashing is done and that is not what I was referring to.
Maybe, maybe not. Depends on POV. If I am a user in the situation where I have key-value pairs and want to pass them as arguments, then my concern is to do it in the most convenient way possible, and:
foo(**pairs)
could be more convenient than
foo(**dict(pairs))
for many users.
With the additional performance benefit of not creating an intermediary dict object and needing only 1 iteration instead of 2.
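For context, this is how the two spellings behave right now. `foo` and `pairs` are placeholder names:

```python
def foo(**kwargs):
    return kwargs


pairs = [("x", 1), ("y", 2)]

# Today: a list of pairs is rejected because it is not a mapping.
error = None
try:
    foo(**pairs)
except TypeError as exc:
    error = exc

# Today's workaround: build the intermediate dict first.
result = foo(**dict(pairs))
```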
But how exactly does the interpreter determine what it is? Does it prioritize strict dicts, or all objects that implement the collections.abc.Mapping protocol, which really only requires the methods __getitem__, __iter__, and __len__, which are also implemented by a sequence?
Also, don’t get me wrong, I am not hard-vouching for this, just taking time to respond to arguments that I don’t find convincing enough (regardless of whether they are for or against).
The user may recognize pairs of items in the iterable, but the interpreter only sees individual items, not key-value pairs.
This means updating continues until a failure occurs. The user would need to wrap it in a try/except block, similar to how update is handled, or create a dictionary beforehand.
At first, it seems like a convenient shortcut, but it also inherits all the complications of creating a dictionary from pairs.
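The "updating continues until a failure occurs" point can be seen with dict.update as it exists today. A small sketch with made-up data, where the last element is deliberately not a pair:

```python
target = {"a": 1}
pairs = [("b", 2), ("c", 3), "oops"]  # last element is a 4-character str, not a pair

error = None
try:
    target.update(pairs)
except ValueError as exc:
    error = exc

# The valid pairs before the bad element were already applied.
partially_updated = dict(target)
```

Building the dictionary beforehand with dict(pairs) would raise the same error, but without touching `target`.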
Ah fair enough. dict.update indeed prioritizes objects that implement the mapping protocol. Since the proposal is simply about making the ** operator consistent with the existing behavior of dict.update, I think I’m in support of the proposal now.
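As I understand the documented behavior, dict.update does not check against collections.abc.Mapping at all: it looks for a keys() method, and anything without one is treated as an iterable of pairs. A sketch with a hypothetical `WithKeys` class:

```python
class WithKeys:
    """Hypothetical object exposing only keys() and __getitem__."""

    def keys(self):
        return ["a", "b"]

    def __getitem__(self, key):
        return key.upper()


# Has keys(): update() takes the mapping path and calls __getitem__ per key.
from_mapping_like = {}
from_mapping_like.update(WithKeys())

# A list has __getitem__, __iter__ and __len__ but no keys():
# update() takes the iterable-of-pairs path.
from_pairs = {}
from_pairs.update([("a", "A"), ("b", "B")])
```

So a sequence is never mistaken for a mapping here, because the deciding attribute is keys().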
I think this is one of those errors that are caught by a global try-except with the error message “Contact developer”.
Then this should also always be done?
try:
    func(**dict(pairs))
except:
    exit()
as it is prone to exactly the same error.
It does not seem to have any additional technical issues compared to {**dict(pairs)} and func(**dict(pairs)).
At least none have been pointed out so far.
Ok, I will stop defending this and will leave this fun part to OP if he wishes to pursue this.
Personally, I am positive on this, provided it can be shown that it would be useful enough in practice to justify the effort and additional complexity.
If there is little usage, then I am +0.1, given there is agreement that dict.update argument type protocol is not changing. I just like consistency I suppose, given the opportunity costs can be inferred accurately enough and are minimal. E.g. the API is final enough to tell that it has little to no chance of obstructing better things in the future and similar.
Not necessarily. You can leave func(**mapping) outside the try/except block if you don’t expect any exceptions from func. This is more about guarantees that certain current syntaxes will never raise exceptions, as they are designed to always work.
d1 = {(0, 1): "foo"}
d2 = {(2, 3): "bar", (4, 5): "baz"}
d1.update(d2) # obvious, normal. works fine
d1.update(**d2) # fails, update() doesn't take keywords
d1.update(*d2.items()) # fails here: unpacks into two positional arguments, update() takes at most one
d1.update(*d2) # fails the same way, and is not what you want anyway
d1.update(**d2.items()) # this thread would make this work
update() does take keywords but it would fail because keywords have to be strings and tuples are not strings.
This proposal would make d1.update(**d2.items()) behave like d1.update(**d2) when d2 is a mapping, so it would fail in the same way as d1.update(**d2) because tuples are not strings and can’t be keywords.
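To check what each spelling does today, here is a small harness that runs every call against a fresh copy of d1 and records the outcome. `attempt` and `outcomes` are names I made up, and the results are what I would expect from current CPython. With two items in d2, both starred forms pass two positional arguments, which update() rejects.

```python
d2 = {(2, 3): "bar", (4, 5): "baz"}


def attempt(call):
    """Run call on a fresh d1; return 'ok' or the exception type name."""
    d1 = {(0, 1): "foo"}
    try:
        call(d1)
    except Exception as exc:
        return type(exc).__name__
    return "ok"


outcomes = {
    "update(d2)": attempt(lambda d1: d1.update(d2)),
    # Keyword names must be strings, and the keys here are tuples.
    "update(**d2)": attempt(lambda d1: d1.update(**d2)),
    # Two positional arguments: update() accepts at most one.
    "update(*d2.items())": attempt(lambda d1: d1.update(*d2.items())),
    "update(*d2)": attempt(lambda d1: d1.update(*d2)),
    # dict_items is not a mapping, so ** rejects it today.
    "update(**d2.items())": attempt(lambda d1: d1.update(**d2.items())),
}
```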
I can see the benefit of this. Would it also make sense for this idea of “unpacking pairs” to work in the other direction, i.e. not just from a list of tuples to key/value pairs in a dict, but from key/value pairs in a dict to a list of tuples? Currently: