Add `OrderedSet` to stdlib

I’m in favor of having OrderedSet somewhere in the standard library, probably in collections.

Suppose we don’t want to add a dependency to a project just to support an OrderedSet collection type.

If we don’t want external dependencies, the best way I know to implement an ordered set (without writing a new class, which would be annoying to repeat in multiple packages) is to use the keys of a dictionary. Say my ordered set contains 'apples' and 'oranges'. I can do:

ordered_set = dict()
ordered_set['apples'] = None
ordered_set['oranges'] = None
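
For what it's worth, the same structure can be built in one step with dict.fromkeys, which also drops duplicates while keeping first-seen order. A minimal sketch, where the fruits list is made-up example data:

```python
# Illustrative input with a duplicate; the values are made up for this example.
fruits = ['apples', 'oranges', 'apples', 'bananas']

# dict.fromkeys maps every key to None and keeps first-insertion order
# (guaranteed for dict since Python 3.7).
ordered_set = dict.fromkeys(fruits)

# Iteration and membership tests work directly on the keys.
unique_in_order = list(ordered_set)
has_oranges = 'oranges' in ordered_set
```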

ordered_set will then act in some nice ways like an ordered set. For example, we can write for element in ordered_set... and it will behave as expected. One major shortcoming shows up when we want to print the ordered set: print(ordered_set) reveals the ugly dictionary structure. This ordered_set variable also lacks typical set syntax like ordered_set.add('bananas'). Set operations like union and intersection are only available through the key views (for example ordered_set.keys() | other.keys()), and the result is a plain set, so the order is lost.
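
To make the gap concrete, here is a rough sketch of the kind of class each package currently has to write for itself. It is only an illustration, not a proposed stdlib implementation: the class name and the _items attribute are my own choices, and it leans on collections.abc.MutableSet to supply the operators.

```python
from collections.abc import MutableSet


class OrderedSet(MutableSet):
    """Hypothetical sketch: an insertion-ordered set backed by a dict."""

    def __init__(self, iterable=()):
        # Keys hold the elements; the None values are ignored.
        self._items = dict.fromkeys(iterable)

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)

    def add(self, item):
        self._items[item] = None

    def discard(self, item):
        # Like set.discard, do nothing if the item is missing.
        self._items.pop(item, None)

    def __repr__(self):
        return f"{type(self).__name__}({list(self._items)})"


basket = OrderedSet(['apples', 'oranges'])
basket.add('bananas')

# The |, &, - and ^ operators come from the MutableSet mixin methods.
union = basket | OrderedSet(['oranges', 'cherries'])
common = basket & OrderedSet(['bananas', 'apples'])
```

The mixin methods decide the element order of results such as the intersection, so a real implementation would probably override them to pin down the ordering rules.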

There are perfectly acceptable external packages for OrderedSet, such as ordered-set and orderedset on PyPI, among others. But all of these require adding an external dependency to a package for such a small purpose. OrderedDict has been a part of collections for a very long time, and now even dict objects themselves are natively ordered (guaranteed since Python 3.7). Why do I need an external dependency for ordered sets but not for ordered dicts?

I don’t know much about performance considerations, but my understanding is that ordered sets may be slower than regular sets. I read in a StackOverflow post (don’t have the link right now) that the Cython implementation in the orderedset package is 5x faster than the Python implementation and 5x slower than regular sets. If performance is a concern, it seems like it would make more sense, at least for now, to keep a separate OrderedSet object in collections rather than making existing set objects ordered (the way regular dict objects are now ordered).
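
If anyone wants to check numbers on their own machine, something like the following timeit sketch could be a starting point. It only compares a built-in set against the dict-keys workaround, not the orderedset or ordered-set packages, and the size N and repeat count are arbitrary choices of mine.

```python
import timeit

N = 10_000  # arbitrary number of elements for this sketch


def build_set():
    s = set()
    for i in range(N):
        s.add(i)
    return s


def build_dict_keys():
    # The dict-as-ordered-set workaround: keys are the elements.
    d = {}
    for i in range(N):
        d[i] = None
    return d


# Total time in seconds for 20 runs of each builder.
set_time = timeit.timeit(build_set, number=20)
dict_time = timeit.timeit(build_dict_keys, number=20)

print(f"set: {set_time:.4f}s  dict-as-ordered-set: {dict_time:.4f}s")
```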

I haven’t contributed to Python before. @pylang mentioned “you might need a bit more rationale”. Are there other types of rationale we should be thinking about other than what I’ve mentioned above?
