I just found TimeComplexity wiki page, which I think is great. However, it covers just list, set, dict and collections.deque. Discoverability is also an issue. I think it would be great if we started including time and space complexity in the official documentation for most (all?) builtins and stdlib.
I think most experienced Python developers probably know these for the common things. But I am also pretty sure that most people don’t know these things for constructs that you use less frequently, for example graphlib.TopologicalSorter.
It would be a lot of effort to add these, but I believe these don’t change often.
I think there would be multiple benefits to this:
It would be easier to write well-perfoming code when you could easily find this information
It would help you come up with time and space complexity numbers for your own code (for example when interviewing, or for school assignments)
It could point someone to provide improvements if they noticed from the documentation that something was implemented with suboptimal algorithm
AI tools could take advantage of this information
Would this be something you’d want to see? Or do you think it would be just clutter, and people should just profile code if they thought it was too slow, and go from there?
What surprising you have found here? Despite some naming quirks (mentioned in the Glossary) — everything seems to behave as you would expect from such data structures. The deque docs also explain difference wrt lists.
This might make sense on case-by-case basis. But note, that in general it’s implementation details.
Maybe it wasn’t clear, but I consider those four classes commonly used, and as such most experienced Python developers probably know these. If someone was new to Python, and did not read the link you mentioned, they might be surprised by list.insert(). But it is actually documented, which is great.
By implementation details I assume you mean differences between CPython, Jython, PyPy etc. So if the documentation at https://docs.python.org should cover any implementation, then it would not make sense to add time and space complexity there, unless we specifically want to lock down some implementation details that way.
It seems like if we wanted comprehensive complexity documentation by implementation it would need to live elsewhere.
I meant that the CPython docs is not a replacement for a textbook on CS. All we need is to guide people how Python types map to common data structures.
Yes, it should. CPython-specific details are documented with a specific note type. See e.g. the id function.
That’s kinda the canonical example of O(n) operations, though. It’s exactly why insertion sort runs in O(n²) time.
I think the wiki is the right place for this. To a lot of programmers, this will be completely unexciting; they just want to write idiomatic code and don’t really care about the asymptotic complexity of specific method calls. (For example, if you’re not working with ginormous lists, the exact behaviour of these methods is utterly irrelevant, and your own code will have far more impact on the run time.) That page was last edited in 2023; perhaps you’d like to be the one to bring it into the future?
I think the time complexity of operations is an important aspect of the implementation of the language. I’d like to see it in the formal documentation someplace. The CPython docs already include implementation-specific aspects, and the free-threading documentation effort will be grappling with how to document threading guarantees. Perhaps we can adopt a common approach to these kinds of details.
The wiki to me seems too informal for this kind of information.
I like complexity tables in C++ docs. E.g. deque O(1) random access can be clearly seen. Takes more effort to figure out the same for Python.
I don’t think it is a bad thing to expose users to some complexity details. If they don’t want it, they can skip, but I think that such knowledge helps making better decisions.
Historically, I believe the problem has been that while C++ considers time complexity to be a part of the documented contract for the language, Python considers it as an implementation detail (and not something we want to commit to across versions). I think we’d need some way to make it clear that unlike C++, Python doesn’t guarantee that complexity will remain the same across versions.
But all that being said, I too would like to see more clarity on the complexity of various operations in Python.
I think there are guarantees we’d absolutely want to commit to across versions. As a hypothetical example, if someone proposed a change that made dict lookup worse than O(1), do you think it would be accepted?
Other implementations might have different time complexities in their built-ins. Those would be important details to clarify in those implementations.
This exact discussion happened back when MicroPython was new. Is O(1) string subscripting a language guarantee or an implementation detail? Turns out, it’s an implementation detail, and uPy can store strings internally in UTF-8 without breaking language requirements.
So, maybe such a change wouldn’t be accepted into CPython, but making it a documented feature of the language would also constrain other implementations.
This would be my concern as well. We would need to clearly distinguish such things as descriptive documentation of CPython implementation details rather than normative documentation of Python behavior.
But that’s not the only question. Another question is, if someone makes an alternative implementation and that implementation has a dict lookup worse than O(1), is that alternative implementation still Python?
Almost certainly yes, for the same reason as uPy’s O(n) strings are valid. So all these things have to be documented as implementation details, not guarantees, and they could change in different versions even within the same interpreter. That’s why I think this is best done on the wiki; it can be exactly as up-to-date as people are willing to put in the effort to make it, without imposing demands on future versions or other implementations.
(Plus, the wiki can always do with more content. I’m never gonna say no to adding useful content to the wiki.)