I’d like to propose this new simple abstract base class.
Reason #1 is that strings are technically containers/sequences, but in practice they are treated as simple types.
This nested sequence: [1, [2, 3]]
contains 1, 2 and 3. That is what we would get after flattening it.
But does this nested sequence: ["one", ["two", "three"]]
contain “one”, “two” and “three”?
Yes, but only if you agree with reason #1. Otherwise the answer is: a nested sequence of “o”, “n”, “e”, “t”, “w”, “o”, “t”, “h”, “r”, “e”, “e”.
Reason #2 is that strings must be special-cased each time a nested data structure is processed.
The code below works for sequences, except when there are strings inside.
from collections.abc import Sequence

def flatten(data):
    if isinstance(data, Sequence):  # <-- a place for the NonStringSequence
        for item in data:
            flatten(item)
    else:
        print(f"got: {data!r}")
Strings are special because they are not sequences of characters. There is no character type in Python, so every item of a string is itself a string of length one. This leads to infinite recursion:
>>> msg="string"
>>> type(msg) is type(msg[0]) is type(msg[0][0]) # ... etc ...
True
>>> s="X"
>>> s is s[0] is s[0][0] # ... etc ...
True
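To make the failure concrete, here is a minimal sketch of the same recursive walk. The names `naive_flatten`, `out` and `raised` are mine, not part of the proposal, and the function collects items in a list instead of printing them. It works for nested numbers but never bottoms out on a string:

```python
from collections.abc import Sequence

def naive_flatten(data, out):
    # Treats every Sequence as a container, including str.
    if isinstance(data, Sequence):
        for item in data:
            naive_flatten(item, out)
    else:
        out.append(data)

out = []
naive_flatten([1, [2, 3]], out)
print(out)  # [1, 2, 3]

raised = False
try:
    # "one" -> "o" -> "o" -> ... : each item is again a one-character str
    naive_flatten(["one"], [])
except RecursionError:
    raised = True
print(raised)  # True
```

An empty string is the only string that does not trigger this, because iterating over it yields nothing.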
IMO almost every programmer has been bitten by this. After learning it the hard way, we often write:
if isinstance(data, (list, tuple)):
I found that pattern 100+ times in the standard library, but I did NOT check whether it was used for the reasons discussed here.
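For illustration only, here is one way such an ABC could be sketched today with `__subclasshook__`. This is my assumption of how it might look, not a proposed implementation. In particular, excluding `bytes` and `bytearray` along with `str` is a choice I made for the example, and the generator-based `flatten` is hypothetical:

```python
from abc import ABCMeta
from collections.abc import Sequence

class NonStringSequence(metaclass=ABCMeta):
    """Hypothetical ABC: any Sequence that is not string-like."""

    @classmethod
    def __subclasshook__(cls, C):
        if cls is NonStringSequence:
            # Assumption: bytes-like types are excluded together with str.
            if issubclass(C, (str, bytes, bytearray)):
                return False
            if issubclass(C, Sequence):
                return True
        return NotImplemented

def flatten(data):
    # Strings fall through to the else branch and are yielded whole.
    if isinstance(data, NonStringSequence):
        for item in data:
            yield from flatten(item)
    else:
        yield data

print(list(flatten(["one", ["two", "three"]])))  # ['one', 'two', 'three']
print(list(flatten([1, (2, [3]), range(2)])))    # [1, 2, 3, 0, 1]
```

Unlike the `(list, tuple)` check, this also accepts other sequences such as `range` without listing them one by one.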