Tuples with named items/elements

alextretyak · October 25, 2023, 12:00am

I think that collections.namedtuple is not Pythonic enough for some use cases.

Recently, I wrote code like this:

disciplines = []
for row in csv.reader(open('...csv')):
    level = 0
    while row[level] == '': level += 1
    disciplines.append((level, row[5], row[6].split('; ')))

for i, d in enumerate(disciplines):
    if i < len(disciplines)-1 and disciplines[i+1][0] > disciplines[i][0]:
        continue # this is disciplines' category — skip it

    _, discipline_name, compets = d
    ...
    for compet in compets:
        ...

It would be more readable to write disciplines[i+1].level > disciplines[i].level instead of disciplines[i+1][0] > disciplines[i][0].

This can be achieved by using a named tuple:

    disciplines.append(namedtuple('_', 'level, name, compets')(level, row[5], row[6].split('; ')))

[The line _, discipline_name, compets = d also becomes unnecessary: discipline_name is simply replaced with d.name, and in compets: with in d.compets:.]

But I think Python should allow a more concise and readable syntax:

    disciplines.append((level = level, name = row[5], compets = row[6].split('; ')))

Or even this:

    disciplines.append((level=, name=row[5], compets=row[6].split('; ')))

(According to this proposal.)

So I propose allowing a tuple to be constructed from keyword arguments, which would make its items accessible by name.

Tinche · October 25, 2023, 12:18am

Have you considered using a dictionary instead?

Also, I don’t think you’d ever want to call namedtuple inside a loop; the performance of that will probably be pretty bad.
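
To illustrate the point about calling namedtuple in a loop, here is a minimal sketch. Every call to namedtuple builds a brand-new class, so doing it per row repeats that work for each item. The field names come from the first post; the sample values and the name Row are made up for illustration.

```python
from collections import namedtuple

# Calling namedtuple() twice with identical arguments yields two distinct classes.
a = namedtuple('_', 'level, name, compets')(0, 'foo', ['x'])
b = namedtuple('_', 'level, name, compets')(0, 'foo', ['x'])

print(type(a) is type(b))  # False: each call built a new class
print(a == b)              # True: they still compare equal as plain tuples

# Creating the class once, outside the loop, avoids the repeated work.
# "Row" is a hypothetical name chosen for this sketch.
Row = namedtuple('Row', 'level, name, compets')
rows = [Row(level, 'foo', ['x']) for level in range(3)]
print(rows[2].level)       # 2
```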

Rosuav · October 25, 2023, 12:56am

Have you tried types.SimpleNamespace? It’s basically like you’re describing - a big ol’ pile of attributes - albeit without the iteration, so you can’t treat it as a sequence.

Alternatively, consider a dataclass, which you could easily make iterable.

>>> import dataclasses
>>> @dataclasses.dataclass
... class Discipline:
...     level: int
...     name: str
...     compets: str
...     def __iter__(self):
...             yield self.level
...             yield self.name
...             yield self.compets
... 
>>> Discipline(1, "foo", "spam")
Discipline(level=1, name='foo', compets='spam')
>>> list(_)
[1, 'foo', 'spam']
>>> Discipline(1, "foo", "spam").name
'foo'
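
For comparison, the SimpleNamespace option mentioned above might look roughly like this. It is only a sketch with invented sample values, not code from the original script.

```python
from types import SimpleNamespace

disciplines = []
# Sample values are invented for illustration.
disciplines.append(SimpleNamespace(level=0, name='foo', compets=['a', 'b']))
disciplines.append(SimpleNamespace(level=1, name='bar', compets=['c']))

# Attribute access reads the way the first post asks for.
print(disciplines[1].level > disciplines[0].level)  # True

# A SimpleNamespace is not a sequence, so tuple-style unpacking fails.
try:
    level, name, compets = disciplines[0]
except TypeError as exc:
    print('not iterable:', exc)
```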

alextretyak · October 25, 2023, 1:49am

Hm. It looks like exactly what I needed (I don’t need the iteration here). Thanks for the good advice!

Rosuav wrote: “Alternatively, consider a dataclass, which you could easily make iterable.”

Just for completeness: the same can be achieved using typing.NamedTuple:

from typing import NamedTuple

class Discipline(NamedTuple):
    level: int
    name: str
    compets: str

print(Discipline(1, 'foo', 'spam'))
print(Discipline(level=1, name='foo', compets='spam'))
print(list(Discipline(level=1, name='foo', compets='spam')))

alextretyak · October 25, 2023, 2:32am

Yes, I have considered a dictionary.
But I dislike disciplines[i]['level'] syntax (disciplines[i].level is more readable).

By the way, in JavaScript you can write:

disciplines = []
disciplines.push({level: 1, name: 'foo'})
console.log(disciplines[0].level)

Rosuav · October 25, 2023, 2:50am

Yeah, that’s exactly what either a dataclass or SimpleNamespace is good at.

d_n · October 25, 2023, 3:23am

+1 @Rosuav dataclasses and generators
+1 @alextretyak dotted-notation
These days I use dataclasses and rarely think about namedtuples, for exactly the same reasons: I prefer the dotted notation and make keen use of the ability to pass in data as keyword arguments.
I am not a fan of SimpleNamespace because it lacks structure and is completely variable. It has its place in coping with dynamic situations. Here, the data structures are known in advance, so that information could (should?) be recorded in the code, rather than leaving some hapless future soul to contend with an extra layer of ‘figuring things out’. (IMHO, YMMV, etc.)
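
As a rough sketch of what recording the structure in the code could look like, here is the loop from the first post rewritten around a dataclass. The CSV content is invented and read from an in-memory string, since the real file is not shown in the thread; the column positions (5 and 6) are taken from the original code.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Discipline:
    level: int
    name: str
    compets: list


# Invented sample data: the first non-empty column gives the nesting level.
sample = (
    "Sports,,,,,Sports,\n"
    ",Running,,,,Running,100 m; 200 m\n"
    ",Jumping,,,,Jumping,High jump\n"
)

disciplines = []
for row in csv.reader(io.StringIO(sample)):
    level = 0
    while row[level] == '':
        level += 1
    disciplines.append(
        Discipline(level=level, name=row[5], compets=row[6].split('; '))
    )

leaves = []
for i, d in enumerate(disciplines):
    # An entry followed by a deeper entry is a category, so skip it.
    if i < len(disciplines) - 1 and disciplines[i + 1].level > d.level:
        continue
    leaves.append(d.name)

print(leaves)  # ['Running', 'Jumping']
```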

intellimath · October 25, 2023, 8:06am

If you look into the implementation, you can see that SimpleNamespace is just a proxy for a dict, with access by attribute instead of by key.
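
A short sketch of that relationship, using only vars() and keyword unpacking; the sample values are invented.

```python
from types import SimpleNamespace

ns = SimpleNamespace(level=1, name='foo')

# The attributes live in an ordinary dict, reachable through vars().
print(vars(ns))            # {'level': 1, 'name': 'foo'}

# Writing through that dict is visible as an attribute.
vars(ns)['compets'] = ['a']
print(ns.compets)          # ['a']

# An existing dict can be turned into a namespace by unpacking it.
record = {'level': 2, 'name': 'bar'}
ns2 = SimpleNamespace(**record)
print(ns2.level)           # 2
```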

gkb · October 25, 2023, 9:11am

I think you are not using namedtuple the way it is intended to be used.

What you would usually do is:

from collections import namedtuple

Discipline = namedtuple("Discipline", "level, name, compets")

disciplines.append(Discipline(level=1, name="foo", compets="spam"))

alextretyak · October 25, 2023, 10:36am

IMO this makes sense only when Discipline is used more than once (i.e. at least twice).

In my case I don’t want to add an extra class (even if it is just a single line of code), because disciplines is an intermediate container of temporary items, which are not needed elsewhere. And disciplines.append(Discipline(...)) just looks silly (take no offense, please), and will become even more silly for e.g. some_specific_objects_with_a_very_long_name.append(SomeSpecificObjectWithAVeryLongName(...)).

emplace_back() was introduced in C++11 for a good reason:

std::vector<Item> items;
items.push_back(Item(1, 2, 3)); // before C++11
items.emplace_back(1, 2, 3);

gkb · October 25, 2023, 12:57pm

Right, I just saw you mention typing.NamedTuple and wanted to highlight that collections.namedtuple allows the same (minus the type hints).

PythonCHB · October 30, 2023, 4:40pm

Hmm – it sounds to me like the issue here is that you are using a generic container (e.g. list) but what you really want is a “Sequence of these specific things” – maybe write a class that is just that. I’d perhaps write it around dataclasses, so you could specify the list-of-stuff and the stuff at once.

Then you’d get something like:

@list_of_dataclasses
class DisciplinesList:
    [define your fields here]

disciplines = DisciplinesList()

disciplines.append(field1=this, field2=that, …)

Though frankly, if you really are just passing this through, then simple dicts seem to me the right solution.

As for emplace_back - it wouldn’t be hard to make a list subclass with that method, so maybe that’s even better than my suggestion above – though wouldn’t you have to define the Item type anyway, which you don’t want to do?
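
A minimal sketch of such a list subclass, under the assumption that the item type is passed in when the list is created. EmplaceList and its item_type attribute are hypothetical names for this illustration, and, as noted, the item type still has to be defined somewhere.

```python
from dataclasses import dataclass


class EmplaceList(list):
    """Hypothetical list subclass that builds its items in place."""

    def __init__(self, item_type):
        super().__init__()
        self.item_type = item_type

    def emplace_back(self, *args, **kwargs):
        # Construct the item from the arguments, then append it.
        self.append(self.item_type(*args, **kwargs))


@dataclass
class Discipline:
    level: int
    name: str
    compets: list


disciplines = EmplaceList(Discipline)
disciplines.emplace_back(level=1, name='foo', compets=['a', 'b'])
print(disciplines[0].name)  # foo
```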

hansgeunsmeyer · October 30, 2023, 9:11pm

If you don’t want to define a dataclass or dict, you could instead define a little helper function (which doesn’t need to be accessible globally):

def _(level=0, name=None, compets=None):
    return (level, name, compets)

and call

    disciplines.append(_(level=..., name=..., compets=...))

Btw, all this would be a lot easier (and less brittle) if you’re able to use pandas to process the file.