Thank you, but there’s no need to do this if it feels weird to do so. I was just chiming in because I like to. It’s fun. I’m not trying to claim credit for other people’s work.
I’m looking forward to seeing this book. I’ll keep an eye out for it!
In general, what you think is efficient in Pandas probably isn’t actually the best approach. And what you benchmark as efficient in Pandas N.m will probably be different in Pandas N.m+1. I’ve taught a lot of numeric Python, and this is… well, complicated.
I had all the others, but not “Mutating an object you are iterating over” … that’s worth adding. Thanks.
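For anyone skimming the thread, here is a minimal sketch of that pitfall. The list contents and variable names are made up for illustration; the point is only that removing items from a list while looping over it makes the loop skip elements.

```python
numbers = [1, 2, 2, 3, 4, 4, 5]

# Buggy: removing from the list while iterating over it.
# Each removal shifts later items left, so the loop skips the next one.
buggy = list(numbers)
for n in buggy:
    if n % 2 == 0:
        buggy.remove(n)

# Safer: build a new list instead of mutating the one being iterated.
safe = [n for n in numbers if n % 2 != 0]

print(buggy)  # [1, 2, 3, 4, 5] -- some even numbers survive
print(safe)   # [1, 3, 5]
```

Iterating over a copy (`for n in list(buggy):`) is another common way around it when in-place removal is really wanted.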
Don’t worry, you still won’t get actual royalties from being acknowledged. :-). It’s of limited benefit, but feels respectful. I hope I have everyone in this thread who has suggested anything helpful… even the things that were already in my TOC.
Ah, so now we have a Python mistake:
/me resists the temptation to say that the mistake is using pandas
Yes, recall reading the same somewhere too.
Coming from C, my mental model is that all variables in Python referring to a mutable type are simply pointers, though in use they are automatically dereferenced. In assignment and calling they are just pointers. Immutables are generally names with values, though they behave as aliases in assignment and pass-by-value in calling. I haven’t looked at the Python implementation but presume it is something like that.
@DavidMertz I’m not sure of the intended audience, though assuming beginner-intermediate, this whole topic could be a bit difficult to unpack in a single one of the 100, though it is likely fundamental to a few of them. Perhaps a prelude or appendix chapter would be a good anchor for the list, touching on Python foundation concepts that may be different from other languages and/or are general pitfalls.
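One way to poke at that mental model is a small experiment. This is a sketch, not a description of CPython internals, and the names (`rebind`, `mutate`) are invented for the example. It suggests that assignment and argument passing bind names to the same object for mutable and immutable types alike; the visible difference is only whether the object can be changed in place.

```python
# Mutable object: two names, one list.
a = [1, 2]
b = a
b.append(3)
print(a is b, a)      # True [1, 2, 3]

# Immutable object: assignment still binds a second name to the same object.
x = (1, 2)
y = x
print(x is y)         # True

def rebind(seq):
    # `+` builds a new object; only the local name `seq` is rebound.
    seq = seq + seq
    return seq

def mutate(lst):
    # In-place change: visible through every name bound to the list.
    lst.append(99)

rebound = rebind(x)
mutate(a)
print(x, rebound)     # (1, 2) (1, 2, 1, 2)
print(b)              # [1, 2, 3, 99]
```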
I have some in mind:
enumerate at all
f"Hello {name}"
for i in range(len(my_list))
except: (it was already said)
from any_single_module import * (except in quick manual tests, or in modules)
__all__ in a module
import braces
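A short sketch of a few of the idioms named in that list, assuming the intent is to contrast index-based loops with enumerate and a bare except: with a specific one. The data and the `parse_int` helper are hypothetical.

```python
my_list = ["spam", "eggs", "ham"]

# Index-based loop, as in the `for i in range(len(my_list))` item.
by_index = []
for i in range(len(my_list)):
    by_index.append((i, my_list[i]))

# enumerate yields (index, item) pairs directly.
by_enumerate = list(enumerate(my_list))

# f-string formatting, as in the f"Hello {name}" item.
name = "world"
greeting = f"Hello {name}"

def parse_int(text):
    # A bare `except:` would also catch KeyboardInterrupt and SystemExit;
    # naming the exception keeps the handler narrow.
    try:
        return int(text)
    except ValueError:
        return None

print(by_index == by_enumerate)           # True
print(greeting)                           # Hello world
print(parse_int("7"), parse_int("x"))     # 7 None
```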
I listed only some of “mistakes” that are purely Pythonic.
How about a section on improper overloading of comparison operators when defining classes?
This can have various adverse consequences. For example, if a __lt__ method is poorly defined for a class, it can create problems for sorting a list of instances of the affected class.
The comparison operators should be transitive, meaning that if a < b and b < c both evaluate to True, then a < c should also evaluate to True. The following example violates that requirement, which has adverse consequences for sorting.
In the hand game of Rock, Paper, Scissors, rock beats scissors, scissors beats paper, and paper beats rock.
See:
The code below uses a __lt__ method to implement the logic of the game. But in doing so, it creates a cyclic order, which is not transitive. An attempt to sort a list of instances of the defined RPS class goes wrong.
import random

class RPS():
    num_val = {"rock": 0, "paper": 1, "scissors": 2}

    def __init__(self, tool):
        self.tool = tool

    def __lt__(self, other):
        return (RPS.num_val[self.tool] - RPS.num_val[other.tool]) % 3 == 2

    def __eq__(self, other):
        return self.tool == other.tool

    def __repr__(self):
        return f"RPS(\"{self.tool}\")"

r = RPS("rock")
p = RPS("paper")
s = RPS("scissors")

# demonstrate the cyclical ordering of RPS objects
# these six evaluate to True
print(r > s)
print(s > p)
print(p > r)
print(r == r)
print(p == p)
print(s == s)

# these six evaluate to False
print(r < s)
print(s < p)
print(p < r)
print(r == s)
print(s == p)
print(p == r)

# make a list of twelve RPS objects
playing_pieces = [RPS(random.choice(("rock", "paper", "scissors"))) for _ in range(12)]

# try to sort the list of RPS objects
playing_pieces.sort()
print('\nThe "sorted" playing pieces:')
for piece in playing_pieces:
    print(piece)
The portion of the output relating to the sort was as follows:
The "sorted" playing pieces:
RPS("rock")
RPS("paper")
RPS("paper")
RPS("scissors")
RPS("scissors")
RPS("rock")
RPS("rock")
RPS("rock")
RPS("paper")
RPS("paper")
RPS("paper")
RPS("paper")
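For a small set of values, transitivity can also be checked by brute force. The sketch below is an illustration, not part of the post above: the helper name `transitivity_violations` is invented, and the class is a trimmed copy of RPS so the block runs on its own.

```python
from itertools import permutations

class RPS:
    num_val = {"rock": 0, "paper": 1, "scissors": 2}

    def __init__(self, tool):
        self.tool = tool

    def __lt__(self, other):
        return (RPS.num_val[self.tool] - RPS.num_val[other.tool]) % 3 == 2

    def __repr__(self):
        return f"RPS({self.tool!r})"

def transitivity_violations(values):
    """Return every (a, b, c) with a < b and b < c but not a < c."""
    return [
        (a, b, c)
        for a, b, c in permutations(values, 3)
        if a < b and b < c and not a < c
    ]

pieces = [RPS("rock"), RPS("paper"), RPS("scissors")]
violations = transitivity_violations(pieces)
int_violations = transitivity_violations([1, 2, 3])

print(len(violations))      # 3: each rotation of rock < paper < scissors
print(int_violations)       # []: integers are ordered transitively
```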
Only if the data you are modelling is itself transitive. If it isn’t, then using non-transitive comparisons is fine.
Such comparisons model real-world relationships, e.g. people’s preferences, games (the best games are non-transitive), pecking orders and herd-animal hierarchies.
Transitivity is over-rated
The consequences of non-transitivity are a bit unusual, but that’s inherent in the situation. There is no unique “sorted” order for such non-transitive relationships, but the good news is that we can always find some order which obeys the relationship.
I haven’t proven it, but I guess that even plain old sorted() will give a result which obeys the non-transitive relationship locally. That is, after sorting your [Rock Paper Scissors] list, every pair of consecutive objects [a, b] will obey the relationship a <= b. Can anyone confirm that?
I haven’t either, but I’ve tested it.
import random, itertools

class RPS():
    num_val = {"rock": 0, "paper": 1, "scissors": 2}

    def __init__(self, tool):
        self.tool = tool

    def __lt__(self, other):
        return (RPS.num_val[self.tool] - RPS.num_val[other.tool]) % 3 == 2

    def __eq__(self, other):
        return self.tool == other.tool

    def __repr__(self):
        return f"RPS(\"{self.tool}\")"

rps = [RPS("rock"), RPS("paper"), RPS("scissors")]

for _ in range(100000):
    items = [random.choice(rps) for _ in range(100)]
    items.sort()
    # itertools.pairwise requires Python 3.10+
    for cur, nxt in itertools.pairwise(items):
        # an adjacent pair is out of order if the later item is "less than" the earlier
        if nxt < cur:
            raise Exception("Wrong order: %r, %r" % (cur, nxt))
No exception raised. Not mathematical proof though.
For an ecological example involving vegetation, there’s cyclic succession.
See:
… and you did.
We’d best omit the RPS example from the 100 Python Mistakes book.
@DavidMertz the thread is pretty old already but I was surprised to not see anyone mention usage of sys.path.append to “make imports work”. I was pretty convinced this was an anti-pattern that is pretty common in scripting but that I have too often seen creep up into actual software.
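To make the pattern concrete, here is a hedged sketch of what such code typically does. The directory name is hypothetical and does not need to exist; the snippet only shows that the change is process-wide and then undoes it.

```python
import sys
from pathlib import Path

# Hypothetical sibling directory that a script wants to import from.
hypothetical_lib = str(Path.cwd().parent / "hypothetical_lib_dir_for_demo")

before = list(sys.path)

# The pattern in question: patch the import path at runtime.
sys.path.append(hypothetical_lib)

# sys.path is shared by the whole process, so every later import in
# every module now also searches this directory.
added = [p for p in sys.path if p not in before]
print(added == [hypothetical_lib])   # True

# Undo the change so the demo leaves no trace.
sys.path.remove(hypothetical_lib)
```

The alternatives usually suggested are to make the project installable (for example with `pip install -e .`) or to run modules with `python -m package.module` from the project root, so that imports resolve without touching sys.path by hand.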