Ideas for a 100 Python Mistakes book

Does it matter? How often do you use the id and dir builtins? Who cares if you shadow them inside a short function?

One of the more interesting, and I mean that as a good thing, design choices of “Refactoring (Ruby Edition)” by Fields, Harvie and Fowler (that’s Martin Fowler) is that they will often recommend one refactoring technique, and then immediately recommend the opposite.

E.g. they have

  • Decompose Conditional
  • Recompose Conditional
  • Add Parameter
  • Remove Parameter

I think that shadowing is a good example of where this juxtaposition is helpful:

  • Don’t shadow builtins.
  • Don’t be afraid to shadow builtins.

Sometimes we intend to shadow builtins. Shadowing is not just a mistake; sometimes it’s a feature, and the mistake is to avoid it unnecessarily.

It is a mistake to shadow len or list in the top level of your module, where it has the potential to break your code in all sorts of places.

But it’s also a mistake to use an unclear or unnatural name as a local variable inside a short function merely to avoid shadowing a builtin you don’t care about.
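
A minimal sketch of that contrast, using made-up function names (`user_label` and `first_two` are illustrative only). The first function shadows `id` locally with no ill effect; the second shows how shadowing bites when the builtin is still needed in the same scope.

```python
def user_label(record):
    # Shadowing `id` here is harmless: the name lives only inside this
    # short function, and the builtin id() is never needed in it.
    id = record["id"]
    return f"user-{id}"


def first_two(list):
    # The parameter `list` hides the builtin, so calling list(...) in the
    # same scope tries to call the argument instead and fails.
    try:
        return list(list[:2])
    except TypeError as exc:
        return f"TypeError: {exc}"


label = user_label({"id": 7})     # "user-7"
problem = first_two([1, 2, 3])    # a "TypeError: ..." message string
still_builtin = callable(id)      # True: the builtin is untouched out here
```

The same failure at the top level of a module would affect every function in that module, which is why the wider scope is the riskier place to shadow.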

Why use a worse name just to silence some opinionated linter or colleague? :wink:

Rules like “don’t shadow builtins” exist so that you think before you break them.

A technique I picked up from older tutorials: if you really don’t want to shadow something that’s already defined, add a trailing underscore, as in len_ for len. I don’t use it that often, though.
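
A small sketch of the trailing-underscore convention (PEP 8 describes it for avoiding clashes with keywords, such as `class_`; applying it to builtins is the same idea). The function `longest` is a hypothetical example, not from any library.

```python
def longest(words):
    # `len_` holds the best length seen so far, leaving the builtin len()
    # usable inside the same function.
    len_ = 0
    best = None
    for word in words:
        if len(word) > len_:
            len_ = len(word)
            best = word
    return best, len_


result = longest(["a", "abc", "ab"])   # ("abc", 3)
```

Whether `len_` reads better than a descriptive name such as `best_length` is a judgment call; the underscore mainly helps when the builtin’s name really is the most natural one.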

+1. There are good reasons to shadow builtins. There are also good reasons to have your editor highlight builtins in a different colour, so you don’t unexpectedly find that you’ve shot yourself in the foot.

Spot on. For the mental model, knowing that arguments are passed to functions and methods by object reference is also important for understanding Python.

On forums connected with introductory Python tutorials, I have often seen participants assume that when they pass a variable to a function, the function receives a reference to the variable itself, and that the function should therefore be able to change the value of that external variable, for example through an assignment to the corresponding formal parameter. Consistent with that belief, when they observe that modifying a mutable object within the function can modify the external object referred to by the argument, they sometimes think it happened because the external variable itself was modified.
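
The two cases can be told apart with a short sketch (the function and variable names are my own, chosen for illustration). Assigning to the parameter only rebinds a local name; calling a mutating method changes the single object that both the caller and the function refer to.

```python
def rebind(items):
    # Assignment binds the local name `items` to a brand-new list.
    # The caller's name still refers to the original object.
    items = items + [99]
    return items


def mutate(items):
    # No assignment to `items`: this changes the one shared object in place.
    items.append(99)


numbers = [1, 2, 3]
rebound = rebind(numbers)
after_rebind = list(numbers)   # [1, 2, 3]: the caller's list is unchanged
mutate(numbers)
after_mutate = list(numbers)   # [1, 2, 3, 99]: same object, modified in place
```

In neither case was the caller’s variable `numbers` reassigned; it refers to the same list object throughout.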

If the book does contain a section on mistakes concerning references to, versus copies of, objects, that section would be especially interesting to beginners. In order to keep that component of the book’s audience happy, free of misconceptions, and ready to move forward, it would be helpful to explain at the start of that section, with the aid of a diagram and example code, the mechanism of pass-by-object-reference.

EDITED for clarification by using the phrase “assignment to the corresponding formal parameter”.

@Quercus great summary. It’s not necessarily an easy topic for a beginner book, but it is one that is really important, IMHO, for gaining competence. It’s also a different model from other languages (that I’d used in the past). I think it was a good post on SO that cleared it up for me, and I recall seeing some diagrams too; I will see if I can dig up the reference.

In the meantime, here’s a great example of all 100 mistakes in one script!

Then the concept must by necessity be introduced really early in the book - right on the cover, in fact. Instead of merely listing the name of the author there, have it state:

Written by a man named David Mertz

:grin:

You may have seen some diagrams here …

… along with …

“Hamlet was not written by Shakespeare; it was merely written by a man named Shakespeare.”

And “…proclivity toward double abstractions”. Gold :slight_smile:

These also may be useful:
https://nedbatchelder.com/text/names.html

In the answers on this one:

Overusing lambda. You’re almost always better with a list comprehension, generator expression, or something from the operator module.
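
A rough illustration of the alternatives, with sample data invented for the purpose. Each lambda version is followed by an equivalent using `operator` or a comprehension.

```python
from functools import reduce
from operator import itemgetter, mul

pairs = [("pear", 3), ("apple", 7), ("fig", 5)]

# Sorting by the second field: lambda versus operator.itemgetter.
by_count_lambda = sorted(pairs, key=lambda pair: pair[1])
by_count = sorted(pairs, key=itemgetter(1))

# Transforming each element: map + lambda versus a list comprehension.
doubled_lambda = list(map(lambda n: n * 2, [1, 2, 3]))
doubled = [n * 2 for n in [1, 2, 3]]

# Folding with a binary operation: lambda versus operator.mul.
product_lambda = reduce(lambda a, b: a * b, [1, 2, 3, 4])
product = reduce(mul, [1, 2, 3, 4])
```

The results are identical in each pair; the second form just names the operation instead of spelling it out inline.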

Overusing regexes. Python is not Perl (fortunately). Usually if things can be done using str methods instead, that’s a win.
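
As a hedged example, here is one parsing task done both ways; the log line is made up. The regex version works, but the `str` methods say what they do without a pattern to decode.

```python
import re

line = "  ERROR: disk full  "

# Regex version: check for the prefix, then strip it and the trailing spaces.
is_error_re = re.match(r"\s*ERROR:", line) is not None
message_re = re.sub(r"^\s*ERROR:\s*|\s+$", "", line)

# str-method version of the same task.
stripped = line.strip()
is_error = stripped.startswith("ERROR:")
_, _, rest = stripped.partition(":")
message = rest.strip()
```

Both versions yield the message `"disk full"`. Regexes remain the right tool once the pattern has real structure, such as alternation or repetition.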

Thanks for the links. All of them offer ideas that might be useful for the book.

Note that “Programming FAQ: How do I write a function with output parameters (call by reference)?” from the official Python documentation states:

Remember that arguments are passed by assignment in Python.

The object reference, of course, is the thing assigned, which makes that equivalent to saying that arguments are passed by object reference. But is a new learner likely to recognize that? Somewhere I saw argument passing in Python described as pass-by-value, with the value passed being the object reference, but I cannot remember where. The book could take on the challenge of explaining the varied terminology that has been used to describe the same process, ultimately standardizing on the term that describes it best, namely pass-by-object-reference.

I would hope that a new learner would read the next sentence after the one you quoted, which says:

Since assignment just creates references to objects…

and then goes on to give a very thorough explanation.

Here is the next sentence in its entirety; I’m not sure a beginner would understand all of the terminology in it:

Since assignment just creates references to objects, there’s no alias between an argument name in the caller and callee, and so no call-by-reference per se.

For example, the beginner might not know what an alias is. The very thorough explanation that follows it is also good, and I’m not critical of it. But a beginner would probably need some additional help in order to understand it.
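
For a beginner, the practical consequence may be easier to grasp than the terminology. Because the parameter name is not another name for the caller’s variable, a function that wants to hand back new values usually just returns them. A sketch, with a hypothetical helper `normalize`:

```python
def normalize(name, count):
    # Hypothetical helper: instead of trying to overwrite the caller's
    # variables, compute new values and return them as a tuple.
    return name.strip().lower(), count + 1


name, count = "  Ada ", 0
name, count = normalize(name, count)   # the caller rebinds its own names
```

After the call, `name` is `"ada"` and `count` is `1`, and it is the caller, not the function, that did the rebinding.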

These are so common I’m sure you have them, but I’ll list them anyway

  • Mutating an object you are iterating over
  • Expecting assignment to make a copy
  • Using a mutable default argument
  • Confusing list.append() with list.extend()
  • Expecting floating point numbers to be able to represent all decimals (possibly Python specific since decimal.Decimal exists)
  • Mutating a list rather than using a comprehension
  • Not understanding the difference between is and ==
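
A few of these fit in one short sketch. The names are illustrative, and each snippet is a minimal reproduction rather than a full treatment.

```python
from decimal import Decimal


def append_bad(item, bucket=[]):
    # The default list is created once, at definition time, and then shared
    # by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # Use None as a sentinel and build a fresh list on each call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


append_bad(1)
second_bad = append_bad(2)      # [1, 2]: state leaked from the first call
append_good(1)
second_good = append_good(2)    # [2]

# Mutating a list while iterating over it skips elements.
nums = [1, 2, 2, 3]
for n in nums:
    if n == 2:
        nums.remove(n)          # nums ends up as [1, 2, 3]: one 2 survives
filtered = [n for n in [1, 2, 2, 3] if n != 2]   # [1, 3]

# append adds one element; extend adds each element of an iterable.
appended = [1]
appended.append([2, 3])         # [1, [2, 3]]
extended = [1]
extended.extend([2, 3])         # [1, 2, 3]

# Binary floats cannot represent 0.1 exactly; Decimal can.
float_equal = (0.1 + 0.2 == 0.3)                                      # False
decimal_equal = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))   # True

# == compares values; `is` compares identity.
x, y = [1, 2], [1, 2]
same_value, same_object = (x == y), (x is y)     # True, False
```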

Thanks Matt.

I’ve looked at the Go title already, which was apparently very successful (and I agree that Teiva has done a very good job). I have not seen the Java one, although from your description it doesn’t seem like I’m missing a lot.

I’ve added you to acknowledgements, although I think everything you suggest is already in the TOC. The idea of “use the right library rather than rolling your own” is touched on a couple times in different “mistakes” (not ones I’ve actually written yet, but I have the topics).

I’m not going to do packaging. It’s too big, and there are too many opinions. But specifically saying that I’m not doing it is something I should add to the front matter, so thank you.

Thank you, but there’s no need to do this if it feels weird to do so. I was just chiming in because I like to. It’s fun. I’m not trying to claim credit for other people’s work.

I’m looking forward to seeing this book. I’ll keep an eye out for it!

In general, what you think is efficient in Pandas probably isn’t actually the best approach. And what you benchmark as efficient in Pandas N.m will probably be different in Pandas N.m+1. I’ve taught a lot of numeric Python, and this is… well, complicated.

I had all the others, but not “Mutating an object you are iterating over” … that’s worth adding. Thanks.

Don’t worry, you still won’t get actual royalties from being acknowledged. :-). It’s of limited benefit, but feels respectful. I hope I have everyone in this thread who has suggested anything helpful… even the things that were already in my TOC.

Ah, so now we have a Python mistake:

  • Thinking you understand pandas performance :wink:

/me resists the temptation to say that the mistake is using pandas :innocent: