Interesting topics for an advanced Python lecture

I was asked to give a lecture (with a presentation) to a group of people about Python.

This class already has a curriculum for Python, and it’s quite long and involves a lot of topics. I was asked to give an advanced lecture, which is not part of the curriculum, about anything I want.

This is not for people who have been developing in Python (or in general) for many years, so it’s not that advanced, but they’re quite sharp and it’s okay if not everyone understands everything thoroughly as long as they get the gist of it. I was also asked that it’d be at least somewhat practical for their use (e.g. I can’t do it about the JIT :frowning:)

Last time I did it about how asyncio works, using almost exclusively A. Jesse’s and Guido’s article as inspiration: 500 Lines or Less: A Web Crawler With asyncio Coroutines - but was told it’s a bit too complex for people to understand in just one hour.

I was thinking of doing it on how to leverage type annotations in Python, starting with a reminder about type annotations, showing cool concepts from typing (Optional, ABCs, ParamSpec, etc.), then moving on to runtime annotations using dataclasses and more advanced concepts like Annotated with pydantic.

I’m looking for more interesting subjects though. I specifically want things which might not be useful to put in a general Python tutorial, but are still cool. It can be about something in Python itself, about specific libraries (alembic came up), or anything else that is not actually part of Python. There was also an idea about showing dis and ceval a bit, but I deemed it not practical enough.

I thought this would be a great place to ask as there’s nowhere where people are more passionate about Python. Happy to hear any ideas you might have!

1 Like

Performance testing: what to do, what not to do, and how to interpret the results. This is something of a recurring topic; anyone can run something a million times and get some numbers, and then claim to have discovered something, but a lot of people don’t realise that (a) there are right and wrong ways to get those numbers, and if you do it wrong, your numbers are meaningless; (b) sometimes, even when you do everything right, your numbers are still meaningless - and you need to be able to know when that’s happened; and (c) how to share your results so that other people can understand them, verify them, and perhaps do something with them.

This is a HUGE topic, so a one-hour lecture probably won’t do more than scratch a bunch of surfaces, but there should be enough in there that people can really get an idea of what’s fast and what’s slow. For example, defining a function is a lot slower than adding two numbers - but how much slower? Is adding 1+1 in a loop actually doing what you think it does? [1] Is it faster to look something up from a dictionary more than once, or to put it into a variable and use that twice? And more importantly, is it enough faster to be worth writing that way?

Plenty to cover, and could be of great value to people!


  1. Almost certainly not, if you use two constants - which you can verify by replacing it with the number 2 in that same loop ↩︎
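The footnote’s claim is easy to check with the dis module — a quick sketch (CPython-specific; exact opcode names vary by version):

```python
import dis

# CPython folds "1 + 1" at compile time, so a loop doing x = 1 + 1
# just loads the constant 2 -- it never measures an addition at all.
instructions = list(dis.get_instructions(compile("x = 1 + 1", "<demo>", "exec")))
print([(i.opname, i.argval) for i in instructions])

# No BINARY_ADD / BINARY_OP appears; the folded constant is loaded directly:
assert not any("ADD" in i.opname or i.opname == "BINARY_OP" for i in instructions)
assert any(i.opname == "LOAD_CONST" and i.argval == 2 for i in instructions)
```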

7 Likes

Thank you so much for the idea!

I really appreciate your response, and was specifically hoping for you to respond as I know you’re really proficient with a lot of different areas in Python :slight_smile:.

I agree it’s a huge topic and want to gather some keywords and topics to see what I want to touch on and make sure I haven’t missed anything. I don’t really need resources to read, more like “search terms”.

I currently have:

  • cProfile
  • gc module
  • memray package
  • py-spy package
  • locust package
  • caching

Do you happen to have a few more topics off the top of your head? Not all of them have to be super relevant, just anything you think might be interesting and I’ll read more about it.

Thank you!

P.S. Still looking for more ideas, as I might do more than one of these lectures :slight_smile:

The biggest one would be the timeit module. You can do a lot just with that, without needing any third-party tools or anything.
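As a sketch of the kind of thing timeit makes easy — here applied to the dict-lookup question from upthread. Numbers will vary by machine; taking the minimum of several trials is the usual advice, since background noise only ever makes code slower:

```python
import timeit

# Is it faster to look a value up from a dict twice, or to bind it
# to a local variable first and reuse that?
setup = "d = {'key': 123}"
two_lookups = min(timeit.repeat("d['key'] + d['key']", setup=setup,
                                number=100_000, repeat=5))
bind_first = min(timeit.repeat("v = d['key']; v + v", setup=setup,
                               number=100_000, repeat=5))
print(f"two lookups:  {two_lookups:.4f}s")
print(f"bind + reuse: {bind_first:.4f}s")
```

Whether any difference is big enough to justify the rewrite is exactly the judgment call the lecture could teach.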

I’ve learned a lot about Python from studying these, and had fun introducing them to people:

  • The Descriptor protocol (how things like property work)
  • The Dunders™️ (you can make classes behave in so many different ways)
  • Decorators (function based, class based, parameterized, etc)
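For the first bullet, a minimal descriptor sketch — roughly the mechanism property() uses under the hood, with hypothetical names:

```python
class Positive:
    """A tiny data descriptor enforcing positive values, reusable
    across attributes thanks to __set_name__."""
    def __set_name__(self, owner, name):
        self.name = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.name)

    def __set__(self, obj, value):
        if value <= 0:
            raise ValueError(f"{self.name[1:]} must be positive")
        setattr(obj, self.name, value)


class Account:
    balance = Positive()

    def __init__(self, balance):
        self.balance = balance  # routed through Positive.__set__


acct = Account(100)
print(acct.balance)  # → 100
# Account(-5) would raise ValueError: balance must be positive
```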

I think Chris’ performance/profiling idea is much more practically useful. Also, things like descriptors, decorators, and other kinds of runtime “metaprogramming” tend not to play nicely with type annotations.

5 Likes

In addition to Chris’ suggestions for performance, I have discussed some performance resources in recent keynotes:

2 Likes

It could be interesting to do a lecture on generators/itertools, ways to combine the building blocks to produce efficient/lazy computation.
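For example, a lazy pipeline built from generator expressions and itertools, where each stage pulls values through on demand and no intermediate list is ever materialized — even though the source is infinite:

```python
import itertools

numbers = itertools.count(1)                   # 1, 2, 3, ... forever
squares = (n * n for n in numbers)             # lazy map
odd_squares = (s for s in squares if s % 2)    # lazy filter
print(list(itertools.islice(odd_squares, 5)))  # → [1, 9, 25, 49, 81]
```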

2 Likes

Ooh yes, particularly in the one situation where it’s obvious there’s no way you could use a regular list: infinite generators. Like, you can’t make a list of all primes, but you CAN make a generator that yields successive primes without limit.
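A minimal sketch of such a generator, using naive trial division (fine for a demo, not for serious number crunching):

```python
import itertools

def primes():
    """Yield successive primes without limit."""
    found = []
    for n in itertools.count(2):
        # n is prime iff no smaller prime divides it
        if all(n % p for p in found):
            found.append(n)
            yield n

print(list(itertools.islice(primes(), 10)))
# → [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```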

1 Like

Might not sound like much fun, but how about “101 common portability bogies”? It would focus on all the things people don’t realise when they’re coddled in their favourite IDEs on a single idealised dev machine, such as:

  • Using relative paths to locate files next to Python scripts (e.g. open("my-application-resource.txt"))
  • Anything along the lines of subprocess.run(["python", ...])
  • Using sys.stdout.write() without checking that sys.stdout is not None (breaks under windowed mode)
  • Using assert for error checking
  • Using regex on file paths
  • Using sets/dicts/string comparisons on file paths without normalizing them first
  • Using exception handling for the wrong reasons

These are all things that are never taught in schools but come up everywhere in real life.
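For a few of the bullets above, a sketch of the portable alternatives (the resource file name is hypothetical):

```python
import sys
from pathlib import Path

# 1. Resolve resources relative to the script, not the current
#    working directory -- open("my-application-resource.txt") breaks
#    the moment someone runs the script from another directory.
#    (The globals() guard keeps this working in a REPL too.)
here = Path(__file__).resolve().parent if "__file__" in globals() else Path.cwd()
resource = here / "my-application-resource.txt"

# 2. Re-invoke the *current* interpreter rather than whatever
#    "python" happens to be first on PATH (if anything):
cmd = [sys.executable, "-c", "print('hello')"]

# 3. Guard stdout, which is None under pythonw.exe (windowed mode):
if sys.stdout is not None:
    sys.stdout.write(f"resource path: {resource}\n")
```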

3 Likes

Considering that foundational topics such as unit testing have already been covered in the curriculum, I would like to propose the inclusion of metaprogramming as an advanced topic for the upcoming Python lecture. Metaprogramming is a powerful concept that allows for the dynamic creation and customization of code at runtime, offering deep insights into how objects are created and manipulated in Python.

To provide a comprehensive understanding of metaprogramming, I recommend utilizing these two resources as reference materials for the lecture:

  1. An insightful tutorial available on IBM Developer, which offers a thorough exploration of metaprogramming concepts within Python. You can access it here. This resource is particularly useful for understanding the theoretical aspects of metaprogramming and its practical applications in Python development.
  2. A video lecture by NextDayVideo, available on YouTube here. Despite being somewhat dated, this lecture remains highly relevant and provides a dynamic presentation of metaprogramming concepts. It serves as an excellent visual aid and can help students grasp the intricacies of metaprogramming through real-world examples and demonstrations.

Incorporating metaprogramming into the advanced Python lecture will not only enrich the curriculum but also empower students with the knowledge to leverage Python’s dynamic features more effectively. Understanding metaprogramming opens up new possibilities for code optimization, design patterns, and can significantly enhance the students’ ability to write clean, efficient, and maintainable code.
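As one possible concrete demo for such a lecture, a minimal metaclass sketch (all names hypothetical) that customizes class creation itself — here, registering every subclass as it is defined:

```python
class PluginMeta(type):
    """Metaclass that records every concrete subclass in a registry
    at class-creation time."""
    registry = {}

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        if bases:  # skip the abstract base itself
            PluginMeta.registry[name] = cls
        return cls


class Plugin(metaclass=PluginMeta):
    pass

class CsvExporter(Plugin): ...
class JsonExporter(Plugin): ...

print(sorted(PluginMeta.registry))  # → ['CsvExporter', 'JsonExporter']
```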

1 Like

It’s debatable how “advanced” this is, and I don’t know if it can fill a lecture, but there is a particular approach to interfaces and OOP I’ve seen pretty specifically in Python, and imho it’s useful to shine a light on:
Instead of fixing a single contract between caller and callee, have an intermediary mediate between a public contract for callers & separate contract(s) for implementations.

For example, you don’t call obj.__iter__() directly; you call the builtin iter(obj) (or use the builtin for syntax), which usually calls obj.__iter__() for you—but sometimes has other strategies, e.g. falling back to generating obj[0], obj[1], obj[2]… for sequence-like objects that didn’t implement __iter__.
Similarly, A+B is not simply syntax for “send A the + message” like in some languages; it’s shorthand for an algorithm (also exposed as operator.add(A, B)) that tries A.__add__(B) but also B.__radd__(A).
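Both fallbacks are easy to demonstrate (the class names here are hypothetical):

```python
class Squares:
    """Sequence-like: implements only __getitem__, no __iter__."""
    def __getitem__(self, i):
        if i >= 5:
            raise IndexError
        return i * i

# iter() falls back to subscripting with 0, 1, 2, ... until IndexError:
print(list(Squares()))   # → [0, 1, 4, 9, 16]


class Meters:
    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        # Tried after the left operand's __add__ returns NotImplemented
        return Meters(other + self.value)

total = 10 + Meters(5)   # int.__add__ can't handle Meters; __radd__ runs
print(total.value)       # → 15
```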

Python does this all over for syntax and builtins, and also some libraries. Over the years, it has proven useful for:

  1. Backward-compatible introduction of new contracts.
    E.g. IIRC the obj[0]... fallback was necessary to introduce the iteration protocol in the first place; before iterators were a thing, the for loop already supported custom sequence types by doing subscription with increasing indexes, and that behavior had to be retained — and wrapped in a new builtin <iterator object ...> type — to redefine for in terms of “calls iter()”. PEP

  2. Evolution of implementation contracts.
    E.g. obj.attr syntax aka getattr(obj, 'attr') interface remained fixed while new forms of customization were grown over the years (new-style classes, metaclasses, descriptors, __getattribute__ vs. __getattr__…).

  3. Evolution of public contracts.
    To follow the same iteration example: later, once the next(iterator) intermediary was added in preference to calling iterator.next() directly, it became easy to also offer next(iterator, default), mirroring existing conveniences like d.get(key, default) and getattr(obj, attr, default), etc.

    • That PEP chiefly wanted next() to paper over a Py3 breaking change in the implementation contract; with the benefit of hindsight I don’t know if that breakage was worth it :person_shrugging:, but either way it did also enable “Evolution of implementation contracts”. next() was backported to 2.6 to allow writing 2/3-agnostic callers.
      With the same hindsight, we can also say that this would have gone smoother if that intermediary had existed from the start, as a matter of consistent approach; but it was hard to justify a 2nd builtin before any concrete benefit was on the horizon.
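The convenience this enabled, shown concretely:

```python
it = iter([1, 2])
assert next(it) == 1
assert next(it) == 2
print(next(it, "done"))                      # → done, no StopIteration raised

# Mirroring the existing conveniences for dicts and attributes:
d = {"key": 1}
print(d.get("missing", "done"))              # → done
print(getattr(object(), "missing", "done"))  # → done
```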

I focused above on Python itself—builtins and syntax—which is less relevant as design guidance but is somewhat helpful framing for learning these corners of the language and its history.
But I feel the same approach has a place in libraries and in any situation where you design a protocol anticipating multiple implementations × multiple callers, and it’s worth teaching people after they’ve grasped the basic “duck typing” approach. Adding concrete examples is left as an exercise to the lecturer :wink:

3 Likes

It could be fun to have a lecture on examples of people abusing different things.

I see type hints used in fun ways on here… Thankfully not in my own code.

I’ve also seen __repr__ have confusing side effects (like exiting the program).

I’ve also seen a repo that called malloc and free in Python to get ctypes arrays (instead of allocating them the normal way via ctypes).

People assume __del__ gets called at expected times.

Sometimes people use stack frames to fetch locals up the stack.
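For illustration, the frame trick looks something like this — it works, but it invisibly couples a function to its call site, which is exactly why it confuses people:

```python
import sys

def caller_locals():
    """Reach one frame up the stack and read the caller's locals."""
    return sys._getframe(1).f_locals

def some_function():
    secret = 42
    return caller_locals()["secret"]

print(some_function())  # → 42
```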

I’ve seen atexit do some confusing things (like restarting Python itself).

Doing blind try/excepts without logging the error.

Some multi-threaded code assumes the GIL exists. It’s probably better to use a Lock, since one day hopefully we won’t need the GIL.

Monkeypatching the stdlib deep inside a library without the user asking for it.

Lots of other weird code things that people can think of.

I guess the point of the lecture is that Python has a lot of power: use it for good instead of confusing people. A lot of the time, classes try to only show the good. I reckon it may be good to show some of the bad to prepare them for the real world.

1 Like

I think the Python Data Model would be a good general topic. It’s really abstract, but graspable if you have plenty of programming experience. It’s good for ensuring you have the correct mental model of how Python works.
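A minimal example of the names-and-values part of that mental model — assignment never copies, it just binds another name to the same object:

```python
a = [1, 2, 3]
b = a                # b and a name the *same* list
b.append(4)
print(a)             # → [1, 2, 3, 4]
print(a is b)        # → True

# Rebinding a name doesn't touch the object or its other names:
b = [9]
print(a)             # → [1, 2, 3, 4]
```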

The docs: 3. Data model — Python 3.13.0 documentation

Ned Batchelder’s talk on this: Facts and myths about Python names and values | Ned Batchelder

A talk I gave: https://www.youtube.com/watch?v=EVBq1boGP6s

2 Likes