Collecting more feedback about contributing to Python

Have you tried contributing to the development of Python itself, or considered doing so? I’d like to hear your thoughts and experiences! I’m collecting such information to guide work during the upcoming core-dev sprint on making contribution easier and friendlier.

You can reach out publicly or privately. I’ll keep private stories to myself, only mentioning specific relevant points from them without mentioning who sent them.

Public stories will be added to the dedicated repo, which already includes many stories I have collected previously.

For more info on the core dev sprint, see the dedicated information website.


Tal, how would you like to hear thoughts?
Replies here? Github issues? A Google form?

For public stories, replies here work well. You can also send me direct private messages here.

You’re also welcome to create a PR for the contribution stories repo if you prefer.

Hi Tal,

I don’t know if I’ve missed the sprint, but I’d like to share my
experiences.

TL;DR:

I am a junior core dev who has been absent from active development (not
precisely by choice) for three or four years now. For reasons that will
be explained, I have to effectively learn (or relearn) all the processes
from (not quite from) scratch. In my experience, the barriers to being
able to contribute are much higher than they were when I wrote my first
patch for CPython.

(Or maybe I’m just too old and not sufficiently motivated to climb
that hill.)

Hi Steven, you haven’t missed the sprint, it begins next week.

I’d love to hear more details about your experiences, including specifically the barriers which seem higher today.

I can chip in on this one just a tad:

I once submitted a pull request against CPython, and a Windows build step failed.
But I was not allowed to see the logs for that step. So what could I do?

On one hand, automation and restrictive gating save developer time, but they are tough on newcomers: it’s hard to understand what’s going on and what could be wrong. For example, I still have no clue what all those different bots do or how to use them.

I think there’s also a vicious cycle when it comes to new versions.
(Sorry if I’m just too dumb to have figured it out on my own, I’m not asking for help)

  • Python 3.9 was in beta or rc phase
  • I wanted to test my contribution using user code
  • I’ve tested on Mac, and wanted to test on Linux, ergo Docker
  • User code needed a dependency, dep uses CI flow
  • CI was set up for 3.8 but not 3.9, because there were no official (smth build chain)
  • Pypa response was that ABI was not yet stable, so they would not provide (smth)

Having been essentially told “no” by multiple individuals with authority, I gave up on the idea.

Dima, thanks for sharing that experience!

Would you mind mentioning the specific PR so that I could get a better understanding?

Also, since you’ve posted this publicly, would it be okay if I added this to the contributor stories repo?

Sure go ahead and add it, the pull request was https://github.com/python/cpython/pull/19402
Today, if I try to view the failed test run, I get “The logs for this run have expired and are no longer available.”, which I guess is another problem, considering how long the entire PR/review process often takes!


What happened to the rest of my post???

I had a long and detailed explanation following the “TL;DR” and it’s
just disappeared. I wondered why Tal asked for more detail. I thought to
myself “How much more detail do you want?!?” but now I understand.

Grrrr. I shall try resending it.

Discuss seems to mangle emails something shocking. Deleting quoted text,
inconsistently mangling linebreaks, and now deleting almost my entire
post.

A thought comes to mind: I set off the longer discussion from the TL;DR
with a row of hyphens. Everything below the hyphens was deleted. Time
for an experiment: the next non-blank line will be five hyphens,
followed by a line of text. Let’s see if it is deleted.

Hi Tal,

I wondered why you asked for more detail. Here’s the detail which I
initially sent, but Discuss ate. (This time I won’t offset it with a row
of hyphens.)

When I first began contributing to CPython, the process was very
simple, simple enough that somebody with no professional programming
experience could get started:

  • I already had Python installed, including the .py source files.

  • I made a backup copy of the .py file, and edited the original.

  • If I messed up, I could revert to the backup.

  • Once I was happy with my edits, and the Python tests still passed, I
    looked up how to use diff from the command line to create a patch
    file, and uploaded that to b.p.o.

The hardest part was remembering which order the files should go for
diff: is it diff original changed or diff changed original?
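
For the record, the usual order is original first, changed second. Here is a minimal sketch of the backup-edit-diff workflow described above, assuming a POSIX shell and a `diff` that supports `-u`. The file names and contents are made up for illustration and stand in for a real stdlib module.

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical module standing in for an installed stdlib .py file.
printf 'def greet():\n    return "helo"\n' > example.py

# Make a backup copy, then edit the original.
cp example.py example.py.orig
printf 'def greet():\n    return "hello"\n' > example.py

# Original first, changed second. diff exits with status 1 when the
# files differ, so "|| true" keeps "set -e" from aborting the script.
diff -u example.py.orig example.py > fix.patch || true

cat fix.patch
```

With this order, lines removed from the original are prefixed with `-` and lines added by the edit are prefixed with `+`, which is what a reviewer expects to see.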

Now of course I appreciate that this was not so simple for the core
devs. They have to review the diff, apply it, confirm that the tests
still pass, etc. But as a contributor, the process was about as easy as
it is possible to get. The barrier to entry was close to zero for anyone
on a Linux system, as I was.

(I guess it may have been a little higher for Windows users, if they did
not have diff available.)

Since then, development moved to hg, then to git. Each change has led
to a significant increase in complexity, something which full-time
programmers may not even realise since they may be so familiar with the
process that they don’t have to think twice about it.

I am not a professional programmer. I have, occasionally, been paid to
program, but not for some years now and even then only as a very small
part of my duties. Shifting from “just submit a patch” to “use hg” was a
big jump in complexity for me, but I could understand the model and get
it to work.

I wasn’t an expert, but I was able to push through changes to the master
repo without seriously breaking anything, and after PEP 450 (statistics)
was accepted, I was given core dev permissions.

Just as I was getting comfortable with hg, two major things happened:

  • development moved to git;
  • and I got ill with a serious auto-immune disease.

Between the two, I lost all momentum. To make things harder for me,
Github stopped supporting my OS and browser and due to financial
difficulties I wasn’t able to upgrade my system until recently, so for
three or four years I was effectively incapable of contributing even if
I had the time and inclination.

Because of my long absence from making active contributions, I have
forgotten everything I knew about the process and have to relearn from
(not quite from) scratch.

I understand that CPython is now big enough and complex enough that we
cannot realistically go back to “just upload a diff”.

Even if I could upload diffs to b.p.o. for someone else to deal with, I
don’t want to be That Guy who won’t follow the standard process. I want
to contribute, and I want to pull my weight when I do so, not make more
work for others.

I expect that most contributors to CPython are professional or full-time
programmers who know git well enough that there are no significant
barriers for them. But for the rest of us, the amount of stuff you have
to do before you can contribute your first line of Python code seems to
be a lot bigger now than when I started. I’m not sure this is an
accurate or complete list:

  • install Python
  • install git
  • create a github account
  • create a b.p.o. account
  • set up a ssh key
  • configure github to accept it
  • fork the CPython repo
  • download your fork to your PC
  • fork your fork
  • make your changes
  • ensure the tests still pass
  • write a What’s New entry
  • update the docs
  • push your changes to your local repo
  • push them from your local repo to github
  • make a PR
  • make a b.p.o. issue
  • link the PR and the issue
  • sign a contributor agreement
  • wait for review
  • get the changes accepted

Have I missed anything?
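
The git portion of the list above can be sketched roughly as follows. This is only an illustration under stated assumptions: a local bare repository stands in for the GitHub fork, the branch name `fix-typo` and the file `hello.py` are hypothetical, and "fork your fork" is read here as "create a branch for the change". The `commit` helper exists only so the sketch runs without any global git configuration.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Helper (illustration only): commit with a throwaway identity.
commit() { git -c user.name=demo -c user.email=demo@example.invalid commit -q -m "$1"; }

# Stand-in for your GitHub fork: seed a repo, then make a bare copy of it.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo 'print("helo")' > hello.py
git add hello.py
commit "Initial commit"
cd ..
git clone -q --bare seed fork.git

# "download your fork to your PC"
git clone -q fork.git cpython
cd cpython

# "fork your fork": read here as creating a branch for the change
git checkout -q -b fix-typo

# "make your changes"
echo 'print("hello")' > hello.py

# "push your changes to your local repo": commit them
git add hello.py
commit "Fix typo in greeting"

# "push them from your local repo to github"
git push -q origin fix-typo
```

The remaining items on the list (accounts, ssh keys, the PR, the b.p.o. issue, the contributor agreement) happen on the web and are not covered by this sketch.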

Github is not as user-friendly for beginners as some git experts seem to
think. For example, I have a fork of the CPython repo dating back to the
initial change-over from hg. Last week I spent an hour trying to find
some way to update my fork to be up to date with the current version
before giving up. I’m sure that I will solve the problem, it’s a matter
of getting sufficiently motivated to put aside the time and energy
(another hour? two? five minutes? no way of knowing). But it’s another
barrier to getting productive.

Thank you for reading.

You may be aware of it already, but I periodically find myself referring to the devguide’s git boot camp, where it covers this case:

Scenario:

You forked the CPython repository some time ago.
Time passes.
There have been new commits made in the upstream CPython repository.
Your forked CPython repository is no longer up to date.
You now want to update your forked CPython repository to be the same as the upstream CPython repository.

Solution:

git checkout master
git pull upstream master
git push origin master
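
Note that these three commands assume a remote named `upstream` already exists alongside `origin`. The sketch below replays the whole scenario with throwaway local repositories, so it can be tried without touching GitHub. The directory names and the `commit` helper are made up for illustration. In a real checkout, `origin` would be your fork on GitHub and `upstream` the main CPython repository.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Helper (illustration only): commit with a throwaway identity.
commit() { git -c user.name=demo -c user.email=demo@example.invalid commit -q -m "$1"; }

# 1. A local stand-in for the upstream repository.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
echo one > file.txt
git add file.txt
commit "First upstream commit"
cd ..

# 2. "Fork" it (the bare clone plays the fork), then clone the fork.
git clone -q --bare upstream fork.git
git clone -q fork.git local
cd local

# 3. Register the upstream remote that the recipe relies on.
git remote add upstream "$work/upstream"

# 4. Time passes: upstream gains a commit that the fork lacks.
(cd "$work/upstream" && echo two >> file.txt && git add file.txt && commit "Second upstream commit")

# 5. The recipe itself.
git checkout -q master
git pull -q upstream master
git push -q origin master
```

Afterwards `git remote -v` in the `local` directory lists both remotes, and the fork's `master` points at the same commit as upstream's.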

As someone who only really learned git (beyond trivial use) within the last year or so, I find the devguide’s Git bootcamp fantastic for the more involved actions that I always seem to forget. I’ve actually referred to it a few times outside of CPython development, and have pointed others to it for learning how to use git when contributing to open source projects.
