How do you handle multiple CPython builds?

csabella · April 12, 2019, 12:23pm

I was asked in an email how to work on various GitHub release branches at the same time. For example, working with the master branch and a 3.6 branch. I have read snippets in other messages about how some individuals manage this, but I thought it might be helpful to ask in a central location to establish best practices. Hopefully this will generate some dialogue and we’ll all learn something.

Please note that I’m not asking about IDEs, editors, git workflow in general, etc. I’m just asking how you run multiple versions of Python (from source) simultaneously. If you want to talk about how you also manage multiple downloaded versions (or pre-installed versions) in addition to builds, then I think that can be helpful as well. OS might be helpful in this discussion.

Thanks!

zware · April 12, 2019, 1:56pm

I like to use git worktree for this (for anyone familiar with hg share, it’s basically the same thing). It essentially allows several checkouts of the same repository to share the object store, but no two worktrees are allowed to have the same branch checked out at the same time (though the same commit can be checked out under different branch names). In my view, this is the gold standard for running multiple versions from source at once, and is completely platform independent; I do the same on macOS, Linux, and Windows.

For example, I keep my checkouts under ~/code/github.com/python/cpython; the initial clone is in ./master (git clone https://github.com/python/cpython.git master), and from ./master I can run for branch in 3.7 3.6 2.7; do git worktree add ../$branch; done (though I would probably just add each one individually as I need it).
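
A minimal sketch of that layout, assuming a throwaway local repository in place of the real clone (the temporary directory, the empty commit, and the demo identity are stand-ins for illustration). Each branch is named explicitly so there is no doubt about what each worktree checks out:

```shell
set -e

# Stand-in for ~/code/github.com/python/cpython (hypothetical temp location).
demo=$(mktemp -d)
git init -q "$demo/master"
cd "$demo/master"

# One empty commit so that branches can exist; a real clone already has history.
git -c user.name=demo -c user.email=demo@example.invalid \
    commit -q --allow-empty -m "initial commit"

for branch in 3.7 3.6 2.7; do
    # Stand-in for the real maintenance branch.
    git branch "$branch"
    # Sibling checkout of that branch, next to ./master.
    git worktree add "../$branch" "$branch" >/dev/null 2>&1
done

# Lists the main checkout plus one worktree per branch.
git worktree list
```

Afterwards $demo/3.7, $demo/3.6 and $demo/2.7 sit next to $demo/master and share its object store.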

On the other hand, there’s not nearly as much need to keep individual checkouts for each branch anymore, unlike in our old hg workflow. Now it’s simple enough to run make distclean (or git clean -fdx if there’s nothing untracked that you want to keep), check out the branch you need, and rebuild afresh on the occasions when a non-master build is needed. This doesn’t help with actually running two versions simultaneously if that’s what you need, though.

njs · April 12, 2019, 6:41pm

Many people don’t realize that Python’s build system on Unix allows separating the build directory from the source directory. I always do builds in a separate directory, both for general cleanliness (keeps source files and build artifacts separate; makes it harder to accidentally mix artifacts from different builds), and because it lets you have multiple independent builds simultaneously.

For example:

$ mkdir my-build-dir
$ cd my-build-dir
$ ../configure <options here>
$ make

The trick is that whatever directory you’re in when you run configure becomes the build dir, which is where you run make and where all the build artifacts end up. This doesn’t have to be the source directory; it can be any directory.

(This actually works for most projects that use the classic configure; make pattern, not just Python.)

The build dir does refer back to the source tree when building; this doesn’t give you independent copies of the source files. The ideal case is when you have multiple builds with different configure settings (e.g. different compiler flags). If you want to use this to work on multiple branches simultaneously, it can be a bit awkward: you have to make sure you only run make in your py36 build dir when you have the py36 branch checked out in the source tree, but there’s nothing that enforces this. @zware’s solution is probably less error-prone for this case. But it can work, and you can also combine the two approaches.
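
As a rough sketch of the multiple-configurations case, a small helper can create one build dir per set of configure options. The function name configure_build and the example paths are hypothetical; --with-pydebug and --enable-optimizations are CPython configure options, and other projects will have their own:

```shell
# configure_build SRC BUILD [configure options...]
# Runs SRC/configure from inside BUILD, so BUILD becomes the build dir.
# The body runs in a subshell, so the caller's working directory is unchanged.
configure_build() (
    src=$(cd "$1" && pwd) || exit 1   # absolute path to the source tree
    build=$2
    shift 2                            # the remaining arguments go to configure
    mkdir -p "$build" && cd "$build" && "$src/configure" "$@"
)

# Example (hypothetical paths, not run here): two independent builds of one tree.
#   configure_build ~/cpython ~/builds/debug --with-pydebug
#   configure_build ~/cpython ~/builds/pgo   --enable-optimizations
#   make -C ~/builds/debug
#   make -C ~/builds/pgo
```

Both build dirs still read the same source files, so the caveat about having the right branch checked out applies unchanged.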

nanjekyejoannah · April 12, 2019, 11:00pm

Thanks @zware and @njs that was very helpful. Thanks @csabella for the bail out.

gpshead · April 14, 2019, 3:52am

I haven’t really gone beyond the multiple build dirs approach myself. I regularly wipe out build dirs if I’m unaware of their state as compile times are so fast today. The git worktree sounds intriguing, will try! Thanks for the idea.

njs · April 14, 2019, 5:07am

Be careful, or the next patch you write will turn out to need before/after benchmarks with PGO enabled…

csabella · April 14, 2019, 11:17am

Thanks @zware and @njs for the great info!

matrixise · April 15, 2019, 10:16pm

Hi Cheryl,

Sorry, I was on holiday last week and was “offline”.

Here is my workflow and I would like to improve it with scripts.

Download the source

$ git clone https://github.com/python/cpython python/cpython

Define the remote locations

$ cd python/cpython
$ git remote rename origin upstream
$ git remote add origin git@github.com:matrixise/cpython.git

Use a worktree for a specific branch or PR

for a specific branch

$ git worktree add ../cpython-3.7 upstream/3.7

for a PR

I have this alias

wpr = !bash -c "git fetch upstream pull/${1}/head:pr_${1} && git worktree add ../$(basename $(git rev-parse --show-toplevel))-pr-${1} pr_${1}" -

$ git wpr 12848    # a PR on github
$ cd ../cpython-pr-12848

In this directory, I use my compile.fish script

#!/usr/bin/env fish

set number_of_cpu (python3 -c "import os; print(os.sysconf('SC_NPROCESSORS_ONLN'))")

./configure --prefix=$PWD-build --with-pydebug --silent -C
make CC="/usr/bin/ccache gcc" --jobs $number_of_cpu --silent

The result will go into $PWD-build -> eval $PWD-build/bin/python
(because I use fish as shell)

./configure -C will use the cache of autoconf
I use ccache for the compilation, really useful just after a make distclean
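
For those not using fish, a rough POSIX sh equivalent might look like this. It is only a sketch: the function names cpu_count and build_here are made up here, and getconf _NPROCESSORS_ONLN is assumed to be available (it is on Linux and macOS):

```shell
# Number of online CPUs; falls back to 1 if getconf cannot report it.
cpu_count() {
    getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
}

# Same configure and make invocations as compile.fish, to be run from the
# top of a CPython checkout or worktree.
build_here() {
    ./configure --prefix="$PWD-build" --with-pydebug --silent -C &&
    make CC="/usr/bin/ccache gcc" --jobs "$(cpu_count)" --silent
}

echo "make would run with $(cpu_count) jobs"
```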

I am thinking of documenting this, because these commands could be useful for
new contributors.

Cheers,

Stéphane

vstinner · April 16, 2019, 3:04pm

I like to use a different directory per Python version to benefit from incremental compilation (make doesn’t have to rebuild everything) when I switch to a different branch. Like Zachary, I use git worktree:

vstinner@apu$ git worktree list
/home/vstinner/prog/python/master  2f5b44879f [master]
/home/vstinner/prog/python/2.7     44a2c4aaf2 [2.7]
/home/vstinner/prog/python/3.4     e76cbc7810 [3.4]
/home/vstinner/prog/python/3.5     2bb327816d [3.5]
/home/vstinner/prog/python/3.6     4508bc37dd [3.6]
/home/vstinner/prog/python/3.7     cd46b09b08 [3.7]

I have a prog/python/update.sh script to update my checkout and move back to the “right” branch (for example, if I was in a local branch in master/, it goes back to master branch):

$ cat update.sh 
set -x -e

cd ~/prog/python/master
git fetch upstream --tags --prune
git fetch origin --prune

for branch in master 3.7 3.6 2.7; do
    cd ~/prog/python/$branch
    git checkout $branch
    git merge --ff
done

I also have a prog/python/all_make.sh script to recompile all Pythons:

set -e -x
(cd master && make) || exit $?
for version in 3.7 3.6 2.7; do
    (cd $version && cp -f Modules/Setup.dist Modules/Setup && make) || exit $?
done
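
If the list of directories should not be hard-coded, one option is to ask git for it. The sketch below assumes the git worktree list --porcelain output format, where each entry starts with a line of the form "worktree <path>"; the helper name worktree_paths is hypothetical. It leaves out the Modules/Setup copy that the older branches need above:

```shell
# Reads `git worktree list --porcelain` output on stdin and prints one
# worktree path per line.
worktree_paths() {
    sed -n 's/^worktree //p'
}

# Example use (not run here): rebuild every worktree, stopping at the
# first failure.
#   git worktree list --porcelain | worktree_paths |
#   while IFS= read -r dir; do
#       make -C "$dir" || break
#   done
```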

Example to manually cherry-pick a change into 3.7:

$ cd 3.7
$ git checkout -b myfix37
Switched to a new branch 'myfix37'
$ git cherry-pick -x SHA1
$ ...
$ make && ./python -m test ...
$ gh_pr.sh

gh_pr.sh is another hackish script to create a PR for Python:

set -e -x
if [[ "$1" = "-f" ]]; then
    force=-f
else
    force=
fi
local_branch=$(git name-rev --name-only HEAD)

project="$(basename "$PWD")"
ref_branch=master

case "$project" in
    [23].[0-9])
        ref_branch=$project
        project=cpython
        ;;
    master)
        ref_branch=$project
        project=cpython
        ;;
esac

echo "branches: $local_branch -> $ref_branch"

git push origin HEAD $force
URL="https://github.com/python/$project/compare/$ref_branch...vstinner:$local_branch?expand=1"
python3 -m webbrowser "$URL"
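
The directory-name logic can be read in isolation as two small functions. This is a simplified sketch rather than the script itself: ref_branch_for and compare_url are hypothetical names, the project is fixed to cpython, and the fork owner is passed in as an argument instead of being hard-coded:

```shell
# Base branch implied by a worktree directory name: version-like names and
# "master" map to themselves, anything else falls back to master.
ref_branch_for() {
    case "$1" in
        [23].[0-9]|master) echo "$1" ;;
        *) echo master ;;
    esac
}

# compare_url BASE_BRANCH FORK_OWNER LOCAL_BRANCH
# Builds the GitHub compare URL that opens the "create pull request" page.
compare_url() {
    echo "https://github.com/python/cpython/compare/$1...$2:$3?expand=1"
}

# Example: a fix prepared in the 3.7 worktree on local branch myfix37.
compare_url "$(ref_branch_for 3.7)" vstinner myfix37
```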

gh_pr.sh requires that the Git “origin” remote is my fork:

$ git remote -v
origin	git@github.com:vstinner/cpython.git (fetch)
origin	git@github.com:vstinner/cpython.git (push)
upstream	git@github.com:python/cpython.git (fetch)
upstream	git@github.com:python/cpython.git (push)

vstinner · April 16, 2019, 3:05pm

I used that in the past, but switching between Python branches changes file modification times, so some files have to be recompiled just because of that. I don’t have this issue with git worktree. Well, use whatever works for you; I don’t think that one workflow is better than another.
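
The effect is easy to reproduce in a scratch repository. This sketch assumes nothing about CPython: the repository, the file name, and the branch name other are made up, and find -newer stands in for the timestamp comparison that make performs:

```shell
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
# Wrapper that supplies a throwaway identity for the demo commits.
g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }

echo "version A" > file.c
g add file.c
g commit -q -m "A"
base=$(git rev-parse --abbrev-ref HEAD)

# A second branch where the same file has different content.
g checkout -q -b other
echo "version B, with different content" > file.c
g commit -q -a -m "B"
g checkout -q "$base"

touch marker      # reference timestamp, playing the role of an up-to-date object file
sleep 1
g checkout -q other
g checkout -q "$base"   # back where we started, content unchanged

# file.c has its original content again but a newer timestamp than marker.
rebuilt=$(find file.c -newer marker)
echo "files newer than the marker: ${rebuilt:-none}"
```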

Seluj78 · April 16, 2019, 3:20pm

You’ll be happy to know, @vstinner, that a Python tool now exists to create a PR from your command line! Meet PyHub_PR

vstinner · April 16, 2019, 3:21pm

It took me time to design my workflow. I’m not interested in changing it: it works perfectly well and it’s efficient.

Seluj78 · April 16, 2019, 3:26pm

Haha, fair enough. I just wanted to mention it because I wrote it and thought it could be useful for you.

matrixise · April 16, 2019, 4:37pm

You also have https://github.com/Mergifyio/git-pull-request

njs · April 16, 2019, 5:40pm

It’s not implemented in Python, but GitHub’s own hub command includes a git pull-request command, as well as many other conveniences.