Now you can test a PR with the buildbots before merging!

Testing PRs with buildbots :robot:

After some work (and updating the buildbot server to Python3.8), I am happy to announce that is now possible to test a Pull Request with the buildbot fleet on demand by just by adding a label (:hammer: test-with-buildbots):


When you add this label to an existing Pull Request:

  • A build will be immediately triggered in a selected subset of the stable buildbots. Every buildbot will report to the PR the build status (like Travis, Azure Pipelines…etc). This is what you will see:

  • As long as the label is in the Pull Request every new commit to the pr will be tested with the buildbots in the same way that Travis or Azure Pipelines tests every commit that is pushed to the PR. As requested, the label will be removed once a build is scheduled. If you want to test again, you should add the label once more.

  • Clicking the “Details” link in the GitHub status will bring you to the builder page if you want to read logs or see the build live.

  • The checks are not required to merge the PR. This means that if you decide so, you can proceed and merge the PR even if a buildbot is failing (for example, if the failure is unrelated to the PR).

In the buildbot page you can distinguish all the builders that are building PRs by a new Pull Request label:

When to use the new label :man_student:

Using the buildbots before merging will be especially useful in the following situations:

  • You are reviewing a big C PR and you want to make sure that there are no refleaks by using the refleak-detection builders. These builders take more time but will tell you if the PR has some sneaky reference leaks. :medal_sports:

  • You want to make sure that the PR builds on all platforms. We have a big collection of platforms (various versions of Windows, macOS, linux distros) and configurations (debug, non-debug, PGO/LTO, LTO, clang sanitizers…) and if you want to make sure that a PR that you suspect may behave differently by platform (like adding new POSIX functions) will not break later, is a good practice to test with the buildbots first.

  • Because you don’t want us to suffer afterwards if your PR fails on some obscure platform :wink:


  • Each individual builder has a limited amount of PRs that can be tested simultaneously (depends on the builder) so it may take a while to start building your PR. So please, be patient! You can check the status of all builders in the buildbot page :mantelpiece_clock:

  • Take into account that the buildbots are donated machines, so remember to be a good citizen and double-check the code of the PR before adding the label.

This new addition should allow you to gain confidence when merging a PR with minimum impact in your existing workflow and effort. If you have any feedback or feature request, go ahead and tell us :slight_smile:


Hmm… the label can only be added by core developers, right?

Who is allowed to add this label and trigger builds? PRs are arbitrary unreviewed code from the internet running on the LANs of the buildbots, I’d prefer to limit the ability to trigger such runs to core devs (committers).

You posted this in Committers so I’m assuming the feature is limited to us? If so, great!

It can be added by core-devs and the Cpython GitHub triage team (the same thing as the automerge label). @brettcannon may correct me if I am wrong.

Unless I am missing something, only core-devs and the triage team (the contributors we have decided to allow add labels - same as with automerge).

If we want more strict permissions we can implement them in the future.


This sounds like once the label is added, the PR author can make any code run on the bots.

Theoretically yes as long as you leave the label in the PR. If you don’t trust the author you can put the label, wait for the confirmation (the buildbots start to build) and then remove the label. If you don’t mind leaving the label because you trust the author you can do it: is up to the core dev.

Notice that this situation is similar to ‘automerge’. Someone could push new commits after you added the label and before the required test finishes.

Also, you will receive a notification if the author pushes a new commit. If we find that still a concern, I can force the server to remove the label as soon as it has triggered the first build so you would need to add it every time you want to have a new build of that PR, but this sounds very cumbersome.

Given the scarcity of resources, I think I’d prefer this anyway. We should always make sure regular CI passes before pushing to the buildbots, and the manual step will make that more likely.

1 Like

Fair enough :+1: . I will update the server soon then to make the step manual (the label is removed once the builds are scheduled).

My earlier post (from 1 AM) wasn’t ideal, let me try again.

This is awesome to have! Developing is so much better when I can trust the machine to do the boring bits. Thank you very much for your work on the buildbots and the CI in general!

I am worried that the security implications need more thought, though. As @gpshead mentioned, buildbot operators could so far rely on only running code approved by a core dev, and I imagine they manage their machines’ security with that in mind. I think we need to ensure it stays that way, or communicate the change with all the buildbot operators.

Could we have a review from a security expert? Is there, for example, a race condition where a malicious actor could sneak in a commit right after the label is added? Or something else that can go wrong when the label gets used routinely?


Thanks @encukou for your comments! This is of course a reasonable concern. Notice that although I think we should make sure that the buildbots are secure as possible, we have this comment in the devguide already:

We only allow builds to be triggered against commits to the CPython repository on GitHub. This means that the code your buildbot will run will have been vetted by a committer. However, mistakes and bugs happen, as could a compromise, so keep this in mind when siting your buildbot on your network and establishing the security around it. Treat the buildbot like you would any resource that is public facing and might get hacked (use a VM and/or jail/chroot/solaris zone, put it in a DMZ, etc). While the buildbot does not have any ports open for inbound traffic (and is not public facing in that sense), committer mistakes do happen, and security flaws are discovered in both released and unreleased code, so treating the buildbot as if it were fully public facing is a good policy.

The code can be audited at The webhook is encrypted and uses a secret between GitHub and the server so it cannot be simulated in a way that third parties can trick the server into believing that a webhook has been triggered.

Regarding race conditions, I would not claim that they are not (in case I am missing something) but the way it works is that when you add the label, it sends and event to the server and the commit is extracted from that event and the label is removed before the build is triggered, so in theory there is no window for pushing something in the middle. Also only adding the label will trigger the event so even if it fails to remove the label afterwards this won’t trigger more builds. The code being tested is always the code at the exact moment a core Dev adds the label.

I hope this apports some insight into how it works and the security considerations. Thanks for commenting about this, as discussing these matters is very useful so we can know that we are not missing anything and everyone is comfortable with the solution :slight_smile:


That’s a great enhancement to help to reduce regressions on buildbots (after a PR is merged)! For example, it will finally allow to test on FreeBSD before merging! That’s really useful to system and network programming which is usually platform dependent.

I agree that launching tests on buildbots should be a one shot action. If the PR is updated, the buildbots should not run the updated PR. It’s to avoid abusing buildbot resources, but also for security reasons.

Thanks Pablo for implementing this cool feature!

The Refleak buildbot workers usually take between 2 and 3 hours for one build. Are they always run? If this label starts to be commonly used, maybe we need different flavors of buildbots:

  • Quick sanity checks (fastest): smallest set of buildbot workers
  • Complete checks (slowest): include slowest workers, especially Refleaks
  • I’m not sure about the need of a 3rd category

Anyway, we can decide that later, when we will have more experience with this new tool.


This is a much needed feature, thank you very much! I was planning to implement it myself at some point, but I’m glad you beat me to it.

1 Like

Awesome new feature @pablogsal thank you! Is this documented in the devguide?

1 Like

Thanks! Is not yet documented as I want to wait some time until people can play with the feature before making a guide :slight_smile:

1 Like

No need for a correction. :wink: You can see what permissions the various levels have and for changing labels you need triage or higher. IOW even PR authors can’t add labels to their own PRs so as long as you trust the triagers and higher we are fine.