After some work (and updating the buildbot server to Python3.8), I am happy to announce that is now possible to test a Pull Request with the buildbot fleet on demand by just by adding a label ( test-with-buildbots):
When you add this label to an existing Pull Request:
A build will be immediately triggered in a selected subset of the stable buildbots. Every buildbot will report to the PR the build status (like Travis, Azure Pipelinesā¦etc). This is what you will see:
As long as the label is in the Pull Request every new commit to the pr will be tested with the buildbots in the same way that Travis or Azure Pipelines tests every commit that is pushed to the PR. As requested, the label will be removed once a build is scheduled. If you want to test again, you should add the label once more.
Clicking the āDetailsā link in the GitHub status will bring you to the builder page if you want to read logs or see the build live.
The checks are not required to merge the PR. This means that if you decide so, you can proceed and merge the PR even if a buildbot is failing (for example, if the failure is unrelated to the PR).
In the buildbot page you can distinguish all the builders that are building PRs by a new Pull Request label:
Using the buildbots before merging will be especially useful in the following situations:
You are reviewing a big C PR and you want to make sure that there are no refleaks by using the refleak-detection builders. These builders take more time but will tell you if the PR has some sneaky reference leaks.
You want to make sure that the PR builds on all platforms. We have a big collection of platforms (various versions of Windows, macOS, linux distros) and configurations (debug, non-debug, PGO/LTO, LTO, clang sanitizersā¦) and if you want to make sure that a PR that you suspect may behave differently by platform (like adding new POSIX functions) will not break later, is a good practice to test with the buildbots first.
Because you donāt want us to suffer afterwards if your PR fails on some obscure platform
Notes
Each individual builder has a limited amount of PRs that can be tested simultaneously (depends on the builder) so it may take a while to start building your PR. So please, be patient! You can check the status of all builders in the buildbot page
Take into account that the buildbots are donated machines, so remember to be a good citizen and double-check the code of the PR before adding the label.
This new addition should allow you to gain confidence when merging a PR with minimum impact in your existing workflow and effort. If you have any feedback or feature request, go ahead and tell us
Who is allowed to add this label and trigger builds? PRs are arbitrary unreviewed code from the internet running on the LANs of the buildbots, Iād prefer to limit the ability to trigger such runs to core devs (committers).
You posted this in Committers so Iām assuming the feature is limited to us? If so, great!
Theoretically yes as long as you leave the label in the PR. If you donāt trust the author you can put the label, wait for the confirmation (the buildbots start to build) and then remove the label. If you donāt mind leaving the label because you trust the author you can do it: is up to the core dev.
Notice that this situation is similar to āautomergeā. Someone could push new commits after you added the label and before the required test finishes.
Also, you will receive a notification if the author pushes a new commit. If we find that still a concern, I can force the server to remove the label as soon as it has triggered the first build so you would need to add it every time you want to have a new build of that PR, but this sounds very cumbersome.
Given the scarcity of resources, I think Iād prefer this anyway. We should always make sure regular CI passes before pushing to the buildbots, and the manual step will make that more likely.
My earlier post (from 1 AM) wasnāt ideal, let me try again.
This is awesome to have! Developing is so much better when I can trust the machine to do the boring bits. Thank you very much for your work on the buildbots and the CI in general!
I am worried that the security implications need more thought, though. As @gpshead mentioned, buildbot operators could so far rely on only running code approved by a core dev, and I imagine they manage their machinesā security with that in mind. I think we need to ensure it stays that way, or communicate the change with all the buildbot operators.
Could we have a review from a security expert? Is there, for example, a race condition where a malicious actor could sneak in a commit right after the label is added? Or something else that can go wrong when the label gets used routinely?
Thanks @encukou for your comments! This is of course a reasonable concern. Notice that although I think we should make sure that the buildbots are secure as possible, we have this comment in the devguide already:
We only allow builds to be triggered against commits to the CPython repository on GitHub. This means that the code your buildbot will run will have been vetted by a committer. However, mistakes and bugs happen, as could a compromise, so keep this in mind when siting your buildbot on your network and establishing the security around it. Treat the buildbot like you would any resource that is public facing and might get hacked (use a VM and/or jail/chroot/solaris zone, put it in a DMZ, etc). While the buildbot does not have any ports open for inbound traffic (and is not public facing in that sense), committer mistakes do happen, and security flaws are discovered in both released and unreleased code, so treating the buildbot as if it were fully public facing is a good policy.
The code can be audited at GitHub - python/buildmaster-config: Configuration for buildbot.python.org. The webhook is encrypted and uses a secret between GitHub and the server so it cannot be simulated in a way that third parties can trick the server into believing that a webhook has been triggered.
Regarding race conditions, I would not claim that they are not (in case I am missing something) but the way it works is that when you add the label, it sends and event to the server and the commit is extracted from that event and the label is removed before the build is triggered, so in theory there is no window for pushing something in the middle. Also only adding the label will trigger the event so even if it fails to remove the label afterwards this wonāt trigger more builds. The code being tested is always the code at the exact moment a core Dev adds the label.
I hope this apports some insight into how it works and the security considerations. Thanks for commenting about this, as discussing these matters is very useful so we can know that we are not missing anything and everyone is comfortable with the solution
Thatās a great enhancement to help to reduce regressions on buildbots (after a PR is merged)! For example, it will finally allow to test on FreeBSD before merging! Thatās really useful to system and network programming which is usually platform dependent.
I agree that launching tests on buildbots should be a one shot action. If the PR is updated, the buildbots should not run the updated PR. Itās to avoid abusing buildbot resources, but also for security reasons.
Thanks Pablo for implementing this cool feature!
The Refleak buildbot workers usually take between 2 and 3 hours for one build. Are they always run? If this label starts to be commonly used, maybe we need different flavors of buildbots:
Quick sanity checks (fastest): smallest set of buildbot workers
Complete checks (slowest): include slowest workers, especially Refleaks
Iām not sure about the need of a 3rd category
Anyway, we can decide that later, when we will have more experience with this new tool.
No need for a correction. You can see what permissions the various levels have and for changing labels you need triage or higher. IOW even PR authors canāt add labels to their own PRs so as long as you trust the triagers and higher we are fine.