FreeBSD gets a new "Cirrus-CI" GitHub Action job and a new buildbot

Hi,

Kubilay Kocak was kind enough to provide us with two FreeBSD buildbot workers for many years: FreeBSD STABLE and CURRENT. Last week, he removed his workers. Thanks Kubilay!

I contacted some FreeBSD developers and things are moving quickly! test_gdb and test_tempfile got fixed on FreeBSD. By the way, macOS should benefit from the test_gdb fix: the test was always skipped on macOS when Python was built with clang. test_tarfile has a pending fix.

Zachary Ware (@zware) just added a new FreeBSD 13.2 buildbot worker: FreeBSD 13.2-RELEASE amd64 4 vCPU VM, 4GB RAM. The last failing test is test_tarfile (again, it has a pending fix).

Ed Maste just added a new Cirrus-CI job to GitHub Actions: FreeBSD 13.2 amd64 on the Python main branch. The job is “non voting” (not mandatory). See the .cirrus.yml configuration file. This platform runs on Google Cloud Platform (GCP). The surprising part is that tests are run as root (uid 0, gid 0)! It allowed Ed to work around an issue with the FreeBSD GCP image: TCP blackhole is enabled. The job just runs sysctl net.inet.tcp.blackhole=0, a command which requires root!

By the way, thanks Ee Durbin for enabling (quickly!) the Cirrus-CI platform in the GitHub cpython project!

For now, only the stable FreeBSD version 13.2 is tested and only on the Python main branch. There are ongoing discussions to add FreeBSD CURRENT (FreeBSD 14 and/or FreeBSD 15) buildbot workers. Later, I would also be interested in supporting “old stable” versions like FreeBSD 12 (EOL at the end of the year). But only if it doesn’t give me too much maintenance work :slight_smile:

Depending on the progress, I may consider enabling the Cirrus CI on Python 3.11 and 3.12 branches. The buildbot worker is already running on these branches.

By the way, if a core developer is interested in helping me fix FreeBSD-specific issues, contact me! It would be interesting to consider promoting FreeBSD to a Tier-2 platform!

Thanks to everybody who helped me to make all of this possible :wink:

If you have any issues with FreeBSD, you can contact me (or just reply below).

Victor

Just running against -RELEASE (even -STABLE) isn’t good enough. While -CURRENT churn hasn’t affected CPython too much, there is always an elevated risk that some API, ABI, compiler, etc. change that breaks third-party software doesn’t get noticed until “too late” because fewer of us test and dogfood on -CURRENT. The goal should always be -CURRENT and supported -STABLEs and -RELEASEs.

As for 3.12, 3.11 and other branches, such discussions also need to touch upon which (series of) commits constitute new features or fixes. Speaking as a member of the Python ports team, there are certain commits and patches that we as a (in PEP terms) distro should not (even be thinking about) carrying; they should be part of (point) releases.

Python has limited human resources. Testing more platforms requires more eyes and hands to look into these issues. So yeah, it would be nice, but please also understand that it’s best-effort support :wink:

Thanks for pushing this forward, Victor!

Same with FreeBSD. Even fewer of us on the FreeBSD side have any sort of support, particularly infrastructure or monetary, to help development and testing along. I’m sure this is at least partially why koobs@ stepped away.

Those are not “more platforms”, and especially not when physical architectures never entered this discussion.

Furthermore, some of us on the FreeBSD side like myself develop against and actually run -CURRENT as our primary environment, partially so we can catch even the most minor stuff. Many items that land in -CURRENT cherry-pick to (existing) -STABLE and eventually the next -RELEASE; by the time a problem is found during the latter two stages it is far more difficult and sanity-consuming to trace and fix.

If you would like to provide stable buildbot workers, we’d be happy to have them :slight_smile:

I wasn’t expecting a tier 3 platform to be in our CI. Was there a discussion about putting tier 3 platforms in our CI as a status check? The reason I bring it up is I at least always check why a status check fails, required or not. So to me, the check isn’t free even if it isn’t required. But if there was a discussion I missed then no worries!

The explanation of -CURRENT says, “users are expected to have a high degree of technical skill”. We are not FreeBSD users, so asking us to run in-development code of FreeBSD is asking a lot, as we are then potentially debugging FreeBSD instead of Python itself. As Zach pointed out, a stable buildbot for -CURRENT would be a better way to notice issues for Python on -CURRENT than putting it directly in our CI, where people implicitly check any failing status check.

I agree with this. I don’t think we should have any non-mandatory CI jobs running on GHA, unless there is a way to also exclude them from the “Some checks were not successful” red status, which also shows up as a red X instead of a green checkmark at various places in the UI. I rely on that red X vs green checkmark to quickly tell whether a PR is in good shape or needs work; it seems like the freebsd GHA is currently failing and will cause all PRs to show as “some checks failing” – see e.g. gh-108732: include comprehension locals in frame.f_locals by carljm · Pull Request #109026 · python/cpython · GitHub

Basically: I think it is important that all CI jobs that run on every PR are kept passing on main, and that effectively means all GHA CI jobs are mandatory to fix, regardless of whether they are technically required for merge.

The discussion happened in the issue and the PR. I checked that the job was non-voting and so assumed that it was fine to add a non-voting CI job.

I agree, it’s annoying me as well. FreeBSD and Windows are both affected by the test_concurrent_futures sickness: random failures of test_crash_big_data() and test_interpreter_shutdown(). It would be good if someone could look into these issues :slight_smile: Well, many buildbots are also affected by this sickness.

I don’t know if GitHub gives fine-grained control over how it summarizes all CI jobs as “success” or “failure”.

By the way, I have the same problem with Windows x86 and Azure Pipelines, which are more and more unstable these days, even though they are non-voting.

Is Windows 32-bit still supported nowadays? PEP 11 says that it’s a Tier-1 platform and so: “All core developers are responsible to keep main, and thus these platforms, working.”

This is not entirely accurate. The documentation is phrased as such due to the audience being highly varied in skill and confidence levels and familiarity with the rolling release concept, so is more of “you accept the challenge of learning as you go”. In reality, -CURRENT is meant to be usable as a rolling release in the appropriate environments (this is one of them!), not dissimilar to Linux distributions that operate this way. (-STABLE is also a rolling release, but the base system API/ABI is kept as stable as possible)

I agree that CI isn’t the best way, entirely because you need a pull request or commit on the CPython side to trigger a build. I haven’t looked into CirrusCI enough to see what they do for -CURRENT, but it’s not that relevant anyway.

As for a buildbot, in theory I could set one/some up as jail(s) on my “HP air fryer”, but it wouldn’t be public or that automated, not to mention not a good use of my personal resources. Without some kind of external support, which judging by the general tone of this and related threads reads like “go pound sand”, I could pull the plug at any time for any or no reason just like koobs@ did, and the community would be left with nothing, not even recourse. Thus, the issue isn’t that much about who stands a new buildbot up, but how to keep them up in a sustainable manner that benefits both CPython and FreeBSD. I cannot emphasise the sustainability bit enough, not least because I am precipitously being driven towards burnout myself, and these types of discussions aren’t helping.

I don’t think I suggested youse running -CURRENT as users, but rather the build and testing infrastructure prioritise -CURRENT as a target (whilst keeping -RELEASEs at least) for easier and more immediate triage and tracing of problems as they happen. Neither does the suggestion exist that youse as CPython have to go it alone. Whatever comes of the buildbot or similar would have some alerting facility.

Hi,

What is needed to provide a buildbot worker from the FreeBSD side? Install devel/py-buildbot-worker and go with New Buildbot Workers (python.org)? What are the requirements about this buildbot worker in terms of availability? I could easily provide a worker in a FreeBSD jail on a 16 CPU Xeon with 64 GB RAM. It runs FreeBSD current, but occasionally reboots or is under full load. How often will the buildbot receive work?

Are you (whoever is interested in setting up a FreeBSD system) aware of the free offer from Oracle to get one 2 CPU 24 GB RAM 200 GB disk arm-VM in their cloud? Would a FreeBSD-arm system running current be interesting as a python-buildbot-worker?

Bye,
Alexander.

FreeBSD is often good at uncovering concurrency issues that don’t occur elsewhere, due to different implementations and timings.

From the FreeBSD side, if we’re given the choice of excluding FreeBSD from CI or skipping this test on FreeBSD, I would much prefer the latter. Although that would just mask the problem, in my opinion identifying new regressions is much more valuable.
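
For illustration, skipping a single test on FreeBSD while keeping the rest of the suite running could look roughly like the sketch below. The class and test names are made up, not taken from CPython’s test suite, and the check assumes that sys.platform starts with "freebsd" on FreeBSD.

```python
import sys
import unittest

# Assumption: on FreeBSD, sys.platform looks like "freebsd13", "freebsd14", ...
IS_FREEBSD = sys.platform.startswith("freebsd")


class ExampleTests(unittest.TestCase):
    # Hypothetical test standing in for one that fails randomly on FreeBSD.
    @unittest.skipIf(IS_FREEBSD, "unstable on FreeBSD, see the tracking issue")
    def test_flaky_on_freebsd(self):
        self.assertEqual(1 + 1, 2)

    # Every other test still runs on FreeBSD, so new regressions are caught.
    def test_everywhere(self):
        self.assertTrue(True)


def run_example():
    """Run the example tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_example()
```

The skipped test is still reported as skipped in the test output, so the masked problem stays visible instead of silently disappearing.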

That said, some of these issues seem to be longstanding sources of instability in test results across many platforms, and if FreeBSD is able to reproduce them more reliably that should be a good thing.

@carljm was affected by a test_asyncio failure which seems to be specific to FreeBSD for some reason:
test_asyncio: test_start_tls_server_1() fails randomly on the Cirrus CI FreeBSD job.

Sadly, test_concurrent_futures has a failure rate of at least 1/3 or even 50% on Windows CI (on GitHub Actions) these days :frowning: (I hope that I’m pessimistic and it’s lower.) There are multiple test_concurrent_futures issues, so it’s even more likely that a CI run fails because of one of these test_concurrent_futures issues. It’s just statistics.
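
To make the statistics concrete, here is a minimal sketch of how several flaky tests add up. It assumes the tests fail independently of each other, and the rates are invented for illustration, not measured on any CI.

```python
from math import prod


def ci_failure_probability(failure_rates):
    """Probability that at least one flaky test fails in a single CI run.

    Assumes each test fails independently with the given probability.
    """
    # P(at least one failure) = 1 - P(every test passes)
    return 1.0 - prod(1.0 - p for p in failure_rates)


if __name__ == "__main__":
    # Hypothetical rates: three unrelated flaky tests, each failing 10% of the time.
    rates = [0.10, 0.10, 0.10]
    print(f"{ci_failure_probability(rates):.3f}")  # prints 0.271
```

So three tests that each look “mostly fine” on their own already make more than one CI run in four fail.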

That’s why I’m actively tracking tests failing randomly and attempting to fix them one by one. By the way, I need your help: see the cpython issues that I reported, many of them are about unstable tests :slight_smile:

Nope. I think the general approach for having required vs not is so you know you can take the risk to merge even with a failure going, but otherwise you should avoid merging unless all of CI is passing.

Do you want to start a separate conversation about dropping Azure Pipelines from CI?

I believe the building of it is still supported for older Python versions, but I don’t think Python 3.12 is going to have 32-bit installers released. Did you want to start a conversation about dropping 32-bit Windows from being tier 1 for CPython going forward?

Yep. That’s why we have platform support tiers and FreeBSD is currently tier 3 mostly thanks to Victor’s hard work.

Sure, but we still have to understand the failures.

Noticing the failure is part of it. The other is understanding the failures (e.g., is it FreeBSD or CPython’s fault), and then how to fix it.

Yes.

To be considered stable, roughly 24/7. This isn’t to say stuff going down for a day or something on occasion is the end of the world for a tier 3 platform, but in general the machine is expected to be up.

Off-and-on throughout the day. Any time a PR lands on any maintained branch plus explicit requests on PRs to test on specific buildbots. For instance, the current FreeBSD with the list of builds over the last couple of days: Buildbot

Not sure who “you” is in this case, but I’m personally not (but then again I don’t have the bandwidth to maintain another buildbot as I’m busy setting one up for WASI).

It sounds like Charlie is. More buildbots with different configurations also don’t hurt; it’s just a question of who pays attention to them to file issues and help fix bugs the buildbot finds.

That’s not the question. It’s more of a policy thing as no one has added a tier 3 platform to CI before, so as a team it has not been discussed if we want that (see PEP 11 and what a tier 3 platform means in terms of support, as it suggests the buildbot could be failing forever).

I realize that; I’m just trying to provide some feedback or insight that might help inform the policy decision to come.

I see that Tier 3 includes “Must have a reliable buildbot”, so I assume this means nobody has added a tier 3 platform to the tests run on pull requests (as opposed to more general testing).

I can imagine that discussion going from “FreeBSD CI results are unreliable and it causes a lot of extra load to investigate if there is an actual issue or not” to “Tier-3 systems are not permitted in [pull request tests].”

Correct. Everything in CI right now is a tier 1 platform (the platforms every core dev is responsible for and is not supposed to break main on).

It’s not even that. It’s simply a question of do we want to have tiers show up as failures ever in CI? Even if FreeBSD had no failures ever I would still be bringing this up as it’s a question of the intent of CI and the status checks. Are non-required/“optional” status checks okay to have in general, or only in exceptional cases? And since tier 2 and 3 don’t block merging by policy, this is causing a bigger question of how we want to treat/interpret our CI results.

Please remember to port the tests that only run on Azure Pipelines into GitHub before dropping them.

This is news to me, but I don’t control which links appear on python.org, so maybe there’s been a conversation about hiding them?

Nope, I thought you had mentioned that due to Windows 8 reaching EOL.

At one point that was probably the major “blocker”, but since then there have been bug reports (and backstories) for people hosting CPython in 32-bit processes, so that’s another group who need 32-bit builds. Even if we were to stop publishing our own binaries, I wouldn’t want to remove the CI jobs - they’re probably the only ones actually checking that we handle data types safely and don’t just assume 64-bit everywhere.
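
As a side note for anyone who wants to check what kind of build they are running, a minimal sketch of telling a 32-bit interpreter from a 64-bit one is shown below. The helper names are made up for this example.

```python
import struct
import sys


def pointer_bits():
    """Width of a C pointer (void *) in this interpreter, in bits."""
    return struct.calcsize("P") * 8


def is_64bit():
    """True on 64-bit builds; sys.maxsize is 2**31 - 1 on 32-bit builds."""
    return sys.maxsize > 2**32


if __name__ == "__main__":
    print(pointer_bits(), is_64bit())
```

Code that hard-codes 8-byte pointers or 64-bit sizes tends to pass on 64-bit builds and only break on a 32-bit build, which is the kind of assumption these CI jobs exercise.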

What about the embeddable package? I’m using that to embed Python in a Windows application that only comes in a 32-bit version.
