I think it would be useful to either configure warnings as errors in the test suite (and then capture unraisable and thread exceptions using a mechanism like the one pytest has), or tee any warnings produced and fail CI if any of them were not captured with catch_warnings, assertWarns, etc., similar to https://pypi.org/project/pytest-max-warnings/
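To make the "tee" option concrete, here is a rough sketch of what I have in mind. It is only an illustration, not how the CPython test runner or pytest-max-warnings actually works: `WarningTee` and `exit_code` are hypothetical names, and it relies on my understanding that warnings recorded by `catch_warnings(record=True)` do not reach `warnings.showwarning`.

```python
import warnings


class WarningTee:
    """Record every warning that reaches showwarning, then pass it on.

    Hypothetical helper for illustration; not part of any existing tool.
    """

    def __init__(self):
        self.records = []
        self._original = None

    def install(self):
        # Remember the current hook so warnings are still displayed as before.
        self._original = warnings.showwarning
        warnings.showwarning = self._showwarning

    def uninstall(self):
        warnings.showwarning = self._original

    def _showwarning(self, message, category, filename, lineno,
                     file=None, line=None):
        # Keep a copy for the final report ...
        self.records.append((category.__name__, str(message), filename, lineno))
        # ... and delegate to the original hook (the "tee" part).
        self._original(message, category, filename, lineno, file, line)

    def exit_code(self, max_warnings=0):
        # Non-zero when more warnings leaked than the allowed budget.
        return 1 if len(self.records) > max_warnings else 0
```

A runner could call `install()` before the tests start and use `exit_code()` for the final status, so the warnings stay visible in the log and only the leaked ones count against the budget.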
In the past, there were long periods when you could not run the Python tests with -Werror because of deprecation warnings emitted by third-party code, usually some vendored code in pip. It takes time to fix a warning in a third-party module, then update the copy in pip, then release a new pip version. And the chain of dependencies can be longer: pip directly depends on some module, which depends on another module, which emits a warning.
I think this is the main reason why the Python tests are not normally run with zero tolerance for warnings. But from time to time we should manually run the tests with -Werror and fix the errors that we can fix.
It’s not too bad to add an exemption on a case-by-case basis for these warnings from third-party code. Also, this code is updated manually, so the exemptions can be added in the PR that introduces the warnings.
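As a sketch of what such an exemption could look like with the standard `warnings` filters (the function name and the `pip\._vendor\.` pattern are only illustrative assumptions, not an existing configuration):

```python
import warnings


def configure_warning_filters():
    """Hypothetical helper: errors by default, with one narrow exemption."""
    # Escalate every warning to an exception, like -Werror.
    warnings.simplefilter("error")
    # Filters added later take precedence, so this exempts DeprecationWarning
    # attributed to modules whose name starts with the given pattern.
    # The pattern is an assumed example of vendored third-party code.
    warnings.filterwarnings(
        "ignore",
        category=DeprecationWarning,
        module=r"pip\._vendor\.",
    )
```

Each exemption stays limited to one category and one module prefix, so it can be added and later removed in the same PR that updates the vendored copy.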
I would prefer to have an optional CI job that reports whether tests log warnings, rather than blocking the whole workflow. As Serhiy wrote, warnings can come from third-party code.