@pablogsal see Ronald’s reply earlier where he verified that the symbols were there. Perhaps we are talking about different things.
Based on CAM’s message it looks to me that the OP is unhappy that GitHub (not us) has changed the container images used by GitHub Actions to contain a Python installer without symbols. Sure, we might help them by not stripping symbols in our official installers, but why would we incur the extra cost on all our users? I can also understand that GitHub prefers to use official installers where they exist.
So how can we help Austin’s CI jobs recover the symbols?
In case it wasn’t clear (it wasn’t to me until I clicked on their link), the OP is Austin’s author. (Nice job BTW!)
Even if we don’t change anything I still would like to understand in what part of the process the symbols are lost. Also @ronaldoussoren reports that when he installs python the symbols are there for him, which is another mystery.
I checked the framework itself earlier:

nm -gU /Library/Frameworks/Python.framework/Versions/3.11/Python

and that does contain a full set of symbols. Both the stub executable and Python.app/…/Python are small binaries linked to the framework library.
What symbols do you expect to see in these executables?
Gabriele, can you help us understand what you meant there?
Sure. I’ll try to add more context to clarify the situation, but I’m afraid this is going to be a bit of a read.
Austin is a frame stack sampling tool for Python that works by resolving some of the exported symbols from the Python binaries and reading the interpreter’s remote memory space. In particular, in later releases, Austin relies on the (_)_PyRuntime symbol to get hold of the interpreter state, and from there it loops over the thread states and unwinds the frame stack for each of them.
If symbols are not available, Austin tries to find the interpreter state from a BSS scan. If that fails too, on some occasions Austin might try to scan (a sensible portion of) the heap, as a final desperate attempt before giving up.
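The fallback chain described above can be sketched roughly as follows. This is only an illustrative Python sketch; Austin itself is written in C and reads a remote process’s memory, and the strategy callables here are invented stand-ins:

```python
# Illustrative sketch of the fallback chain: symbol lookup first,
# then a BSS scan, then a bounded heap scan as a last resort.
# The callables below stand in for the real (C) strategies.

def find_interpreter_state(resolve_symbol, scan_bss, scan_heap):
    """Try each strategy in order and return the first address found."""
    for strategy in (resolve_symbol, scan_bss, scan_heap):
        addr = strategy()
        if addr is not None:
            return addr
    return None  # give up

# Example: no exported (_)_PyRuntime symbol, but the BSS scan succeeds.
addr = find_interpreter_state(
    lambda: None,         # symbol lookup fails
    lambda: 0x1002E6780,  # BSS scan finds a plausible _PyRuntime
    lambda: None,         # heap scan never reached
)
# addr == 0x1002E6780
```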
Austin finds out which is the “interesting” binary based on a few heuristics, so this discovery step is not super reliable. This is to cope with situations where the Python binary might be embedded in an executable that has no [Pp]ython in its path. The things we look for are file size (an interesting Python binary is at least a few MB) and then symbols. There is always the chance that Austin picks the wrong binary, making all the effort of looking up the interpreter state pointless. I haven’t noticed this behaviour with any of the most popular Python distributions, though.
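For illustration, the discovery heuristic might look something like this sketch; the size threshold and helper name are my own invention, not Austin’s actual code:

```python
# Hypothetical sketch of the binary-discovery heuristic: a candidate
# must be at least a few MB in size and export recognisable Python
# symbols. The threshold below is illustrative only.

import os

MIN_SIZE = 2 * 1024 * 1024  # "a few MB at least"

def looks_interesting(path, exported_symbols):
    try:
        size = os.path.getsize(path)
    except OSError:
        return False
    if size < MIN_SIZE:
        return False
    # A binary without Python symbols is not a useful candidate.
    return any("PyRuntime" in sym for sym in exported_symbols)
```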
Lately I’ve been working on adding Python 3.11 support to Austin. During the early development stages, while the beta releases were out, I was able to successfully test Austin on macOS using the setup-python action on GitHub. At that stage, the action was providing custom builds of Python. However, with the rc2 release, the setup-python maintainers decided to move to the official installer (plus a script to automate the installation process on the GH workflow runners). I discovered this because the CI for Python 3.11 on macOS started failing with permission issues. I opened this issue with setup-python, and the maintainers confirmed that they had moved to the official installer. Thanks to this comment by @ronaldoussoren on a previous discussion, I was able to make the CI job work again by removing the signature from the binaries. Evidence of that is in the CI job for the commit used to produce the latest 3.4.1 release of Austin; there you can also see the extra step that removes the signature from the binaries. I have also done some manual testing on my machine, and Austin seemed to work just fine.
A few days ago, while doing some more work on Austin, I discovered that the CI jobs for Python 3.11 and 3.10 had started failing badly. Such big failures are generally an indication that either support for that particular Python version is broken, or something has changed in the binaries that Austin can’t handle. In the former case, I’d expect jobs to fail across platforms, which was not the case. So I started investigating with Python 3.11 from the official installer on my machine and discovered that Austin was picking the wrong binary. This prompted me to check for symbols in the binaries, as failing to find those could cause Austin to pick the wrong binary.
Now I’ve started using the official installer because of the need to investigate these issues. I’m more familiar with the Pythons installed via pyenv, from which I expect something like
❯ nm -gU `python3.9 -c "import sys; print(sys.executable)"` | grep "_PyRuntime$"
00000001002e6780 S __PyRuntime
The same command for Python 3.11 from the official installer returned nothing. This is “fine”, as we know we should also look for any potential shared library. For some reason I was convinced that
otool -L /Library/Frameworks/Python.framework/Versions/3.11/bin/python3
had given me /Library/Frameworks/Python.framework/Versions/3.11/Resources/Python.app/Contents/MacOS/Python in the past, but I have just double-checked, and I can see that it actually links to /Library/Frameworks/Python.framework/Versions/3.11/Python. Indeed, as reported by @ronaldoussoren, I can also confirm that the symbols are there. Furthermore, it seems that the behaviour of the installer hasn’t changed, contrary to what my initial investigation led me to conclude (whence this discussion), as confirmed by @guido.
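As an aside that may save others some head-scratching: on macOS, Mach-O prepends a leading underscore to C-level symbol names, which is why the C symbol _PyRuntime shows up as __PyRuntime in nm output. A tiny illustrative helper (mine, not Austin’s) to match a C name against either form:

```python
# On macOS (Mach-O), C symbol names gain a leading underscore, so the
# C-level _PyRuntime appears as __PyRuntime in nm output. This helper
# matches a C name against a possibly underscore-prefixed nm name.

def matches_c_symbol(nm_name, c_name):
    return nm_name == c_name or nm_name == "_" + c_name

# Parsing a typical nm output line into its three fields:
addr, kind, name = "00000001002e6780 S __PyRuntime".split()
```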
To summarise, we now know that there don’t seem to be any changes in the macOS installers when it comes to symbols. Is the CI failure an issue with Austin then? I re-downloaded the installer used in the CI job that (successfully) verified the 3.4.1 release commit, used it to re-install Python 3.11 on my machine, checked out the v3.4.1 tag, built Austin from it, removed the signatures from the Python binaries, and I can reproduce the current CI failure!
At this point, my next step is to understand why, all of a sudden, Austin is failing to find the shared library that it presumably used to find before, with the exact same target binaries. I hope this also shows why my thought after the initial investigation was that something had changed in the installer, whence this messy discussion.
I hope this is, as usual, something super silly that I’ve overlooked, rather than a peculiar macOS time bomb that I don’t know of, because then I can hope to find the problem and fix it! If anybody has any thoughts they’d be more than welcome, and I hope this all clarifies the situation.
Increasing the timeout before Austin gives up trying to find a binary with symbols seems to “cure” this issue, at least locally. It still remains a mystery to me why, all of a sudden, there is a need to wait a bit longer for the binary maps to be laid out in memory. The CI job went from “reliably passing” to “reliably failing” with no changes to Austin. This is a rerun of a job that passed and that is now failing. The only thing that could have changed here is the result of the setup-python action.
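To illustrate why raw execution speed can change behaviour here: a retry loop bounded by an iteration count gives up after less wall-clock time when the code runs faster, whereas a deadline based on a monotonic clock does not. A minimal sketch of the two strategies (names invented, not Austin’s code):

```python
# Contrast between two retry strategies. A loop bounded by iteration
# count gives up sooner in wall-clock time when the binary runs
# faster; a monotonic-clock deadline is speed-independent.

import time

def find_with_iterations(probe, max_attempts=50):
    """Speed-dependent: a faster build burns attempts more quickly."""
    for _ in range(max_attempts):
        result = probe()
        if result is not None:
            return result
    return None

def find_with_deadline(probe, timeout=0.5):
    """Speed-independent: the budget is wall-clock time."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        result = probe()
        if result is not None:
            return result
    return None
```

If Austin’s loop is iteration-bounded, a build compiled with better optimisations effectively shortens the grace period it gives the target process to lay out its binary maps.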
Thanks for the explanation! All I can do is wish you luck in understanding the timing issue.
Ah, I think I now know what’s happening! The CI started failing when the macos-latest runners were updated from macos-11 to macos-12. I suspect this means that Austin is now being compiled with a more recent version of gcc, which makes Austin slightly faster as a result of the O3 optimisations. That means Austin gives up looking for the right binary slightly earlier (the loop is based on a number of iterations rather than a timer). Alternatively, or additionally, 3.10 and 3.11 might take slightly longer than older versions to initialise on macOS 12.