Python3.11 seems to break venv created from non-system python with zipped stdlib

We have a python built from source with a zipped stdlib in our project. The directory layout looks like this:

/path/to/our/python
    /bin
        /python
    /lib
        /python310.zip

Then we will create venv from this python to use.
After we upgraded to python3.11, we found the venv created from this python is broken. The reason is that it failed to find the zipped stdlib. It’s not in the pythonpath.

After some investigation, we found this line in Modules/getpath.py causes the problem.

650     elif build_prefix or venv_prefix:                                           
651         # QUIRK: POSIX uses the default prefix when in the build directory      
652         # or a venv                                                             
653         pythonpath.append(joinpath(PREFIX, ZIP_LANDMARK))

Here, if the python is in a venv, the default prefix is used, even if the joinpath(prefix, ZIP_LANDMARK) exists.

I think a possible fix is to test isfile(joinpath(prefix, ZIP_LANDMARK)) first. If true, we add it to the pythonpath, otherwise we use the default prefix.

If this behavior is not intended and the above fix is OK, I’d like to create a PR.

1 Like

What does the git history on those lines say about the reasoning for them?

Nothing. You need to pull up the old getpath.c file for the reasoning that led to this comment.

Looking at the 3.10 implementation again, it seems that search_for_prefix is always going to fail to locate the venv itself, and will only find the global install of Python if it happens to be in the same subtree. (I spent longer mapping this out the first time, so I may be missing something right now - it’s also late and I’m tired…)

But since it worked before, it’s likely that the logic was untangled incorrectly here. It’s marked as a quirk because it seems like a bad idea, but represents what the old code did - the idea is to make deliberate changes over time with proper deprecation to remove the quirks.

So I’d say, either demonstrate exactly what the old behaviour was and replicate it correctly here (i.e. a bugfix), or propose a change to the resolution behaviour (feature). Incidentally, I’ve already merged a change to the Windows behaviour for zip files to stop searching, which was incorrectly translated from the old module that assumed a single location (security).

1 Like

I can spend some time to find out why the 3.10 implementation works for us first.

So in python3.10, first the argv0_path is correctly calculated in calculate_argv0_path for our case.

Then, in search_for_prefix the prefix is calculated from argv0_path(In our case the lib/python3.10/os.py is deliberately kept outside from the zip file as a landmark):

   516     /* Search from argv0_path, until root is found */
   517     status = copy_absolute(prefix, calculate->argv0_path, prefix_len);
   518     if (_PyStatus_EXCEPTION(status)) {
   519         return status;
   520     }
   521
   522     do {
   523         /* Path: <argv0_path or substring> / <lib_python> / LANDMARK */
   524         size_t n = wcslen(prefix);
   525         status = joinpath(prefix, calculate->lib_python, prefix_len);
   526         if (_PyStatus_EXCEPTION(status)) {
   527             return status;
   528         }
   529
   530         int module;
   531         status = ismodule(prefix, &module);
   532         if (_PyStatus_EXCEPTION(status)) {
   533             return status;
   534         }
   535         if (module) {
   536             *found = 1;
   537             return _PyStatus_OK();
   538         }
   539         prefix[n] = L'\0';
   540         reduce(prefix);
   541     } while (prefix[0]);

Finally in calculate_zip_path, since the prefix is found, it’s used to calculate the zip path. Which is correct for our case.

It seems in python3.10, as long as the prefix is found, the zip path is always calculated from the prefix.

Yeah, provided os.py is found, it does seem to work out. So I guess I missed something in the logic and there are no tests for it.

It seems that just changing it to always use prefix instead of PREFIX is the closest way to get back to the old behaviour. It doesn’t search, and it doesn’t check whether the file exists. If we want to do that, it would be a 3.12-only change (unless there’s a shown security risk, which I don’t think there is because there isn’t a search).

I think 3.12-only change is OK. Since we’re already build python from source, we can workaround this with a patch in 3.11.

For the change, I did a test. First I changed the zip file path calculation to this:

    if prefix:
        pythonpath.append(joinpath(prefix, ZIP_LANDMARK))
    else:
        pythonpath.append(joinpath(PREFIX, ZIP_LANDMARK))

Then, two test cases will fail. test_embed and test_getpath.

Then I tried simply removing the venv condition like this:

    # Then add the default zip file
    if os_name == 'nt':
        # QUIRK: Windows uses the library directory rather than the prefix
        if library:
            library_dir = dirname(library)
        else:
            library_dir = executable_dir
        pythonpath.append(joinpath(library_dir, ZIP_LANDMARK))
    elif build_prefix:
        # QUIRK: POSIX uses the default prefix when in the build directory
        pythonpath.append(joinpath(PREFIX, ZIP_LANDMARK))
    else:
        pythonpath.append(joinpath(prefix, ZIP_LANDMARK))

now all tests passed. This may be the simplest change?

Yeah, that may be the best option, provided it still works for a non-installed build.

venv doesn’t really support layouts other than our default install, so if you want it to work with an embedded runtime you probably need to be patching other things anyway. If I were structuring all this from scratch, I’d be putting the venv logic into Programs/python.c rather than in libpython itself, but we’re well past the point where we can make that design change.

Cool. Do you need me to create a PR for this? I can also create a new case in test_getpath.py for this.

That would be awesome if you’d like to. Put an @zooba in the issue and I’ll review and merge for you.

1 Like

Great. One thing is that I’d like to verify the new test case will pass in v3.10.8, so I can make sure the test case itself is correct. But seems there is no Lib/test/test_getpath.py before 3.11. Do I have a way to verify this case in older python versions?

Maybe a test in test_venv makes more sense for the overall scenario? test_getpath is very much about simulating environments and testing the logic.

We can definitely backport a (passing) test back to 3.10.

Ok. I will add test cases in both test_venv and test_getpath and verify the new test_venv case with 3.10.

I have pushed my change to GitHub. I have verified the new test_venv test case will pass in v3.10.8 and fail in v3.11.0. But it’s a little bit tricky, I have to create a fake non-installed python first in order to test the behavior.

Plus I haven’t tested it on PC, so I haven’t created the PR yet. You can review the code here for now. If you think the change is OK I can create the PR and continue the discussion there.

The principle looks fine, so let’s get the issue and PRs created and we can iterate there (with the benefit of automated builds on all platforms).

Most of this topic just sort of flies over my head, but when I upgraded from Python 10.8 to 11.0, my PyCharm Community IDE quit stopping at breakpoints when I was in debug. Nothing I did could make it halt on a breakpoint. I finally deleted 3.11 and reloaded 3.10, and all was fine. The error messages kept saying that there were internal errors in the compilation of 3.11, over which I had no control

That’s far more likely to be due to other changes in Python 3.11. I’d suggest starting with JetBrains support, as they’re the ones who control their debugger.

Steve

For now, I’m just sticking to 3.10 until 3.11 gets some runtime under its belt. I really had no reason to upgrade other than “ooh – shiny!”

Thanks.