Vsyscall issue: glibc patching history

Dear All,
I am Subin and I work at Red Hat.
I am trying to capture all the details of the glibc patching done in the manylinux2010 PEP (https://www.python.org/dev/peps/pep-0571/#compatibility-with-kernels-that-lack-vsyscall).

I was able to dig up this email, https://mail.python.org/pipermail/wheel-builders/2016-December/000239.html , to get some context.
Also this: https://github.com/pypa/manylinux/pull/279 .
But I want to get the facts right, so I am reaching out to the community on this.
I would like to know whether there is a system in use by community members where the vsyscall flag cannot be set to emulate, so that this patching is absolutely necessary in the build image to build a manylinux2010-compliant wheel file?
If there are more details or links that capture the origin of the patch in the PEP, I would be very grateful.

Thanks
Subin

I don’t know the patching history, but I can tell you that the vsyscall page is definitely not required under any circumstances starting with glibc 2.14, and might not have been used under normal circumstances for some time before that.

We got many reports from real people running into trouble with this. It’s particularly tricky because the symptom is that you do docker run and it exits immediately with no error message at all, so everyone who hits it spends a few hours completely baffled about what they’re doing wrong.

I don’t know if it’s “absolutely necessary”, but even if we can force people to reconfigure their kernels and reboot, we’re not going to – the user experience is terrible. OTOH the hacks to the build image are kinda gross, but they completely solve the problem for everyone.

Also I think maybe Travis has vsyscall emulation disabled? I don’t remember for sure, and anyway it might have changed since the last time I checked, so you’d need to do your own experiments. But lots of people use Linux systems where they don’t have the ability to change the kernel command line.

I understand the question to be whether it’s still necessary to patch the manylinux201[04] build images to avoid using the vsyscall page. The need to avoid using the vsyscall page has not gone away, but the patches may now be superfluous, if CentOS 6/7 glibc already doesn’t use the vsyscall page. As I said, the code to use the vsyscall page was removed from glibc in version 2.14, but earlier versions (I’m not sure how much earlier, the code is a mess) would already look for and use the vDSO in preference to the fixed page, so it’s quite plausible that the patches are unnecessary.

It sounds like it would be easy to test: remove the patches, rebuild the build images, and try to docker run those images on a host kernel with the vsyscall page disabled (vsyscall=none on the kernel command line; this is the default mode already in current distributions; if cat /proc/self/maps doesn’t print a line near the bottom with [vsyscall] in the far right hand column, it’s already disabled).
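
For anyone who wants to script the host-side check rather than eyeball the cat output, here is a minimal sketch in Python. It only automates the two things mentioned above: looking for a [vsyscall] line in /proc/self/maps and looking for a vsyscall= option on the kernel command line. The function names are made up for illustration, and treating the last vsyscall= option as the effective one is my assumption about how repeated options behave, so take the result as a hint rather than a verdict.

```python
from pathlib import Path
from typing import Optional


def has_vsyscall_mapping(maps_text: str) -> bool:
    """Return True if any line of a /proc/<pid>/maps dump is labelled [vsyscall]."""
    for line in maps_text.splitlines():
        fields = line.split()
        # The mapping name, when present, is the last whitespace-separated field.
        if fields and fields[-1] == "[vsyscall]":
            return True
    return False


def vsyscall_param(cmdline: str) -> Optional[str]:
    """Return the value of vsyscall= on a kernel command line, or None if absent.

    Assumption: if the option is repeated, the last occurrence is reported.
    """
    prefix = "vsyscall="
    value = None
    for token in cmdline.split():
        if token.startswith(prefix):
            value = token[len(prefix):]
    return value


if __name__ == "__main__":
    maps = Path("/proc/self/maps")
    cmdline = Path("/proc/cmdline")
    if not maps.exists():
        print("no /proc/self/maps here; this check only makes sense on Linux")
    else:
        print("vsyscall page mapped:", has_vsyscall_mapping(maps.read_text()))
        if cmdline.exists():
            # None means the option was not given, so the kernel's
            # built-in default applies.
            print("vsyscall= on command line:", vsyscall_param(cmdline.read_text()))
```

If the first line prints False, the host is a suitable place to try docker run on the unpatched images.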

(If anyone is wondering what the heck this vsyscall thing is, see https://lwn.net/Articles/446528/ .)

The manylinux2010 images definitely needed patching as of a few months ago. So that would only change if either CentOS recently started applying similar patches themselves, or if all the distros and administrators in the world had recently started enabling vsyscall emulation. I don’t know any reason to suspect that either has changed.

I agree, there’s no reason to think that this would have changed in the past few months. This is, perhaps, another reason for package uploaders to just skip the 2010 build environment.

No. This has no effect on package uploaders whatsoever – it’s already solved in the 2010 build environment. And until CentOS/RHEL 6 reaches EOL, we should be recommending manylinux2010 as the standard build environment that folks reach for unless they have some specific reason they can’t use it.