Pip vulnerability

Hi,

CVE-2018-20225 supposedly applies to all pip versions since 2018. I could not find any official stance on whether it’s treated as an active vulnerability or as a feature.

Would anybody share a link to an authoritative statement about CVE-2018-20225?

Thank you in advance! Take care!

Courtesy context link: NVD - CVE-2018-20225

The CVE entry says that the issue is disputed.
It also says what you have to do to be affected.
What is your specific concern?

Keep in mind that pretty much anyone can request a CVE ID assignment
for just about any behavior they want to argue is unwanted. There is
very little justification required by MITRE, and some CNAs may even
assign CVEs with no required justification at all.

In case it’s not clear (this is an oft-repeated mantra in places
like the oss-security mailing list but less so here on DPO), the
existence of a CVE does not necessarily imply the existence of a
security flaw.

As for this specific CVE, you may want to check out PEP 708
“Extending the Repository API to Mitigate Dependency Confusion
Attacks” along with the related discussions here on DPO.

Thank you guys.

So is the fact that the CVE has not been resolved since 2018 due to the Python community not treating it as a real vulnerability?

My problem is that in some places pip gets flagged with this CVE by a security scanning tool and is then blocked as vulnerable code…

The issue you referenced can ONLY be a problem if:

  1. You use a private package repository
  2. You also use the main upstream repository
  3. You install something from your private repository
  4. An attacker knows the package name you’re using, and uploads a higher-numbered version of the same package to the upstream repo.

Are you using a private repository? If not, ignore the issue, it can’t apply to you.
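As a rough aid for answering that question, here is a minimal Python sketch that inspects pip configuration text and reports whether both PyPI and some other index would be consulted. It assumes the INI-style pip configuration file with `index-url` and `extra-index-url` set in the `[global]` section. The function name `mixes_indexes` and the `packages.example.internal` URL are made up for illustration, and command-line flags, environment variables, and `--find-links` are not checked at all.

```python
import configparser

PYPI_SIMPLE = "https://pypi.org/simple"  # pip's default index


def mixes_indexes(pip_conf_text: str) -> bool:
    """Return True if the config consults PyPI and at least one other index."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(pip_conf_text)

    # index-url replaces the default index; extra-index-url adds to it.
    primary = parser.get("global", "index-url", fallback=PYPI_SIMPLE)
    extras = parser.get("global", "extra-index-url", fallback="").split()

    urls = [url.rstrip("/") for url in [primary, *extras]]
    uses_pypi = PYPI_SIMPLE in urls
    uses_other = any(url != PYPI_SIMPLE for url in urls)
    return uses_pypi and uses_other


# Hypothetical configuration: PyPI stays the default, a private index is added.
example_conf = """
[global]
extra-index-url = https://packages.example.internal/simple
"""

print(mixes_indexes(example_conf))
```

A result of `False` only covers the first two conditions in the list above; it says nothing about whether a private package name is known to an attacker.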

Just using pip is a security issue.
In a production environment you cannot risk code downloaded by pip being malicious.

Where I work we get the source code, review it, package it ourselves and only use our copy.

Any security scanner which blindly assumes all CVEs represent an
exploitable condition in your environment is fundamentally flawed.
Consider looking for scanners which take a more direct approach to
checking systems security, or at least see if the one you’re using
allows you to adjust the list of tests to those which are relevant
to your environment.

The underlying problem is, like with many things in this industry,
to properly evaluate the security of your systems you need both a
strong understanding of those systems and the potential
vulnerabilities. As long as you treat it like an inscrutable black
box, you won’t have useful results from any of this class of tools.

But that is a very likely scenario for any large project. You will have local packages, and you will use PyPI packages.
I just read the pip docs, and indeed, even with --index-url, pip looks at both the specified URL and PyPI, in no particular order.

pip looks for packages in a number of places: on PyPI (if not disabled via --no-index), in the local filesystem, and in any additional repositories specified via --find-links or --index-url. There is no ordering in the locations that are searched. Rather they are all checked, and the “best” match for the requirements (in terms of version number - see the specification for details) is selected.

So even with version pinning, a package on PyPI that shadows your private one with the same version number could be loaded in preference.
This means that pip has no way to implement the functionality that everybody thinks it has - get my private packages from my private repository, and get the public ones from pypi.
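To make the quoted selection rule concrete, here is a simplified Python sketch of "pool every location, then take the highest version". It is not pip's actual implementation: the `Candidate` type, the index labels, and the package name `acme-utils` are hypothetical, and `version_key` only handles plain dotted numeric versions rather than the full version specification.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    index: str    # label for where the file was found (hypothetical)
    name: str
    version: str


def version_key(version: str) -> tuple:
    # Simplified ordering: dotted numeric versions only, e.g. "1.4.0".
    return tuple(int(part) for part in version.split("."))


def pick_best(candidates: list) -> Candidate:
    # All locations are pooled; the origin plays no role in the choice.
    return max(candidates, key=lambda c: version_key(c.version))


pool = [
    Candidate("private", "acme-utils", "1.4.0"),
    Candidate("pypi", "acme-utils", "99.0.0"),
]

chosen = pick_best(pool)
print(chosen.index, chosen.version)
```

In this toy pool the public `99.0.0` upload is chosen over the private `1.4.0`, because nothing in the comparison looks at where a candidate came from.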

The expected behaviour is that:

  • the repository specified by --index-url is searched first
  • the version found in the first repository is loaded always
  • PyPI is only searched if the package is not found in the local repository

Without the ability to do this, the use of local repositories is unsafe except under very carefully controlled conditions. The fact that pip has no way of implementing the “safe” behaviour is a bug.
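The “expected behaviour” list above could be sketched roughly as follows. This illustrates the proposal only, not anything pip offers: `resolve_first_match`, the index labels, and the package data are hypothetical, and version ordering is again reduced to dotted numeric versions.

```python
def version_key(version: str) -> tuple:
    # Simplified ordering: dotted numeric versions only, e.g. "1.4.0".
    return tuple(int(part) for part in version.split("."))


def resolve_first_match(name, ordered_indexes):
    """Return (index_label, version) from the first index that has the package."""
    for label, packages in ordered_indexes:
        versions = packages.get(name)
        if versions:
            # Later indexes are never consulted once a match is found.
            return label, max(versions, key=version_key)
    raise LookupError(f"{name} not found in any index")


# Hypothetical contents; the private index is listed first.
ordered_indexes = [
    ("private", {"acme-utils": ["1.3.0", "1.4.0"]}),
    ("pypi", {"acme-utils": ["99.0.0"], "requests": ["2.31.0"]}),
]

print(resolve_first_match("acme-utils", ordered_indexes))
print(resolve_first_match("requests", ordered_indexes))
```

With this ordering the private `acme-utils` is returned even though a higher version exists publicly, while `requests` falls through to the public index.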

Is it? I don’t remember seeing any major open source projects working this way, so maybe that’s only a corporate thing?

I disagree. If there is a conflict, refuse the temptation to guess. Sane behaviour would be to either consider all the repositories to be one (in which case you take the highest version, regardless of origin), or to pin something to a specific repository, but “first repository wins” is just as vulnerable, just to different sorts of problems. If you want to be safe against this kind of attack, it’d need to error out rather than magically taking the one that you want it to take.
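A minimal Python sketch of the “error out” alternative, assuming a hypothetical resolver that refuses to choose when the same name appears in more than one index. `AmbiguousSourceError`, `resolve_or_fail`, and the sample data are invented for illustration and do not correspond to any pip feature.

```python
class AmbiguousSourceError(Exception):
    """Raised when a package name is offered by more than one index."""


def version_key(version: str) -> tuple:
    # Simplified ordering: dotted numeric versions only, e.g. "2.31.0".
    return tuple(int(part) for part in version.split("."))


def resolve_or_fail(name, indexes):
    """Return (index_label, version), or raise if the source is ambiguous."""
    sources = [label for label, packages in indexes.items() if name in packages]
    if not sources:
        raise LookupError(f"{name} not found in any index")
    if len(sources) > 1:
        # Refuse to guess which index the user meant.
        raise AmbiguousSourceError(f"{name} is available from: {', '.join(sources)}")
    label = sources[0]
    return label, max(indexes[label][name], key=version_key)


# Hypothetical index contents.
sample_indexes = {
    "private": {"acme-utils": ["1.4.0"]},
    "pypi": {"acme-utils": ["99.0.0"], "requests": ["2.31.0"]},
}

print(resolve_or_fail("requests", sample_indexes))
try:
    resolve_or_fail("acme-utils", sample_indexes)
except AmbiguousSourceError as exc:
    print("refused:", exc)
```

Under this scheme the shadowing upload does not get installed silently; the user has to resolve the conflict explicitly, for example by pinning the name to one repository.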

Apparently PyTorch got burned by exactly this.

Ok, so print an error - at least it prevents you shipping compromised code, I suppose. Right now, it is really just guessing, anyway.

No, that in most cases is not sane, due to the uncertainty as to where the package came from.

Explain. At least the package source is deterministic.
Even if “adding to the repository pool” is a valid use case, “my packages come from my repo” is also valid, and not covered by pip’s functionality at all. A fact that has been a nasty surprise to a lot of people, hence the CVE.