Proposing a community maintained database of PyPI package vulnerabilities

Hi!

I’m from Google and my team has been working on some efforts to improve vulnerability management for open source packages.

In particular we’ve started to build a database of vulnerabilities that affect PyPI packages. CVEs are notoriously difficult to match to open source packages and versions, so our goal is to define a standardized shared vulnerability interchange format with precise version/naming that makes them much easier to consume.

An example vulnerability entry would look something like this (more examples here):

id: PYSEC-UNDECIDED-2021-0001
package:
  name: httplib2
  ecosystem: PyPI
summary: Vulnerability in httplib2
details: httplib2 is a comprehensive HTTP client library for Python. In httplib2 before
  version 0.19.0, a malicious server which responds with long series of "\xa0" characters
  in the "www-authenticate" header may cause Denial of Service (CPU burn while parsing
  header) of the httplib2 client accessing said server. This is fixed in version 0.19.0
  which contains a new implementation of auth headers parsing using the pyparsing
  library.
severity: HIGH
affects:
  ranges:
  - type: GIT
    repo: https://github.com/httplib2/httplib2
    fixed: bd9ee252c8f099608019709e22c0d705e98d26bc
  - type: ECOSYSTEM
    fixed: 0.19.0
references:
- https://github.com/httplib2/httplib2/security/advisories/GHSA-93xj-8mrv-444m
aliases:
- CVE-2021-21240
modified: "2021-02-12T14:56:00Z"
published: "2021-02-08T20:15:00Z"
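
For illustration, here is a minimal Python sketch of how a consumer might match an installed version against the ECOSYSTEM range of an entry like the one above. It is not part of the proof of concept. The names `ENTRY`, `parse_version` and `is_affected` are hypothetical, the version parser only handles plain dotted numbers (a real tool would more likely use a full PEP 440 parser), and it assumes that a range with only a `fixed` field covers every earlier release.

```python
# Entry trimmed from the httplib2 example; only the fields used below are kept.
ENTRY = {
    "id": "PYSEC-UNDECIDED-2021-0001",
    "package": {"name": "httplib2", "ecosystem": "PyPI"},
    "affects": {
        "ranges": [
            {
                "type": "GIT",
                "repo": "https://github.com/httplib2/httplib2",
                "fixed": "bd9ee252c8f099608019709e22c0d705e98d26bc",
            },
            {"type": "ECOSYSTEM", "fixed": "0.19.0"},
        ]
    },
}


def parse_version(text):
    """Turn '0.19.0' into a comparable tuple.

    Assumption: versions are purely numeric and dot-separated.
    Trailing zeros are dropped so that '0.19' and '0.19.0' compare equal.
    """
    parts = [int(part) for part in text.split(".")]
    while len(parts) > 1 and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def is_affected(entry, name, version):
    """Return True if `version` of package `name` falls in an ECOSYSTEM range."""
    if entry["package"]["name"].lower() != name.lower():
        return False
    installed = parse_version(version)
    for rng in entry["affects"]["ranges"]:
        if rng["type"] != "ECOSYSTEM":
            continue  # GIT ranges need repository history, so they are skipped
        # Assumption: with no lower bound, everything before `fixed` is affected.
        if installed < parse_version(rng["fixed"]):
            return True
    return False
```

With this sketch, `is_affected(ENTRY, "httplib2", "0.18.1")` reports the package as affected, while version `0.19.0` and unrelated package names do not match.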

We’ve built out a proof of concept for a workflow that automates most of the work necessary to generate these entries from existing CVE feeds. Once this gets going it should result in very minimal ongoing human maintenance work, and we are happy to contribute time to bootstrap this.

Would you be open to having this database live under the Python Software Foundation organization on GitHub as a community owned database of vulnerabilities?

Our ultimate wish is to see this community database flow into PyPI’s API/UI and eventually the pip command so users can tell if their dependencies are vulnerable. We’ve already started engaging with the PyPI team on this.

Thanks,

Oliver

11 Likes

Moved to the PSF category as the PyPA doesn’t control the “psf” org on GitHub.

1 Like

I like this idea. My simple-minded take is that it could work as a whitelist server that pip install supports, so that by default pip only installs packages that are on the list. But users should also be able to disable it or use a different server of their own. So even other companies could run their own whitelist server and tell pip to use it instead of the default when needed.

This would also allow different levels of security to be introduced.

1 Like

It would also mean that arbitrary code execution by legacy eggs during pip install or import is no longer a problem in terms of security.

1 Like

Just for inspiration/precedent, the Rust ecosystem has a similar system for packages on crates.io: the RustSec Advisory Database, with cargo-audit (RustSec/cargo-audit on GitHub, which audits Cargo.lock files for crates with security vulnerabilities) acting as an auditing tool for dependencies.

1 Like

Indeed!

Ruby also has a community maintained database of vulnerabilities: rubysec/ruby-advisory-db on GitHub, a database of vulnerable Ruby Gems.

And there is a Go proposal/prototype to do the same: vulndb, hosted on Git at Google.

1 Like

Thanks Oliver! I like the general idea. I’ll try to find some time to read your proposal and give feedback after I have dealt with the 3.10 feature freeze frenzy.

@brettcannon I think @oliverchang’s proposal belongs in the packaging category to draw in more attention from packaging folks.

There is already a commercial security database for Python packages at https://pyup.io/. They release their database once a month at pyupio/safety-db on GitHub, a curated database of insecure Python packages.

1 Like

@tiran It depends on whether this is more for the project on GitHub or for coming to a general conclusion. I read:

as the key question. If it’s more for the idea then I can move it back.

1 Like

@brettcannon Yes that’s the key question from me :slight_smile:

For more general packaging/warehouse discussion, I opened an issue at github/pypa/warehouse/issues/9407 (I’m not able to post the full link), or should there be another discussion topic on here for that?

Just a note, the psf and pypa orgs are managed by different people and have different criteria to join. If joining psf does not end up making sense (I don’t personally have an opinion but don’t have a say anyway), it may make sense for you to join pypa instead if you are inclined to. See also: PyPA Members, And How To Join — PyPA documentation

3 Likes

Hi Oliver,

I love the idea of such a database! I hope either the PSF or the PyPI guys will love it too…

Thumbs up!

Cheers, Dominik

2 Likes