Pip must notify people that they have been compromised by a malicious package

When a malicious package is identified, I strongly believe we have a moral obligation to notify people that they have been hacked. I work in the industry, and cleaning up after a breach is never fun, but one thing is universal: the longer the attacker has access, the more expensive the cleanup. Additionally, failing to notify people that they have been breached may even be against California and EU law.

Malware distributed by PyPI needs to be marked as malicious, and pip needs to notify users that they have been hacked. One option is for pip to throw an error on upgrade, and there should be a way to remove the malware package from the system.

Without notification, the criminals who profit from PyPI’s lack of security will continue to do so, because there is no evidence with which to bring them to justice. I would like to see PyPI work more closely with law enforcement to share details about these attacks: “if you see something, say something.”

I’d also like to see stats on how many people are hacked each month. I think that hiding this information is bad optics. This lack of transparency is ultimately bad for the community, because the squeaky wheel gets the oil. Hiding how many people are hacked allows some people in the community to dismiss PyPI’s continued problems with distributing malicious code. This is a kind of “wolf in sheep’s clothing,” and I have seen very talented senior engineers get hacked because of a typo.

There is some existing discussion on this; please see Feature request: Automatically uninstall malicious packages taken down from PyPI · Issue #5777 · pypa/pip · GitHub.

We do, see PyPI was subpoenaed - The Python Package Index Blog.

We provide download statistics for all packages, but inferring “how many people get hacked” from that is challenging, because “one download” doesn’t mean “one install” (PyPI has a large number of static mirrors) and “one install” doesn’t necessarily mean “one hack”.

Thanks @dustin, for the thoughtful responses and links. The PyPI team is on top of things.

I’d agree with @pf_moore (in the linked issue) that silently removing something isn’t a great way to go. Alert? By all means, though I’m not sure what form that should take. Remove? Maybe… there’s rather too much of people doing things to other people’s computers for their own good. But removing silently is not only bad form (the owner of a computing device has a right to know what has been done to it), it also risks not being enough: a well-crafted piece of malware, once activated, could well replicate itself into another form that won’t be eliminated by pip uninstalling the original package. So “silent” is really bad here.

I don’t think there’s anyone in that issue suggesting that it happen silently.

Are you implying that PyPI has a moral obligation to retain a full list of every package you have downloaded? That sounds like a bit of a privacy issue to me.

OTOH if you just mean that there needs to be a list of “most recently removed malicious packages”, that’s something that seems a lot more reasonable. It isn’t going to notify you though - you would still be responsible for monitoring the list to see if there are any problems.

Do you mean the number of people who download those packages, or do you need them to be more thoroughly tracked in order to find out if any compromise happened? Again, this is definitely a level of tracking that I would not want to see happen.

To add to this, I agree that PyPI should provide an API to report which malicious packages have been removed, and that installer tools (pip, pdm, poetry, etc.) should alert if those packages are present in a user’s environment. The API is likely to be more beneficial than the alert in the end.

Presumably, any well-crafted malicious package will, once installed, remove the part of the installer tool’s code that alerts the user.

An API, by contrast, could serve many different use cases, such as AV tools, firewalls, etc.

I will gently remind everyone that, whatever the “moral obligations”, improvements don’t get done if nobody works on them, and PyPI is running short-handed (plus, if you read https://blog.pypi.org, you can’t deny that they are already putting a lot of effort into security with the budget they have). The same holds for pip. The pip and Warehouse issues are open and awaiting implementations.

As a gentle reminder of context here, the only pip command that scans the whole user’s environment (every package that is present in the environment) is pip check, which can be run manually by the user to validate their environment.

While it would be possible to use a PyPI API like the one described to determine if there are malicious packages present, pip would likely implement that under pip check, so the user is still responsible for initiating the check. And of course, it’s not actually that hard to write a standalone checker that lists all the installed packages and checks that against a PyPI list - so a manual check (such as might be run as part of an audit) doesn’t need to be built into pip.
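For illustration, a standalone checker along those lines might look like the minimal sketch below. It lists installed distributions with the standard library’s `importlib.metadata` and compares them against a list of flagged names. The flagged list here is hard-coded and purely hypothetical: this sketch assumes no particular PyPI endpoint for removed malicious packages, so where that list comes from is left open. The helper names (`normalize`, `installed_packages`, `find_flagged`) are made up for this example.

```python
import re
from importlib.metadata import distributions


def normalize(name):
    """Normalize a project name as PEP 503 describes, so 'Foo_Bar' matches 'foo-bar'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def installed_packages():
    """Return {normalized name: version} for every distribution visible to this interpreter."""
    found = {}
    for dist in distributions():
        meta = dist.metadata
        # Skip broken installs that have no readable metadata or no Name field.
        name = meta.get("Name") if meta is not None else None
        if name:
            found[normalize(name)] = meta.get("Version", "unknown")
    return found


def find_flagged(installed, flagged_names):
    """Return the sorted installed names that appear in the flagged list."""
    flagged = {normalize(n) for n in flagged_names}
    return sorted(name for name in installed if name in flagged)


if __name__ == "__main__":
    # Hypothetical data: in practice this list would have to come from
    # whatever feed of removed malicious packages is available.
    flagged_names = ["example-malicious-package"]

    installed = installed_packages()
    hits = find_flagged(installed, flagged_names)
    for name in hits:
        print(f"WARNING: {name} {installed[name]} is on the flagged list")
    if not hits:
        print(f"No flagged packages among {len(installed)} installed distributions")
```

Note that this only sees the environment of the interpreter that runs it, and, as pointed out above, malware that has already executed could tamper with any check run from inside the same environment.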

However, pip does run a pip check after completing an install, so having the scan included as part of pip check would offer a certain amount of additional security. But note that we don’t guarantee that we’ll always run pip check every install - it’s costly, and we do it to ensure that newly installed packages are compatible with installed ones, so if we ever find a way to achieve that compatibility without running a full pip check, we might choose to do so.

And, relatedly, this exists already:

Great to know about this; it has already fixed some existing packages.

A good footnote to this topic: 116 Malware Packages Found on PyPI Repository Infecting Windows and Linux Systems. Or were you actually implying that already? :wink:

Hasn’t pip, since the new resolver was added, run a check at the end of each install?

I’m pretty confident that if you install something that conflicts with your existing environment, pip tells you at the end of the install without the need to manually run pip check.

I would suggest reading the entire message before formulating a reply:

Sorry, I had just gotten off a red-eye flight when I wrote that. I clearly misunderstood the sentence “the only pip command that scans the whole user’s environment … is pip check”.