[disclaimer: I write better in French than in globish.]
[disclaimer: the forum’s stupidity forces me to break most links.]
Hi,
Some facts first
- I was recently pwned because of a typo in a package name (request instead of requests).
- I wrote up my story here (in French): https://metrodore.fr/i-have-been-powned.html
- There is no CVE and no public statement about the malware. It is not found in publicly available databases.
- The trojan stayed on my machine for ~20 days after the package was removed by the pypi admins.
- There is almost no publicity about this menace (I found only one serious Asian source warning about it, reproduced in two Asian web media outlets).
- Pypinfo reports about 12000 downloads between the beginning of the name squatting (31 July) and the removal.
I would now like to share some thoughts about the current situation. I hope they can help make the python movement better. I also suppose that many popular software repos (npm, cargo, etc.) face similar difficulties and have probably produced ideas and answers worth analyzing.
Ideas about struggling against pytosquatting consequences
The main idea is to publicize the menace in order to reduce the number of naive hosts of the nasty code, upstream or downstream, who may unknowingly keep hosting the menace long after the pip package has been removed.
Public statement about removals
When a pypi package is removed by the admins for security reasons, a public statement should be issued.
This could be done without extra work for the pypi admins if warehouse takes care of it. The admins would simply state the reasons for the removal and, if possible, link to resources helping infected people understand and mitigate the menace.
Imho there is no scenario that turns out better when the menace is hushed up than when it is publicized.
Maybe there are good reasons to do the same when a removal is done for other reasons (policy violation).
Exploit the upgrade way
Currently, after a pytosquatting, pypi admins attempt to prevent the re-creation of another project with the same name.
There is a way to warn infected users about the menace: the moment they upgrade the package. Currently they simply get an error due to the upstream removal and may not understand that they have been victims of pytosquatting. One can imagine the pypi admins maintaining an upgradable “ultimate version” of the package. During setup, this upgrade would raise an exception with a highlighted message warning about the menace and linking to the removal statement.
Again, this could be done without extra work for the pypi admins if warehouse takes care of it.
Notes:
- This release would break users’ upgrades, but no more than the upstream removal already does.
- Whether one can build an upgradable but non-installable package is beyond my knowledge of pip/pypi.
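To make the “ultimate version” idea more concrete, here is a minimal sketch of what the setup.py of such a source-only release might contain. Everything in it is an assumption for illustration: the exception class, the function names and the placeholder URL are hypothetical, and how prominently pip would display the message is not something this sketch guarantees.

```python
# Hypothetical setup.py content for a final "tombstone" release of a removed
# project. All names and the URL are placeholders, not real pypi conventions.

REMOVED_NAME = "request"      # the squatted name from the incident above
INTENDED_NAME = "requests"    # the package users most likely wanted
STATEMENT_URL = "https://example.invalid/removal-statement"  # placeholder


class RemovedForSecurityReasons(RuntimeError):
    """Raised to abort the build and show the warning to the user."""


def tombstone_message(removed=REMOVED_NAME, intended=INTENDED_NAME,
                      url=STATEMENT_URL):
    """Build the highlighted warning shown when the upgrade is attempted."""
    banner = "!" * 70
    return "\n".join([
        banner,
        f"The project '{removed}' was removed from PyPI for security reasons.",
        f"If you meant to install '{intended}', your machine may be infected.",
        f"Removal statement and mitigation advice: {url}",
        banner,
    ])


def abort_install():
    """Refuse to install: the only purpose of this release is the warning."""
    raise RemovedForSecurityReasons(tombstone_message())
```

In an actual setup.py, `abort_install()` would be called unconditionally at module level, so that the exception is raised as soon as pip tries to build the release.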
Ideas about struggling against pytosquatting
Not reversing responsibilities
I have read advice along the lines of “you can’t trust anything on PyPI, pin & hash your dependencies” given as a standard response about pypi security. Fortunately this statement is not the official one and doesn’t appear on:
- pypi. org/security/
- packaging .python.org/tutorials/installing-packages/
- packaging .python.org/guides/analyzing-pypi-package-downloads/
- pypi .org/help/
- python-security .readthedocs.io/packages.html
I say “fortunately” because this kind of statement seems counterproductive to me, and I suppose it is not the intended purpose of pypi. Here is how I hear it: “hey look, we have a wonderful infrastructure maintained by thousands of wonderful volunteers helping millions of developers; but we urge you not to use it if you care about your butt”.
More seriously, checking each pip package individually is almost impossible. It is very hard to trust code you didn’t write. So we have two alternatives: moving toward a community where we can trust each other, or moving toward no longer using pypi.
I would prefer the first one. More generally, even if things are difficult, I think it is important that the overall python community keeps in mind that it should remain a safe place to stay in (and not a place one is glad to leave).
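For readers unfamiliar with the “pin & hash” advice quoted above, here is a minimal sketch of what a hash pin actually checks. pip has a hash-checking mode for this (`--require-hashes`, with `--hash=sha256:...` entries in a requirements file); the code below only mimics the principle with hashlib, and the byte strings are made-up stand-ins for downloaded archives.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of an artifact as lowercase hex."""
    return hashlib.sha256(data).hexdigest()


def matches_pin(artifact: bytes, pinned_hex: str) -> bool:
    """True only if the artifact is byte-identical to the one that was pinned."""
    return sha256_hex(artifact) == pinned_hex.lower()


# Made-up byte strings standing in for downloaded archives (assumptions).
good_artifact = b"contents of the archive I meant to install"
squatted_artifact = b"contents of the typosquatted archive"

pin = sha256_hex(good_artifact)
print(matches_pin(good_artifact, pin))                 # True
print(matches_pin(good_artifact + b"tampered", pin))   # False

# The limit: a pin taken from the wrong package verifies just as happily.
wrong_pin = sha256_hex(squatted_artifact)
print(matches_pin(squatted_artifact, wrong_pin))       # True
```

The last line is the point: a hash proves that the file did not change since it was pinned, not that the pinned name was the right one or that the code is trustworthy.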
Helping pypi admins to detect squatting occurrences
See: github .com/pypa/warehouse/issues/4998
And more generally: github .com/python/request-for/blob/master/2019-Q4-PyPI/RFP.md#milestone-2—systems-for-automated-detection-of-malicious-uploads
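As an illustration of what such a detection aid could look like, here is a minimal sketch that flags a newly registered name when it is within one edit of a popular name. It is not taken from the linked issue or RFP: the function names, the list of popular names and the threshold are assumptions. Only the name normalization follows the rule from PEP 503.

```python
import re


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete ca
                           cur[j - 1] + 1,       # insert cb
                           prev[j - 1] + cost))  # substitute or match
        prev = cur
    return prev[-1]


def lookalikes(new_name, popular_names, max_distance=1):
    """Return the popular names that new_name could be mistaken for."""
    new = normalize(new_name)
    hits = []
    for known in popular_names:
        k = normalize(known)
        if new != k and edit_distance(new, k) <= max_distance:
            hits.append(known)
    return hits


POPULAR = ["requests", "numpy", "flask"]  # hypothetical watch list
print(lookalikes("request", POPULAR))     # ['requests']
print(lookalikes("requests", POPULAR))    # []
```

A hit would only be a hint for a human admin to look at, since plenty of legitimate names are one letter away from each other.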
Make a policy avoiding convenience typosquatting
If I understood correctly, the request package was a non-nasty package for a long time. The owner deleted it, and only after that was the name squatted.
Imho, even convenience typosquatting is a bad idea. Here I see two cases:
- If the typosquatting is done by the same owner as the original package, it is still probably a bad idea but quite safe.
- Otherwise it should be forbidden by policy, enforced by rejecting the creation. Unfortunately, an informal convenience hack made by an isolated person is very likely to be abandoned and, in the long term, to leave the name in nasty hands.
Different owner means different package
Note: I didn’t test the current behavior with respect to this.
Imagine a package owner abandons it. If another owner makes a new release, the old release should not, by default, be upgradable to the new one. Maybe the package’s html page and any published description should also be prefixed with a notice explaining the name recycling.
For legitimate changes of owner, this behavior could be bypassed through a transfer procedure (similar to dns name transfers) and/or by using multiple owners for a package (who can then be added or removed dynamically).
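The rule above can be sketched as a small check. This is not how pip or warehouse behave today (as noted, that was not tested); the `Release` record and its fields are hypothetical and only serve to show the decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """Hypothetical record of who was allowed to publish a given release."""
    version: str
    owners: frozenset
    # Previous owners who explicitly approved a transfer to the current ones.
    transferred_from: frozenset = frozenset()


def is_upgrade_allowed(installed: Release, candidate: Release) -> bool:
    """Allow the upgrade only if ownership is continuous between releases."""
    if installed.owners & candidate.owners:
        return True  # at least one owner in common (multiple-owners case)
    # No owner in common: require a transfer approved by a previous owner.
    return bool(installed.owners & candidate.transferred_from)


old = Release("1.0", frozenset({"alice"}))
shared = Release("1.1", frozenset({"alice", "bob"}))
handed_over = Release("2.0", frozenset({"carol"}), frozenset({"alice"}))
recycled = Release("3.0", frozenset({"mallory"}))

print(is_upgrade_allowed(old, shared))       # True
print(is_upgrade_allowed(old, handed_over))  # True
print(is_upgrade_allowed(old, recycled))     # False
```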
Use redemption delays
After being abandoned, a package name should be blocked for some time. This would not protect future new installations, but it should protect against pip upgrades (assuming pip allows upgrades between different owners) and against users who may not have noticed that the package’s owner and/or purpose has changed.
Again, for legitimate changes of owner, this behavior could be bypassed through a transfer procedure and/or multiple owners.
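A minimal sketch of such a redemption delay, assuming the index records the date on which a name was abandoned. The one-year value and the function name are arbitrary assumptions, not an existing pypi policy.

```python
from datetime import date, timedelta

REDEMPTION_DELAY = timedelta(days=365)  # arbitrary value, for illustration


def can_register(abandoned_on, today, approved_transfer=False,
                 delay=REDEMPTION_DELAY):
    """Decide whether a new owner may take over a project name.

    abandoned_on is None when the name was never used before.
    """
    if abandoned_on is None or approved_transfer:
        return True
    return today >= abandoned_on + delay


abandoned = date(2020, 1, 1)
print(can_register(abandoned, date(2020, 6, 1)))                          # False
print(can_register(abandoned, date(2021, 6, 1)))                          # True
print(can_register(abandoned, date(2020, 6, 1), approved_transfer=True))  # True
```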
Improving the trustworthiness of pypi uploaders
Pypi isn’t the first open community in the free software movement. As an example, Debian is an open community involving thousands of people around the world. To my knowledge, there are almost no occurrences of nasty software installed from Debian’s repos. I could say the same about Fedora, Gentoo, etc. In comparison, pypi looks like a malware nest.
I am not a specialist in how Debian is organized, but I suppose there are lessons to learn from it and from other open organizations in the free software movement:
- package signing
- web of trust
- developer co-optation
- package review?
I imagine pypi cannot switch to such an organization overnight, and maybe does not wish to make trust/co-optation a prerequisite for the right to publish.
Still, one can imagine pypi progressively adopting package signing and owner trust as good practices to be enforced. As a first step, trusting owners could rely not only on cryptographic co-optation but also on a score computed from chosen parameters (packages published and their popularity, seniority, team memberships, collaboration distance to trusted owners, etc.). Pip could have a default option --enable-untrusted which could, after some years, become --only-trusted.
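To give an idea of what such a scoring could look like, here is a minimal sketch that combines the parameters listed above into a number between 0 and 1. The weights, the caps and the threshold are invented for the example and would need real discussion; nothing here corresponds to an existing pypi or pip feature.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OwnerProfile:
    """Hypothetical inputs for the score, one field per parameter."""
    packages_published: int
    total_downloads: int
    account_age_days: int
    team_memberships: int
    hops_to_trusted_owner: Optional[int]  # None: no known link


# Invented weights (they sum to 1) and caps, for illustration only.
WEIGHTS = {"packages": 0.2, "popularity": 0.2, "seniority": 0.3,
           "teams": 0.1, "proximity": 0.2}


def _capped(value, full_marks):
    """Scale a raw value to [0, 1], giving full marks at full_marks."""
    return min(value / full_marks, 1.0)


def trust_score(p: OwnerProfile) -> float:
    """Weighted sum of the capped parameters."""
    hops = p.hops_to_trusted_owner
    parts = {
        "packages": _capped(p.packages_published, 10),
        "popularity": _capped(p.total_downloads, 1_000_000),
        "seniority": _capped(p.account_age_days, 3 * 365),
        "teams": _capped(p.team_memberships, 3),
        "proximity": 0.0 if hops is None else 1.0 / (1 + hops),
    }
    return sum(WEIGHTS[key] * value for key, value in parts.items())


def is_trusted(p: OwnerProfile, threshold: float = 0.5) -> bool:
    return trust_score(p) >= threshold


newcomer = OwnerProfile(1, 50, 3, 0, None)
veteran = OwnerProfile(25, 5_000_000, 4000, 2, 1)
print(is_trusted(newcomer), is_trusted(veteran))  # False True
```

A score like this can be gamed (fake downloads, aged accounts), which is one reason to see it only as a first step next to signing and co-optation.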
The future is open.