I’ve read about the different “typo-squatting” attacks against PyPI, and was wondering if a delay between “registration of new package name” and “pip installable” could help?
I understand that a delay of a day or two in publishing a new package name might annoy some people, but humans need to sleep and have days off, and such a buffer might make things easier for everyone involved in alerting on and taking action against malicious typosquatting attacks.
There is no current mechanism that I’m aware of for this.
PyPI does have a separate, unrelated mechanism for limiting “similar” package names: if a new package has an “ultranormalized” form that matches that of any pre-existing package, it gets rejected. In practice, this means that you can’t upload py-fo0 if py-foo already exists.
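To make the “ultranormalized” comparison concrete, here is a rough sketch of how such a check could work. The exact rules PyPI applies may differ; this version assumes the normalization lowercases the name, strips `.`, `_` and `-`, and folds a few look-alike characters together. The function names are made up for illustration.

```python
import re


def ultranormalize(name: str) -> str:
    """Collapse a project name to a form where look-alike names collide.

    Assumed rules (illustrative, not necessarily PyPI's exact ones):
    lowercase, drop separators, fold l/i into 1 and o into 0.
    """
    folded = name.lower()
    folded = re.sub(r"[._-]", "", folded)   # py-foo, py_foo, py.foo -> pyfoo
    folded = re.sub(r"[li]", "1", folded)   # l and i look like 1
    folded = folded.replace("o", "0")       # o looks like 0
    return folded


def conflicts_with_existing(new_name: str, existing_names: list[str]) -> bool:
    """Return True if new_name collides with any existing name after folding."""
    target = ultranormalize(new_name)
    return any(ultranormalize(existing) == target for existing in existing_names)


if __name__ == "__main__":
    # py-fo0 and py-foo fold to the same string, so the upload would be rejected.
    print(conflicts_with_existing("py-fo0", ["py-foo"]))
```

Note that a check like this only catches look-alike spellings; a typosquat with a dropped or swapped letter produces a different folded form and would pass.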
Delaying package visibility has come up a few times before (mostly in the context of allowing package maintainers to do “atomic” uploads or undo erroneous publishes without having to yank), and I think there’s general consensus that it could be useful. But nobody (AFAIK) has done the work to propose a design or actually implement it.
That being said, I’m curious how you think a visibility delay helps with the typosquatting problem: I would expect most typosquatting to be opportunistic anyway, so the attacker can afford to be patient. Are you thinking that a delay would give the index’s admins more time to review changes? This would be true, but I suspect the volume of new packages would overwhelm any manual review efforts. It’d need to be an automated process, at which point the security value of a delay is somewhat limited.
I got the impression from previous blog posts and incident reports like https://status.python.org/incidents/dc9zsqzrs0bv that sometimes these events happen in a very short period at opportune times (such as weekends), making it more difficult for the PyPI admins involved.
A buffer might not help with a single malicious typo-squat package, but may help in cases where hundreds or thousands of packages are attempted at once.
But I may not be understanding the scale of valid new package names being added every day!
They do, but I’m not sure (that’s not a negative, it’s genuine ignorance!) whether those primarily leverage typosquatting or whether it’s something else.
I think my underlying concern is whether a package delay actually improves the remediation process, or just shuffles it around a bit – I could see it just as easily becoming a burden or DoS vector, where an attacker holds up the release of legitimate packages by filling the queue with spam.