PyPI typo-squatting: Is there any mechanism to delay new package name visibility?

ssweber · April 3, 2024, 3:50pm

I’ve read about the different “typo-squatting” attacks against PyPI, and was wondering whether a delay between “registration of new package name” and “pip installable” could help?

I understand that a delay of a day or two in publishing a new package name might annoy some people, but humans need to sleep and have days off, and such a buffer might make things easier for everyone involved in alerting on and taking action against malicious typo-squatting attacks.

Disregard if there is already such a mechanism!

apalala · April 3, 2024, 7:18pm

My understanding is that there’s an initiative to add namespaces to PyPI, so instead of:

pip install alibrary

It would be:

pip install trustednamespace.alibrary

Or the likes.

bschubert · April 3, 2024, 7:30pm

Some related threads:

woodruffw · April 4, 2024, 2:35am

There is no current mechanism that I’m aware of for this.

PyPI does have a separate, unrelated, mechanism for limiting “similar” package names: if a new package has an “ultranormalized” form that matches any other pre-existing package, it gets rejected. In practice, this means that you can’t upload py-fo0 if py-foo already exists.
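
To make the py-fo0 / py-foo example concrete, here is a minimal sketch of what such a check could look like. The exact rules are an assumption on my part (strip separators, lowercase, fold look-alike characters), and the function names are hypothetical; this is not a copy of PyPI's actual implementation.

```python
import re


def ultranormalize(name: str) -> str:
    """Collapse a project name to a look-alike-insensitive key (assumed rules)."""
    collapsed = re.sub(r"[._-]", "", name.lower())  # drop separators
    collapsed = re.sub(r"[li]", "1", collapsed)     # l and i look like 1
    return collapsed.replace("o", "0")              # o looks like 0


def is_too_similar(candidate: str, existing_names) -> bool:
    """True if the candidate collides with any existing name after folding."""
    existing_keys = {ultranormalize(n) for n in existing_names}
    return ultranormalize(candidate) in existing_keys


print(is_too_similar("py-fo0", ["py-foo"]))
print(is_too_similar("requests", ["py-foo"]))
```

Under these assumed rules, py-fo0 and py-foo fold to the same key, so the first check reports a collision and the second does not. Note that this only catches character look-alikes, not transpositions or dropped letters.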

Delaying package visibility has come up a few times before (mostly in the context of allowing package maintainers to do “atomic” uploads or undo erroneous publishes without having to yank), and I think there’s general consensus that it could be useful. But nobody (AFAIK) has done the work to propose a design or actually implement it.

That being said, I’m curious how you think visibility delay helps with the typosquatting problem: I would expect most typosquatting to be opportunistic anyway, so the attacker can afford to be patient. Are you thinking that a delay would give the index’s admins more time to review changes? This would be true, but I suspect the volume of new packages would overwhelm any manual review efforts. It’d need to be an automated process, at which point the security value of a delay is somewhat limited.

ssweber · April 4, 2024, 10:06am

I got the impression from previous blog posts and incident reports like https://status.python.org/incidents/dc9zsqzrs0bv that sometimes these events happen in a very short period at opportune times (such as weekends), making it more difficult for the PyPI admins involved.

A buffer might not help against a single malicious typo-squatting package, but it may help in cases where hundreds or thousands of packages are attempted at once.
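
A rough sketch of the idea, to show what the buffer would buy: while names sit in a holding window, a burst from one uploader can be flagged before anything becomes installable. Everything here is hypothetical (the 48-hour window, the threshold of 100, the tuple layout, the function name), and grouping by uploader is a simplification, since a real attacker could spread registrations across many accounts.

```python
from collections import Counter
from datetime import datetime, timedelta

HOLD = timedelta(hours=48)  # assumed holding window ("a day or two")
BURST_THRESHOLD = 100       # assumed cutoff for a suspicious burst


def split_pending(pending, now):
    """pending: list of (name, uploader, registered_at) tuples.

    Returns (releasable, flagged): names whose window has elapsed, and
    names from uploaders with a burst still inside the window.
    """
    held = [p for p in pending if now - p[2] < HOLD]
    per_uploader = Counter(uploader for _, uploader, _ in held)
    suspicious = {u for u, n in per_uploader.items() if n >= BURST_THRESHOLD}
    releasable = [
        name for name, uploader, at in pending
        if now - at >= HOLD and uploader not in suspicious
    ]
    flagged = [name for name, uploader, _ in pending if uploader in suspicious]
    return releasable, flagged


now = datetime(2024, 4, 6, 12, 0)
pending = [("good-lib", "alice", now - timedelta(days=3))]
pending += [(f"spam-{i}", "mallory", now - timedelta(hours=1)) for i in range(150)]
releasable, flagged = split_pending(pending, now)
print(releasable, len(flagged))
```

In this toy run the older legitimate name is released, while the 150 names registered an hour ago by one account are held back for review.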

But I may not be understanding the scale of valid new package names being added every day!

woodruffw · April 4, 2024, 8:25pm

They do, but I’m not sure (that’s not a negative, it’s genuine ignorance!) whether those primarily leverage typosquatting or whether it’s something else.

I think my underlying concern is whether a package delay actually improves the remediation process, or just shuffles it around a bit – I could see it just as easily becoming a burden or DoS vector, where an attacker holds up the release of legitimate packages by filling the queue with spam.
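
As a back-of-the-envelope sketch of that DoS concern: if release depends on a review step with bounded throughput, a legitimate package's wait grows with whatever is queued ahead of it. The numbers and the function name below are made up for illustration, not measurements of PyPI.

```python
import math


def days_until_release(position_in_queue, reviews_per_day, min_delay_days=2):
    """Wait for a package, given its queue position and review capacity.

    The package is released once the minimum delay has passed AND every
    item ahead of it (and itself) has been reviewed.
    """
    review_days = math.ceil(position_in_queue / reviews_per_day)
    return max(min_delay_days, review_days)


# Hypothetical quiet day: 50th in the queue, 500 reviews per day.
print(days_until_release(50, 500))
# Same package with 10,000 spam registrations queued ahead of it.
print(days_until_release(10_050, 500))
```

With these made-up figures the wait goes from the 2-day minimum to 21 days, which is the "holding up legitimate packages" failure mode. A purely time-based delay with no review gate would not have this problem, but then it also would not add a review.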