PyPI security work: multifactor auth progress & help needed

Hi, Python packaging colleagues! As Ernest blogged last week, a team has kicked off work on improving Warehouse security, accessibility, and internationalization. See the blog post & links for more details and who’s working on what, but our first milestone is:

  • Support for two-factor authentication via TOTP and U2F/FIDO.
  • Application-specific tokens scoped to individual users/projects (this will also cover adding token-based login support to twine and setuptools)
  • Advanced audit trail of user actions beyond the current journal (allowing publishers to track all actions taken by third party services on their behalf).
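
For readers unfamiliar with TOTP, here is a minimal sketch of how a time-based one-time code is derived and checked, following RFC 6238 and RFC 4226 with only the Python standard library. This is an illustration, not Warehouse's implementation; the function names, the 30-second step, and the one-step drift window are assumptions made for the example.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of the last byte
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, unix_time: float, step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over the time-step counter."""
    return hotp(secret, int(unix_time // step), digits)


def verify_totp(secret: bytes, submitted: str, unix_time: float,
                step: int = 30, window: int = 1) -> bool:
    """Accept a code from the current step or `window` steps either side (assumed policy)."""
    current = int(unix_time // step)
    for drift in range(-window, window + 1):
        counter = current + drift
        if counter < 0:
            continue  # no time step exists before the epoch
        expected = hotp(secret, counter)
        # Constant-time comparison avoids leaking how many digits matched.
        if hmac.compare_digest(expected.encode(), submitted.encode()):
            return True
    return False
```

A real deployment would also need rate limiting and protection against reusing a code within its validity window, which this sketch leaves out.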

As project manager, I’ll be sending progress reports about twice a month, and posting meeting notes on the wiki.

Engineer William Woodruff of Trail of Bits is working on TOTP support. UX designer and developer Nicole Harris is reviewing that work, working on relevant help text, and developing the user experience that multi-factor auth and our other objectives will require.

And today a few of us discussed several open issues. If you’d like to help out, we’d love volunteer help with:

Want to help? Check out Warehouse’s developer environment setup docs and tell us if you have trouble getting started!

And please tell me if you’re planning to join us at sprints at PyCon North America, May 6th-9th, so we can plan tasks.

More next month, including some schedule estimates.

Thanks to the Open Technology Fund for funding this work!

Sumana Harihareswara, Warehouse project manager


I’d be particularly interested to hear if folks have thoughts about how to do this in an easy, secure way. We generally try to avoid doing introspection into the individual distributions, but it seems this would require at least some interaction with the file contents.

I don’t want to scope creep, but this could also be extended to make sure .whl files are actually .zip archives as well, with a check that the mandatory files are present.
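
A rough sketch of what such a check could look like, using only the standard library. This is not the code Warehouse runs; the function name `looks_like_wheel` is hypothetical, and the list of mandatory files (METADATA, WHEEL, and RECORD inside a single `.dist-info` directory) is taken from the wheel format in PEP 427.

```python
import io
import zipfile


def looks_like_wheel(data: bytes) -> bool:
    """Return True if `data` is a readable zip holding the wheel metadata files."""
    buf = io.BytesIO(data)
    if not zipfile.is_zipfile(buf):
        return False
    try:
        with zipfile.ZipFile(buf) as zf:
            names = zf.namelist()
    except zipfile.BadZipFile:
        return False
    # Collect top-level entries whose name ends in ".dist-info".
    tops = {name.split("/", 1)[0] for name in names}
    dist_info = {top for top in tops if top.endswith(".dist-info")}
    if len(dist_info) != 1:
        return False
    prefix = dist_info.pop()
    required = {f"{prefix}/{fname}" for fname in ("METADATA", "WHEEL", "RECORD")}
    return required.issubset(names)
```

Only the archive's file listing is inspected here; nothing is extracted to disk.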


We already do that: https://github.com/pypa/warehouse/blob/774e9021171c6ec6572ae6bc0430026721a18039/warehouse/forklift/legacy.py#L616-L631


More scope creep: can we just upload zip files instead of having two different archive types, .tar.gz for sdists and .zip (via .whl) for wheels? :wink: (I know this isn’t going to happen, but I can dream …).


The code already pokes inside zip/whl/egg files as you’ve noticed (zip contents validation). PR created, but I do agree with Brett, long term we should pick a single archive format and only support that on PyPI. An interim period where we auto-convert from the other formats to the desired one would make sense.

If we care about storage and network bandwidth costs, .zip and .tar.gz are both less than ideal. But a single format is a larger conversation. #dream

There were some long threads about file formats on PyPI back in 2016:

That eventually led to PEP 527. The biggest change in PEP 527 was to slim down PyPI’s support matrix from a dozen+ file formats to just 3: wheel, .tar.gz sdist, and .zip sdist. There was a lot of debate about whether to reduce that further to just one format of sdist, and if so, whether it should be .tar.gz or .zip. @dstufft’s original PEP draft made .tar.gz the only supported format, IIRC based on it being by far the most common in actual usage, but it looks like the final draft relaxed that.

Then in PEP 517, we needed to specify a generic interface for producing sdists, and we made it use .tar.gz, basically because that’s what @dstufft liked better: https://www.python.org/dev/peps/pep-0517/#build-sdist

I think our options going forward are:

  • Continue to support both .zip and .tar.gz sdists on PyPI (status quo).
  • Drop support for .zip sdists. This would be pretty simple: we’d stop allowing .zip uploads, and that’s basically it. The downside is that we’d be committed to handling both .zip and .tar.gz everywhere forever, for wheels and sdists respectively. This obviously isn’t a prohibitive cost (we do it now), but it is some cost.
  • Drop support for .tar.gz sdists. This would be more involved: we’d have to first update PEP 517, and change setuptools to default to generating .zip sdists. (And ideally make the same change to other build backends, like distutils and flit.) And after that was deployed for a bit and people adapted to .zip sdists being common, then we could stop allowing .tar.gz uploads. The upside would be that .zip is somewhat easier to work with programmatically (since it supports random access), and that it matches wheels. And the downside of course is the transition costs.
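
To make the random-access point concrete, here is a small sketch comparing how one file is read out of each archive type with the standard library. The helper names are made up for this example.

```python
import io
import tarfile
import zipfile


def read_from_zip(archive: bytes, member: str) -> bytes:
    """A zip has a central directory, so one member can be located and read directly."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        return zf.read(member)


def read_from_targz(archive: bytes, member: str) -> bytes:
    """A .tar.gz is a single gzip stream: reaching a member means
    decompressing everything that precedes it."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tf:
        extracted = tf.extractfile(member)
        if extracted is None:  # directories and other non-regular entries
            raise KeyError(member)
        return extracted.read()
```

Both helpers return the same bytes for the same member; the difference is in how much of the archive has to be processed to get there.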

(It’s an interesting side-effect of using discourse, that suddenly these conversations are drawing in folks who didn’t follow this history!)


I made a PR for this. I don’t have anything beyond normal-peon access to the warehouse repository so someone else will have to assign/review/merge if desired. This was my first time touching the warehouse codebase. “such docker. oh my.”

Thank you @gpshead for your PR!

Hi, Python packaging colleagues!

We continue to work towards our first goal: support for two-factor authentication on PyPI via TOTP and U2F/FIDO. William and Nicole are continuing their development and design work as I mentioned in the last update, with additional work by Mark Mossberg at Trail of Bits, plus Ernest, Dustin, and Donald advising and reviewing.

We are working out our rollout plans for multifactor auth, and so we don’t yet have an estimate for when we’ll deliver that and when we’ll start the API keys or audit trail work. But the existing work-in-progress PR for MFA is ready for you to try out and play with now, and we’ll have more for you to try out next month at the PyCon sprints.

Want to help?

Thank you @gpshead for your PR to validate whether uploaded packages ending in tar.gz are actually tarballs!
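
For anyone curious what this kind of validation involves, here is a minimal sketch using the standard library's `tarfile` module. It is not the code from the PR; the function name `is_valid_targz` and the choice to walk every member header are assumptions made for the example.

```python
import io
import tarfile
import zlib


def is_valid_targz(data: bytes) -> bool:
    """Return True if `data` opens as a gzip-compressed tar archive and lists cleanly."""
    try:
        with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tf:
            # Walk every header so truncated or corrupt archives are caught too.
            for _member in tf:
                pass
    except (tarfile.TarError, OSError, EOFError, zlib.error):
        return False
    return True
```

Nothing is extracted to disk; the check only reads the archive's headers from memory.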

And please speak up in this topic if you’re planning to come to sprints at PyCon North America, May 6th-9th, so we can plan tasks.

We’ll send another progress report around mid-month. That’s also when PSF aims to announce another Request For Information for Warehouse security improvements: “highly requested security features in PyPI such as cryptographic signing and verification of files uploaded and installed from the index” (possibly using TUF).

Thanks to the Open Technology Fund for funding this work!

Sumana Harihareswara, Warehouse project manager


Installation and setup was a breeze. I’ll be there for four days of sprints and would love to contribute. I am the architect and former security lead on a Python-based MFA API. I hope I’ll be able to lend some SME help along with knocking out tickets. :monkey_face:


How does the plan to implement TUF (and maybe WebAuthn) dovetail with this work?

What issues need more attention from volunteers with which skills?


What’s the GH Issue # for the API keys work; is the system being designed so that I can create a per-package key (so that I’m not delegating all privileges to my CI builds and CI build systems)?

We’re excited to help with the encryption support aspect whatever is decided. Just point us at the best way to dive in. I can’t personally make it but I can check to see if someone from our team can attend PyCon if being there in person would be a major help…


Yep, same here, ready to help with the PyPI + TUF effort, just let me know.

Cc @pradyunsg


Lukas from the TUF team at NYU here. I will be at the PyCon sprints next week, and I’m looking forward to discussing and working on PyPI + TUF integration.


Summary: Work continues on Milestone 1, Security Feature Development, and specifically on the Multi-Factor Authentication task. TOTP-based 2FA is about to roll out for everyone, and we’re working on WebAuthn (e.g., YubiKeys).

In April and in the first week of May, the team finalized the backend and user experience for 2FA and planned and started user tests. On May 2nd, we began advertising the test to our users and requesting feedback. That test has already found issues which we have fixed or prioritized to fix later.

We planned for the test to go till May 20th. We decided to require email verification before activating 2FA, and that’s underway. Once we finish that, we’ll be turning on the optional 2FA login feature for current and future accounts on pypi.org. (It’s already on for most existing accounts on Test PyPI, and we’ll turn it on for all current and future accounts there, too.) There are some UI issues that we should fix in the medium term, but I’ve decided it’s ok to roll out the feature and end the beta before fixing those.

Thanks to everyone we spoke with at the PyCon sprints and who worked on Warehouse and other packaging projects, including by testing two-factor auth, learning to package for the first time, and reviewing open pull requests! And thanks to volunteer contributors lukpueh, MattIPv4, hugovk, vinayak-mehta, HonzaKral, alex, alexwlchan, ppiyakk2, ofek, theb10n707, jamadden, ALDamico, and DavidBord for pull requests!

Our backend development contractors, in particular William Woodruff but also his colleagues at Trail of Bits, finished their TOTP-based multi-factor authentication pull request and responded to reviews from community maintainers, who approved and merged it. They then began work on WebAuthn-based multi-factor authentication, which is in progress and will let you use, for instance, YubiKeys for your second factor. So far that has included a fix to an upstream library – thank you, Duo!

The frontend and UX contractor, Nicole Harris, finalized the review of the user experience for TOTP-based multi-factor authentication, started to define PyPI’s manual account recovery process, and is working on improving the WebAuthn authentication and provisioning user experience.

The project manager (me) also ran sprints at PyCon (see details at https://wiki.python.org/psf/PackagingSprints and http://bit.ly/pypa2019 ), and expanded our test period publicity in mid-May to ensure we reach users in important technological categories (e.g., those who have slow internet connections).

In case you’re curious what issues and bugs we’ve found so far, check out some examples:

Next steps! Check out the OTF security work milestone on GitHub.

  1. Finish WebAuthn.
  2. API keys, including scoping for users & projects. Heads up @westurner. :slight_smile:
  3. Then that will make adding audit trails/logs easier (reusing scoping and what any given token is being used for).
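
To illustrate how token scoping and audit logging could fit together, here is a purely hypothetical sketch. None of these names (`ApiToken`, `AuditLog`, `authorize_upload`) come from Warehouse, and the actual design may differ substantially; the point is only that a token carrying its own scope makes each logged action attributable.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class ApiToken:
    token_id: str
    user: str
    project: Optional[str] = None  # None means "any project this user owns"


@dataclass
class AuditLog:
    entries: List[Tuple[str, str, str, str, bool]] = field(default_factory=list)

    def record(self, token: ApiToken, action: str, project: str, allowed: bool) -> None:
        # Every attempt is logged, including refused ones.
        self.entries.append((token.token_id, token.user, action, project, allowed))


def authorize_upload(token: ApiToken, project: str,
                     owners: Dict[str, Set[str]], log: AuditLog) -> bool:
    """Allow an upload only if the user owns the project and the token's scope covers it."""
    owns = token.user in owners.get(project, set())
    in_scope = token.project is None or token.project == project
    allowed = owns and in_scope
    log.record(token, "upload", project, allowed)
    return allowed
```

With this shape, a CI system could hold a token limited to one project, and the log would show which token performed each action.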

And we’ll probably be able to parallelize a bit and have Nicole start on Milestone 2 (Accessibility and internationalization development) before we’re quite finished shipping all 3 of those.

As a reminder, TUF and cryptographic signing are NOT in scope for this current project, and will only start after we’re done with the current project. The TUF GitHub issue and PyCon sprint notes are a good place to comment if you want to talk about TUF!

We’re looking forward to continuing to ship components of our project as we progress. And, as always, you can read our notes at https://wiki.python.org/psf/PackagingWG .

Thanks again to the Open Technology Fund for making this work possible!


Thanks for implementing 2FA! Is it possible to add more than one TOTP app, or to have some kind of recovery codes? I’m worried that losing my phone would lock me out of my account.

Hi, @jks – thanks for your questions!

We haven’t yet implemented recovery codes – here’s the GitHub issue for that – and it’s something we plan to do, but it’s not as urgent as making progress on WebAuthn and the other OTF-funded milestones. We’re trying to maximize how much progress we make on core features (WebAuthn, API keys, the audit log, and so on) using the OTF funds.

If you lose your phone, just like if you lose/forget your password, you can ask for a manual account reset/recovery. And that is a reason why we’ve mandated that you have to verify an email address on your account before you can turn on 2FA. :slight_smile: We’re also polishing our manual account recovery policy – once we’ve firmed it up more, we’ll put that in the FAQ, and once we have recovery codes implemented, that’ll be in the FAQ as well.


Ok, good to hear that there is a recovery process. In that case I completely agree that the other features are more urgent.
