I have a bit of uneasiness about how this is heavily pushing GitHub Actions as the correct way to publish to PyPI. I had to check PEP 740 to make sure it was not directly supported by Microsoft.
> The generation and publication of attestations happens by default, and no changes are necessary for projects that meet all of these conditions: publish from GitHub Actions; via Trusted Publishing; and use the pypa/gh-action-pypi-publish action to publish.
If you then click on "The manual way" it adds a big disclaimer:
> STOP! You probably don't need this section; it exists only to provide some internal details about how attestation generation and uploading work. If you're an ordinary user, it is strongly recommended that you use one of the official workflows described above.
Where the only official workflow is "Use GitHub Actions".
I guess I am an idealist but as a maintainer this falls short of my expectations for the openness of Python and PyPI.
> Where the only official workflow is "Use GitHub Actions".
The standard behind this (PEP 740) supports anything that can be used with Trusted Publishing[1]. That includes GitLab, Google Cloud, ActiveState, and can include any other OIDC IdP if people make a good case for including it.
It's not tied to Microsoft or GitHub in any particular way. The only reason it emphasizes GitHub Actions is because that's where the overwhelming majority of automatic publishing traffic comes from, and because it follows a similar enablement pattern as Trusted Publishing did (where we did GitHub first, followed by GitLab and other providers).
I get that, that's why I didn't go "This is Embrace Extend Extinguish", but as constructive feedback I would recommend softening the language and replacing:
Perhaps also add a few of the providers you listed as well?
> The only reason it emphasizes GitHub Actions is because that's where the overwhelming majority of automatic publishing traffic comes from
GitHub being popular is a self-reinforcing process: if GitHub is your first-class citizen for something as crucial as trusted publishing, then projects on GitHub will see higher adoption and become the de-facto "secure choice".
That’s not how any of this works. I am begging you to re-read the docs and understand that this does not require anybody to use GitHub, much less GitHub Actions, much less Trusted Publishing, much less attestations.
You can still, and will always be able to use API tokens.
The gp was pointing out that the docs heavily emphasise (& therefore encourage) GHA usage & suggested language changes.
If people are confused about what they need to use Trusted Publishing & you're suggesting (begging) a re-read as the solution, that seems evidence enough that the gp is correct about the docs needing a reword.
It could just as easily imply that people aren't paying attention when they read it. Inability to understand a text is not always on the author, plenty of times it's on the reader.
Well, yes and no. From the perspective of an infosec professional who is focussed on supply chain security I can tell you that your package having an attestation from a trusted platform like GitHub or GitLab gives me a warm feeling. It is not the only thing we will look at but definitely part of a longer list of checks to understand risk around dependencies.
With an attestation from GitHub I know at least that the workflow that ran it and the artifacts it produced will be 100% traceable and verifiable. This doesn't mean the code was not malicious, but for example it will rule out that someone did the build at home and attached an alternative version of an artifact to a GitHub release, like how that was done with the xz project.
It is fine to not like GitHub, but I think that means we need more trusted builders. Developers cannot be pushed toward just GitHub.
Thank you for being patient with people that seem to have willfully not read any of the docs or your clarifying comments here, are saying you are lying, and/or are making up hypothetical situations. It's appreciated!
Edit: woodruffw is sitting here and thoughtfully commenting and answering people despite how hostile some of the comments are (someone even said "This is probably deserving a criminal investigation"! and it has more upvotes than this comment). I think that should be appreciated, even if you don't like Python.
I know attestations are not mandatory, but the rug has already been pulled: PEP 740 distinguishes "good" and "bad" packages and "good" packages require GitHub.
Attestations are worthless unless they're checked. I have no doubt they'll eventually become the default in pip which effectively makes them mandatory for 99% of people not willing to jump through the hoops of installing an unattested package.
The "hoops", which will only increase in the future, make GitHub-dependent attested packages privileged and give GitHub (and maybe, in the future, other inappropriate entities) significant power over open source Python packages.
It does nothing of the sort, and the current GitHub requirement is an explicitly temporary restriction, like it was for Trusted Publishing. Again: I am begging you to read the docs that we’ve compiled for this.
> the current GitHub requirement is an explicitly temporary restriction
It seems reasonable to suggest that advertising a solution for public use at a point in time when support is at <2 systems is not an ideal way to encourage an open ecosystem.
It’s an eminently ideal way, given that the overwhelming majority of Python packages come from GitHub. It would be unreasonable to withhold an optional feature just because that optional feature is not universal yet.
Again, I need people to let this sink in: Trusted Publishing is not tied to GitHub. You can use Trusted Publishing with GitLab, and other providers too. You are not required to produce attestations, even if you use Trusted Publishing. Existing GitLab workflows that do Trusted Publishing are not changed or broken in any way by this feature, and will be given attestation support in the near future. This is all documented.
Let this sink in: a "security" feature that depends on Trusted Publishing providers puts the developer at the mercy of a small set of Trusted Publishing providers, and for most people none of them are acceptable feudal lords.
Let this sink in: if it is possible to check attestations, attestations will be checked and required by users, and PyPI packages without them will be used less. Whether PyPI requires attestations is unimportant.
> PyPI packages without them will be used less. Whether PyPI requires attestations is unimportant.
This points to a basic contradiction in how people are approaching open source: if you want your project to be popular on a massive scale, then you should expect those people to have opinions about how you're producing that project. That should not come as a surprise.
If, on the other hand, you want to run a project that isn't seeking popularity, then you have a reasonable expectation that people won't ask you for these things, and you shouldn't expect your packages to be downloaded from PyPI as much as possible. When people do bug you for those things, explicitly rejecting them is (1) acceptable, and (2) should reduce the relative popularity of your project.
The combination of these things ("no social expectations and a high degree of implicit social trust/criticality") is incoherent and, more importantly, doesn't reflect observed behavior (people who do OSS as a hobby - like myself - do try and do the more secure things because there's a common acknowledgement of responsibility for important projects).
I don't use PyPI and only skimmed the docs. I think what you're saying here makes sense, but I also think others posting have valid concerns.
As a package consumer, I agree with what you've said. I would have a preference for packages that are built by a large, trusted provider. However, if I'm a package developer, the idea worries me a bit. I think the concerns others are raising are pragmatic because once a majority of developers start taking the easy path by choosing (ex) GitHub Actions, that becomes the de-facto standard and your options as a developer are to participate or be left out.
The problem for me is that I've seen the same scenario play out many times. No one is "forced" to use the options controlled by corporate interests, but that's where all the development effort is allocated and, as time goes on, the open source and independent options will simply disappear due to the waning popularity that's caused by being more complex than the easier, corporate-backed options.
At that point, you're picking platform winners because distribution by any other means becomes untenable or, even worse, forbidden if you decide that only attested packages are trustworthy and drop support for other means of publishing. Those platforms will end up with enormous control over what type of development is allowed. We have good examples of how it's bad for both developers and consumers too. Apple's App Store is the obvious one, but uBlock Origin is even better. In my opinion, Google changed their platform (Chrome) to break ad blockers.
I worry that future maintainers aren't guaranteed to share your ideals. How open is OpenSolaris these days? MySQL? OpenOffice?
I think the development community would end up in a much stronger position if all of these systems started with an option for self-hosted, domain-based attestations. What's more trustworthy in your mind: 1) this package was built and published by ublockorigin.com, or 2) this package was built and published by GitHub Actions?
Can an impersonator gain trust by publishing via GitHub Actions? What do the uninformed masses trust more? 1) an un-attested package from gorhill/uBlock, which is a user without a verified URL, etc., or 2) an attested package from ublockofficial/ublockofficial, which could be set up as an organization with ublockofficial.com as a verified URL?
I know uBlock Origin has nothing to do with PyPI, but it's the best example to make my point. The point being that attesting to a build tool-chain that happens to be run by a non-verifying identity provider doesn't solve all the problems related to identity, impersonation, etc. At worst, it provides a false sense of trust because an attested package sounds like it's trustworthy, but it doesn't do anything to verify the trustworthiness of the source, does it?
I guess I think the term "Trusted Publisher" is wrong. Who's the publisher of uBlock Origin? Is it GitHub Actions or gorhill or Raymond Hill or ublockorigin.com? As a user, I would prefer to see an attestation from ublockorigin.com if I'm concerned about trustworthiness and only get to see one attestation. I know who that is, I trust them, and I don't care as much about the technology they're using behind the scenes to deliver the product because they have a proven track record of being trustworthy.
That said, I do agree with your point about gaining popularity and compromises that developers without an existing reputation may need to make. In those cases, I like the idea of having the option of getting a platform attestation since it adds some trustworthiness to the supply chain, but I don't think it should be labelled as more than that and I think it works better as one of many attestations where additional attestations could be used to provide better guarantees around identity.
Skimming the provenance link [1] in the docs, it says:
> It’s the verifiable information about software artifacts describing where, when and how something was produced.
Isn't who is responsible for an artifact the most important thing? Bad actors can use the same platforms and tooling as everyone else, so, while I agree that platform attestations are useful, I don't understand how they're much more than a verified (ex) "Built using GitHub" stamp.
To be clear, I think it's useful, but I hope it doesn't get mistakenly used as a way of automatically assuming project owners are trustworthy. It's also possible I've completely misunderstood the goals since I usually do better at evaluating things if I can try them and I don't publish anything to PyPI.
Since roughly 2014 there have been so many rug pulls in the Python organization that no one believes anything any more.
It always starts small: "Oh, we just want everyone to be nice, so we have this super nice CoC document. Anyone who distrusts us is malicious."
Ten years later you have a corporate-backed inner circle that abuses the CoC to silence people, extend their power and earning potential and resort to defamation and libel.
It is possible that the people here who defend this corporate attestation-of-provenance-preferably-on-our-infrastructure scheme think that nothing nefarious will happen in the future. Well, given the repressive history of Python they are naive then.
It just takes pip to add a flag to enable "non-attested" packages. And of course they'll name it something like --allow-insecure-potentially-malicious.
They’re not encouraging open infrastructure. In fact, this whole design doesn’t even contemplate the possibility of self hosting. Trust must be blindly delegated to one of the existing partners.
> but as constructive feedback I would recommend softening the language and replacing:
I can soften it, but I think you're reading it excessively negatively: that warning is there to make sure people don't try to do the fiddly, error-prone cryptographic bits if they don't need to. It's a numerical fact that most project owners don't need that section, since most are either using manual API tokens or are publishing via GitHub Actions.
> Perhaps also add a few of the providers you listed as well?
They'll be added when they're enabled. Like I said in the original comment, we're using a similar enablement pattern as happened with Trusted Publishing: GitHub was enabled first because it represents the majority of publishing traffic, followed by GitLab and the others.
> GitHub being popular is a self-reinforcing process: if GitHub is your first-class citizen for something as crucial as trusted publishing, then projects on GitHub will see higher adoption and become the de-facto "secure choice".
I agree, but I don't think this is PyPI's problem to solve. From a security perspective, PyPI should prioritize the platforms where the traffic is.
(I'll note that GitLab has been supported by Trusted Publishing for a while now, and they could make the publishing workflow more of a first class citizen, the way it is on GHA.)
> I agree, but I don't think this is PyPI's problem to solve. From a security perspective, PyPI should prioritize the platforms where the traffic is.
To me that's a bit of a weird statement; PyPI is part of the Python Software Foundation, so making sure that the project remains true to its open-source nature is reasonable?
My concern is that these types of things ultimately play out as "we are doing the right thing to limit supply chain attacks", which is good and defensible, but in ~5 years PyPI will have an announcement that they are sunsetting PyPI package upload in favor of the trusted provider system. pip (or other tooling) will add warnings whenever I install a package that is not "trusted". Maybe I am simply pessimistic.
That being said we can agree to disagree, I am not part of the PSF and I did preface my first comment with "I guess I am an idealist".
> making sure that the project remains true to its open-source nature is reasonable?
What about this, in your estimation, undermines the open-source nature of PyPI? Nothing about this is proprietary, and I can't think of any sane definition of OSS in which PyPI choosing to verify OIDC tokens from GitHub (among other IdPs!) meaningfully subverts PyPI's OSS commitment.
> PyPI package upload in favor of the trusted provider system. pip (or other tooling) will add warnings whenever I install a package that is not "trusted". Maybe I am simply pessimistic.
Let me put it this way: if PyPI disables API tokens in favor of mandatory Trusted Publishing, I will eat my shoe on a livestream.
(I was one of the engineers for both API tokens and Trusted Publishing on PyPI. They're complementary, and neither can replace the other.)
> Absence of support for self-hosting, or for that matter for any non-proprietary service?
This has nothing to do with self-hosting, whatsoever. You can upload to PyPI with an API token; that will always work and will not do anything related to Trusted Publishing, which exists entirely because it makes sense for large services.
PyPI isn't required to federate with the server in my basement through OpenID Connect to be considered open source.
I believe you that token uploads will continue to be possible, but it seems likely that in a couple of years trusted publishing & attestations will be effectively required for all but the tiniest project. You'll get issues and PRs to publish this way, and either you accept them, or you have to repeatedly justify what you've got against security.
And maybe that's a good thing? I'm not against security, and supply chain attacks are real. But it's still kind of sad that the amazing machines we all own are more and more just portals to the 'trusted' corporate clouds. And I think there are things that could be done to improve security with local uploads, but all the effort seems to go into the cloud path.
> I believe you that token uploads will continue to be possible, but it seems likely that in a couple of years trusted publishing & attestations will be effectively required for all but the tiniest project.
That's what I think will happen.
> And maybe that's a good thing? I'm not against security, and supply chain attacks are real.
The problem is the attestation is only for part of the supply chain. You can say "this artifact was built with GitHub Actions" and that's it.
If I'm using Gitea and Drone or self-hosted GitLab, I'm not going to get trusted publisher attestations even though I stick to best practices everywhere.
Contrast that with someone that runs as admin on the same PC they use for pirating software, has a passwordless GPG key that signs all their commits, and pushes to GitHub (Actions) for builds and deployments. That person will have more "verified" badges than me and, because of that, would out-compete me if we had similar looking projects.
The point being that knowing how part of the supply chain works isn't sufficient. Security considerations need to start the second your finger touches the power button on your PC. The build tool at the end of the development process is the tip of the iceberg and shouldn't be relied on as a primary indicator of trust. It can definitely be part of it, but only a small part IMO.
The only way a trusted publisher (aka platform) can reliably attest to the security of the supply chain is if they have complete control over your development environment which would include a boot-locked PC without admin rights, forced MFA with a trustworthy (aka their) authenticator, and development happening 100% on their cloud platform or with tools that come off a safe-list.
Even if everyone gets onboard with that idea it's not going to stop bad actors. It'll be exactly the same as bad actors setting up companies and buying EV code signing certificates. Anyone with enough money to buy into the platform will immediately be viewed with a baseline of trust that isn't justified.
As I understand it, the point of these attestations is that you can see what goes into a build on GitHub - if you look at the recorded commit on the recorded repo, you can be confident that the packages are made from that (unless your threat model is GitHub itself doing a supply chain attack). And the flip side of that is that if attestations become the norm, it's harder to slip malicious code into a package without it being noticed.
That's not everything, but it is a pretty big step. I don't love the way it reinforces dependence on a few big platforms, but I also don't have a great alternative to suggest.
Yeah, if the commit record acts like an audit log I think there’s a lot of value. I wonder how hard it is to get the exact environment used to build an artifact.
I’m a big fan of this style [1] of building base containers and think that keeping the container where you’ve stacked 4 layers (up to resources) makes sense. Call it a build container and keep it forever.
Thank you for being the first person to make a non-conspiratorial argument here! I agree with your estimation: PyPI is not going to mandate this, but it’s possible that there will be social pressure from individual package consumers to adopt attestations.
This is an unfortunate double effect, and one that I’m aware of. That’s why the emphasis has been on enabling them by default for as many people as possible.
I also agree about the need for a local/self-hosted story. We’ve been thinking about how to enable similar attestations with email and domain identities, since PyPI does or could have the ability to verify both.
If there is time for someone to work on local uploads, a good starting point would be a nicer workflow for uploading with 2FA. At present you either have to store a long lived token somewhere to use for many uploads, and risk that it is stolen, or fiddle about creating & then removing a token to use for each release.
> or you have to repeatedly justify what you've got against security.
The only reason I started using PyPI was because I had a package on my website that someone else uploaded to PyPI, and I started getting support questions about it. The person did transfer control over to me - he was just trying to be helpful.
I stopped caring about PyPI with the 2FA requirement since I only have one device - my laptop - while they seem to expect that everyone is willing to buy a hardware device or has a smartphone, and I frankly don't care enough to figure it out since I didn't want to be there in the first place and no one paid me enough to care.
Which means there is a security issue whenever I make a new package available only on my website should someone decide to upload it to PyPI, perhaps along with a certain something extra, since people seem to think PyPI is authoritative and doesn't need checking.
The 2FA requirement doesn't need a smartphone. You can generate the same one time passwords on a laptop. I know Bitwarden has this functionality, and there are other apps out there if that's not your cup of tea. Sorry that you feel pressured, but it is significantly easier to express a dependency on a package if it's on PyPI than a download on your own site.
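For example, a minimal sketch using the third-party pyotp library (the secret below is a placeholder; you'd use the base32 secret PyPI shows you during 2FA enrollment):

```python
# Generate PyPI-compatible TOTP codes on a laptop; no smartphone needed.
# Requires: pip install pyotp
import pyotp

secret = "JBSWY3DPEHPK3PXP"  # placeholder; substitute the secret PyPI displays
totp = pyotp.TOTP(secret)
print(totp.now())  # six-digit code, valid for roughly 30 seconds
```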
Sure. But PyPI provides zero details on the process, I don't use 2FA for anything else in my life, no one is paying me to care, I find making PyPI releases tedious because I inevitably make mistakes in my release process, I have a strong aversion to centralization and dependencies[1][2].
I tell people to "pip install -i $MY_SITE $MY_PACKAGE". I can tell from my download logs that this is open to dependency confusion attacks as I can see all the 404s from attempts to, for example, install NumPy from my server. To be clear, the switch to 2FA was only the triggering straw - I was already migrating my packages off of PyPI.
Finally, I sell a source license for a commercial product (which is not the one which got me started with PyPI). My customers install it via their internally-hosted PyPI mirrors.
I provide a binary package with a license manager for evaluation purposes, and as a marketing promotion. As such, I really want them to come to my web site, see the documentation and licensing options, and contact me. I think making it easier to express as a dependency via PyPI does not help my sales, and actually believe the extra intermediation likely hinders my sales.
[1] I dislike dependencies so much that I figured out how to make a PEP 517 compatible version that doesn't need to contact PyPI simply to install a local package. Clearly I will not become a Rust developer.
[2] PyPI support depends on GitHub issues. I regard Microsoft as a deeply immoral company, and a threat to personal and national data sovereignty, which means I will not sign up for a GitHub account. When MS provides IT support for the upcoming forced mass deportations, I will have already walked away from Omelas.
In short, use "backend-path" to include a subdirectory which contains your local copies of setuptools, wheel, etc. Create a file with the build hooks appropriate for "backend-path". Have those hooks import the actual hooks in setuptools. Finally, set your "requires" to [].
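Roughly, a sketch of that layout (all names are illustrative, and it assumes the vendored copies actually live in a `_vendor/` directory):

```python
# _vendor/local_backend.py -- shim backend for the trick described above.
# Assumes pyproject.toml points at it roughly like so:
#
#   [build-system]
#   requires = []                  # nothing is fetched from PyPI
#   build-backend = "local_backend"
#   backend-path = ["_vendor"]     # holds this file plus vendored setuptools etc.
#
# With backend-path set, the import below resolves against the vendored
# setuptools sitting next to this file, not anything downloaded at build time.
from setuptools.build_meta import build_sdist, build_wheel  # noqa: F401
```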
Doing this means taking on a support burden of maintaining setuptools, wheels, etc. yourself. You'll also need to include their copyright statements in any distribution, even though the installed code doesn't use them.
(As I recall, that "etc" is hiding some effort to track down and install the full list of packages dragged in, but right now I don't have ready access to that code base.)
>> PyPI package upload in favor of the trusted provider system. pip (or other tooling) will add warnings whenever I install a package that is not "trusted". Maybe I am simply pessimistic.
> Let me put it this way: if PyPI disables API tokens in favor of mandatory Trusted Publishing, I will eat my shoe on a livestream.
Yeah, sure. But the parent poster also mentioned pip changes. Will you commit to eating a shoe if pip verifies attestations? Of course not; you know better than I do that those changes to pip are in the works and experimental implementations are already available. You must have forgotten to mention that. PyPI doesn't have to be the bad guy on making sure it's a PITA to use those packages. Insert Futurama "technically correct, the best kind of correct" meme.
Insinuating dishonesty is rude, especially when it’s baseless: the post linked in this one is explicit about experimenting with verification. There’s no point in signing without verification.
pip may or may not verify attestations; I don’t control what they do. But even if they do, they will never be able to mandate them for every project: as I have repeatedly said, there is no plan (much less desire) to mandate attestation generation.
Regardless of your intent, you didn't acknowledge the parent poster's concern about Python overall. Just exculpated PyPI.
I believe you would have received much less pushback if you hadn't been coy about the obviously impending quiet deprecation (for lack of a better phrase) you finally acknowledged elsewhere in the thread.
I was reaching for a descriptor and ended up riffing off 'quiet quitting' and 'quiet firing'.
I understand you're not deprecating or recommending anything, and have learned in the meantime just how ugly python infighting is in and around package tooling. I can see the motivation to keep external messaging limited to saying a feature was added and everything else remains constant.
I work with a bunch of normies that think Python packaging starts and ends at "pip install"-ing systemwide on Windows. One day in the future the maintainers of the packages they use will likely be encouraged to use this optional new feature (and/or they already have been publicly). Even later in the future those end users themselves might be encouraged by warnings or errors not to install packages without this extra feature.
PyPI has deprecated nothing. Bureaucrat Conrad, you are technically correct. The best kind of correct.
> I understand you're not deprecating or recommending anything, and have learned in the meantime just how ugly python infighting is in and around package tooling. I can see the motivation to keep external messaging limited to saying a feature was added and everything else remains constant.
You're insinuating an ulterior motive. Please don't do that; assume good faith[1].
(You can find trivial counterexamples that falsify this: PyPI was loud and explicit about deprecating and disabling PGP; there's no reason to believe it would be any less loud and explicit about a change here.)
I'm with @belval on this one: it's ok to prioritize GitHub, but people that want to implement the standard with an alternative provider should not feel like they are doing something that may not be supported.
Again, to be clear: the standard does not stipulate GitHub or any other specific identity providers. The plan is to enable GitLab and the other Trusted Publisher providers in short order.
This is exactly the same as Trusted Publishing, where people accused the feature of being a MSFT trojan horse because GitHub was enabled first. I think it would behoove everybody to assume the best intentions here and remember that the goal is to secure the most people by default.
It's said explicitly in the second sentence in the usage docs[1].
> Attestations are currently only supported when uploading with Trusted Publishing, and currently only with GitHub-based Trusted Publishers. Support for other Trusted Publishers is planned. See #17001 for additional information.
At this point it should be fairly obvious that if you have to defend the phrasing in multiple threads here on HN, get some folks to help rephrase the current document instead so you can comment with "we updated the text to make it clear this is a first pass and more options are getting added to the doc soon".
If you draw an ugly cat, and someone tells you it's ugly, it doesn't matter how much you insist that it isn't, and the same is true for docs. It doesn't matter what your intention was: if people keep falling over the same phrasing, just rephrase it. You're not your writing, it's just text to help support your product, and if that text is causing problems just change it (with the help of some reviewers, because it's clear you think this is phrased well enough, but you're not the target audience for this document, and the target audience is complaining).
Anyone can run an OIDC system if they want. But PyPI is not under an obligation to trust an OIDC provider running on a random rpi3 in your basement. More than that, GitHub is "trusted" because we can be pretty sure they have an on-call staff to handle incidents, that they can reliably say "This token was provided on behalf of this user at this time for this build", etc.
Even if you standardized the more technical parts like OIDC claim metadata (which is 100% provider specific), it wouldn't really change the thrust of any of this — PyPI is literally trusting the publisher in a social sense, not in some "They comply with RFC standards and therefore I can plug in my random favorite thing" sense.
This whole discussion is basically a non-issue, IMO. If you want to publish stuff from your super-duper-secret underground airgapped base buried a mile underneath the Himalayas, you can use an API token like you have been able to. It will be far less hassle than running your own OIDC solution for this stuff.
Please improve your reading comprehension. I swear, this website is embarrassing sometimes. You can still do this with an API token. You can upload from a C64 with an API token. What you cannot do is run some random OIDC provider on your random useless domain and have PyPI magically respect it and include it as part of the Trusted Publishers program. There is no point in it, because the program itself is constrained by design, because it only provides any benefit at "large scale." Your random dumb server providing a login for you alone does not provide any benefits over you just using an API token.
Any pathway to provide trusted attestations for random individual Hacker News users like yourself will, in fact, require a different design.
I think you're missing something. The key in question is a short-lived ECDSA key that lives inside a publishing workflow and is destroyed after signing; neither GitHub nor the Sigstore CA generates a signing key for you.
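If it helps, here's a conceptual sketch of the ephemeral-key idea (this is not the actual gh-action-pypi-publish code, just an illustration using the cryptography library):

```python
# The key pair exists only for the duration of the signing operation.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

def sign_once(artifact: bytes) -> bytes:
    # A fresh ECDSA P-256 key, generated inside the publishing workflow...
    private_key = ec.generate_private_key(ec.SECP256R1())
    signature = private_key.sign(artifact, ec.ECDSA(hashes.SHA256()))
    # ...and dropped as soon as this function returns: there is no long-lived
    # secret to steal. In the real flow, the public half is bound to the
    # workflow's OIDC identity by the Sigstore CA before the key is discarded.
    return signature
```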
PyPI will accept any key bound to an identity, provided we know how to verify that identity. Right now that means we accept Trusted Publishing identities, and GitHub identities in particular, since that's where the overwhelming majority of Python package publishing traffic comes from. Like what happened with Trusted Publishing, this will be expanded to other identities (like GitLab repositories) as we roll it out.
How does pypi know I'm not github? Because I can sign with my keys and not with github's key.
Never mind all the low-level details of the temporary keys and hashes and all of that. This is a high-level comment, not a university book about security.
It requires an OIDC IdP, though… with PGP, I can verify that my identity is constant, but now, I’m reliant on some chosen third-party. And it has to keep being the same one, to boot! I don’t like the lock-in this causes, and I definitely don’t like the possibility of my right to publish being revoked by a third party.
I’m only now learning about what OIDC IdP is (for those like me: openid connect identity provider). But from my reading, a self hosted gitlab can function as an oidc idp.
You can't use a self-hosted Gitlab because you can only use a "trusted publisher".
There's no hard technical reason for that. It's mostly that PyPI only want to trust certain issuers who they think will look after their signing keys responsibly.
There is a technical reason for it, and it’s explained in an adjacent thread. Accepting every single small-scale IdP would result in a strictly worse security posture for PyPI as a whole, with no actual benefit to small instances (who are better off provisioning API tokens the normal way instead of using Trusted Publishing).
I don't think that's a technical reason, that's more of a social reason about trust. Adding configurable JWT issuers is not technically hard or difficult to do in a way that doesn't compromise PyPI's inherent security. There's a configuration difficulty, but those using it will generally know what they're doing and it only compromises their own packages (which they can already compromise by losing tokens etc.)
I disagree with your assessment that provisioning API tokens is better than being able to authenticate with a JWT. It makes managing credentials in an organisation much easier as far fewer people need access to the signing key compared to who would need access to the API token. Using asymmetric keys also means there's less opportunity for keys to be leaked.
Adding another IdP is not the hard part; establishing the set of claims that IdP should be presenting to interoperate with PyPI is. The technical/social distinction is specious in this context: the technical aspects are hard because the social aspects are hard, and vice versa.
If you work in a large organization that has the ability to maintain a PKI for OIDC, you should open an issue on PyPI discussing its possible inclusion as a supported provider. The docs[1] explain the process and requirements.
> who are better off provisioning API tokens the normal way
As long as those packages get digital attestation, perhaps attested by PyPI itself post-upload or from a well-known user provided key similar to how GPG worked but managed by PyPI this time.
Surely you see how this is creating two classes of packages, where the favored one requires you use a blessed 3rd party?
No, I don't. There's no plan, and there will be no plan, to make installers reject packages that don't have attestations. Doing so is technically and socially impossible, and undesirable to boot.
The strongest possible version of this is that projects that do provide attestations will be checked by installers for changes in identity (or regression in attestations). In other words, this feature will only affect packages that opt to provide attestations. And even that is uncertain, since Python packaging is devolved and neither I (nor anybody else involved in this effort) has literally any ability to force installers to do anything.
It comes across like the goal of this system is to prove to my users a) who I am; b) that I am working in cooperation with some legitimate big business. I don't understand why I should want to prove such things, nor why it's desirable for b) to even be true, nor why my users should particularly care about a) at all, whether I can prove it or not.
I think my users should only care that the actual contents of the sdist match, and the built contents of the wheel correspond to, the source available at the location described in the README.
Yes, byte-for-byte reproducible builds would be a problem, for those who don't develop pure Python. Part of why sdists exist is so that people who can't trust the wheel, like distro maintainers, can build things themselves. And really - why would a distro maintainer take "I can cryptographically prove the source came from zahlman, and trust me, this wheel corresponds to the source" from $big_company any more seriously than "trust me, this wheel corresponds to the source" directly from me?
If I have to route my code through one of these companies ("Don't worry, you can work with Google instead!" is not a good response to people who don't want to work with Microsoft) to gain a compliance checkmark, and figure out how someone else's CI system works (for many years it was perfectly possible for me to write thousands of lines of code and never even have an awareness that there is such a thing as CI) so that I can actually use the publishing system, and learn what things like "OIDC IdP" are so that I can talk to the people in charge...
... then I might as well learn the internal details of how it works.
And, in fact, if those people are suggesting that I shouldn't worry about such details, I count that as a red flag.
There's two alternate visions: anybody can work with anybody they trust, and Only BigCo Can Protect You.
We know the first falls apart because the Web of Trust fell apart. When the UI asks people "hey, this is new, here's the fingerprint, do you trust it?", nobody actually conducts review, they just mindlessly click "yes" and move on. Very, very, very few people will actually conduct full review of all changes. Even massive organizations that want to lie to themselves that they review the changes, end up just outsourcing the responsibility to someone who is willing to legally accept the risk then they go ahead and just mindlessly click "yes" anyway. There are just too many changes to keep track of and too many ways to obscure malicious code.
We know the second falls apart because human beings are both the inevitable and ultimate weakest link in even the most robust security architectures. Human beings will be compromised, and the blast radius on these highly centralized architectures will be immense.
> And, in fact, if those people are suggesting that I shouldn't worry about such details, I count that as a red flag.
Nobody is suggesting that. There's an annotation on the internals documentation to make sure that people land on the happy path when it makes sense for them. If you want to learn about the internals of how this works, I heartily encourage you to. I'd even be happy to discuss it over a call, or chat, or any other medium you'd like: none of this is intended to be cloak-and-dagger.
(You can reach this conclusion abductively: if any of this was intended to be opaque, why would we write a public standard, or public documentation, or a bunch of high-profile public announcements for it?)
> Part of why sdists exist is so that people who can't trust the wheel, like distro maintainers, can build things themselves.
Some might even expect PyPI to build everything going into its repo, similar to a distro building for its repos. As far as I can tell, all this 'attestation' is just PyPI making Microsoft pay for the compute time, with extra steps.
Except that also for trusted publishing, they only allowed github in the beginning and eventually added a couple of other providers. But if you're not google or microsoft you won't be added.
These kinds of comments are borderline mendacious: you can observe, trivially, that 50% of the Trusted Publishers currently known to PyPI are neither Google nor Microsoft controlled[1].
If PyPI accepts two more likely ones, a full 2/3rds will be unrelated to GitHub.
Once again: this is constrained by design. If you don’t want to use OpenID Connect, just create a token on PyPI and publish the normal way. You are not, and will never be, required to use this feature.
Because the security benefit of Trusted Publishing via OIDC versus normal API tokens is marginal at small scales, in two senses:
1. The primary benefit of Trusted Publishing over a manual API token is knowing that the underlying OIDC IdP has an on-call staff, proper key management and rotation policies, etc. These can be guaranteed for GitHub, GitLab, etc., but they're harder to prove for one-off self-hosted CI setups. For the latter case, the user is no better off than they would be with a manual API token, which is still (and will always be) supported.
2. If the overwhelming majority of traffic comes from a single CI/CD provider, adding more code to support generic OIDC IdPs increases PyPI's attack surface for only marginal user benefit.
There also is no "open interface" for PyPI to really use here: this is all built on OIDC, but each OIDC provider needs to have its unique claims mapped to something intelligible by PyPI. That step requires thoughtful, manual, per-IdP consideration to avoid security issues.
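To illustrate that mapping problem, here's a loose sketch of what verifying one provider's claims involves (assuming PyJWT and GitHub Actions' documented OIDC claims; PyPI's real implementation is considerably more involved):

```python
# Hypothetical verification of a GitHub Actions OIDC token for publishing.
import jwt  # pip install PyJWT

def verify_github_publisher(token: str, jwks_key, expected_repo: str) -> bool:
    claims = jwt.decode(
        token,
        jwks_key,            # public key fetched from the IdP's JWKS endpoint
        algorithms=["RS256"],
        audience="pypi",
        issuer="https://token.actions.githubusercontent.com",
    )
    # "repository" is a GitHub-specific claim; GitLab and other IdPs expose
    # different claim sets, which is exactly the per-provider mapping work
    # described above.
    return claims.get("repository") == expected_repo
```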
I still think this is overly strict. Supporting arbitrary OIDC providers is not excessively complex or particularly rare, the major cloud providers all support it in one way or another [1][2][3], as does Hashicorp Vault [4]. I disagree that the primary benefit over a manual API token is _knowing_ that the OIDC IdP is following the best practices you talk about. Having it rely on asymmetric keys makes the process more secure and scalable than API tokens for those that choose to use it.
I think there's a separate question around trust. But I think blocking non-trusted publishers from using a more secure form of authentication isn't the answer. Instead I think it makes more sense to use nudges in the PyPI UI and eventually of consumers (e.g. pip) to indicate that packages have come from non-trusted publishers.
I can create a github account. How does that make me trustworthy? How being able to create a github account prevents me from adding a virus in my module?
It proves that a package was definitely uploaded from the correct repo.
Without trusted publishers a nefarious actor could use a leaked PyPI API key to upload from anywhere. If the only authorised location is actions on a specific Github repo then it makes a supply chain attack much trickier and much more visible.
With the new attestations it's now possible for package consumers to verify where the package came from too.
You can no longer upload a PGP signature to PyPI, if that's what you mean. That was phased out last year (to virtually no complaint since nobody was actually verifying any of the signatures, much less attempting to confirm that their keys were discoverable[1]).
> to virtually no complaint since nobody was actually verifying any of the signatures
And this is in no way a consequence of PyPI no longer hosting public keys, right? Tell the whole story at least… Say that there used to be a way to verify the signatures, but you dropped it years ago, and since then the signatures have been useless.
If it did, it was well before I ever began to work on PyPI. By the time I came around, PGP signature support was vestigial twice over and the public key discovery network on the Internet was rapidly imploding.
(But also: having PyPI be the keyserver defeats the point, since PyPI could then trivially replace my package's key. If that's the "whole story," it's not a very good one.)
This attestation doesn’t change a ton with that, though. The point is to provide chain of custody — it got to my computer, from pypi, from ???. The PGP signature, much like a self-signed android app, verifies that it continues to be the same person.
The critical difference with this architecture is that it doesn’t require key discovery or identity mapping: those are properties of the key infrastructure, similarly to the Web PKI.
Your self-signed app analogy is apt: self-signing without a strong claimant identity proof is a solution, but a much weaker one than we wanted.
I am not sure why my comment above is downvoted -- if you know where the perpetual optionality of digital attestations is officially stated, please, provide a link.
Because every CI/CD provider has a different set of claims and behaviors that would constitute a "secure" policy for verification. If there was one singular way to do that then we could, but there isn't yet, so PyPI needs to onboard providers piecemeal. The work to add a new provider is not massive; the reason there are not tons of providers isn't because the work is hard, but rather because people are voting with their feet, so GitHub and GitLab make sense as initial providers to support.
Even more, the previous way was to use GPG signatures, which were recently deprecated and removed. So you don't really have a choice.
>Where the only official workflow is "Use GitHub Actions".
Well you can do it manually with other solutions... as long as they are one of the four trusted publishers (see "Producing attestations manually does not bypass (...) restrictions on (...) Trusted Publishers").
On the one hand, you are totally right, GitHub Actions are the VS Code of automation. People choose them because they are broke, they work well enough, and the average person chooses things based on how it looks. GitHub Actions looks easy, it's cozy, it's comfy, VS Code looks like a code editor, everything has to be cozy and comfy.
On the other hand, considering all of that, you can see why Python would arrive at this design. They are the only other people besides NPM who regularly have supply chain attack problems. They are seemingly completely unopinionated about how to fix their supply chain problems, while being opinionated about the lack of opinions about packaging in general. What is the end goal? Presumably one of the giant media companies, Meta, Google or Microsoft maybe, have to take over the development of runtime and PEP process, but does that even sound good?
I think the end goal is to only allow github/google/whatever accounts to publish code on PyPI, so importing whatever will be USA-sanctions safe.
The burden to ban Russians/Koreans/Iranians will be on those big companies, and PyPI will be able to claim they respect the rules without having the resources themselves to ban accounts from sanctioned countries.
So in my opinion, in the future tooling will have some option to check if the software was uploaded via Microsoft et al and can be considered unsanctioned in the USA.
Or they might just say that's how you upload now and that's it.
None of this is true. There is no plan to disable API token uploads to PyPI or to mandate Trusted Publishing, much less attestations. It's entirely intended to be a misuse-resistant way to upload to the index in the "happy case" where the package uses a CI/CD workflow that supports OpenID Connect.
(I'm disappointed by the degree to which people seem to gleefully conspiracize about Python packaging, and consequently uncritically accept the worst possible interpretation of anything done to try and improve the ecosystem.)
>(I'm disappointed by the degree to which people seem to gleefully conspiracize about Python packaging, and consequently uncritically accept the worst possible interpretation of anything done to try and improve the ecosystem.)
I sympathize, but there are obvious reasons for this: low trust in the Python packaging system generally (both because of recent PSF activity and because of the history of complications and long-standing problems), and the fact that many similar things have been seen in the Linux and open-source worlds lately (for one example, Linux Mint's new stance on Flatpak and how they present Flatpak options in the Software Manager). (Edit: it also kinda rhymes with the whole "Secure Boot" thing and how much trouble it causes for Linux installation.)
I personally expect people to have better compartmentalization skills than this. It's both unreasonable and inappropriate to have a general chip on your shoulder, and consequently assume a negative interpretation of an unrelated security effort.
It happens that people don't trust you if you are being obviously dishonest.
I still have to understand how the github credentials stored on my computer are harder to steal than the pypi credentials stored on the very same computer.
If you can explain this convincingly, maybe I'll start to believe some of the things you claim.
> I still have to understand how the github credentials stored on my computer are harder to steal than the pypi credentials stored on the very same computer.
This is not, and has never been, the argument for Trusted Publishing. The argument is that temporary, automatically scoped credentials are less dangerous than permanent, user scoped credentials, and that an attacker who does steal them will struggle to maintain persistence or pivot to other scoped projects.
The documentation is explicit[1] about needing to secure your CI workflows, and treating them as equivalent to long-lived API tokens in terms of security practices. Using a Trusted Publisher does not absolve you of your security obligations; it only reduces the scope of failures within those obligations.
Indeed. I have to wonder: if being able to authenticate to GitHub, upload one's code there, and then coordinate a transfer from there to PyPI through the approved mechanisms is good enough... then why can't PyPI just use the same authentication method GitHub does?
It seems hard to justify "we want to ensure the build is reproducible" when there is no system to attempt the build on PyPI's end and the default installer is perfectly happy to attempt the build (when no prebuilt version is available) on the user's machine - orchestrated by arbitrary code, and without confirmation as a default - without a chance to review the downloaded source first.
It seems hard to justify "we want to limit the scope of the authentication" when "a few minutes" is more than enough time for someone who somehow MITMed a major package to insert pre-prepared malware, and access to a single package from a major project affects far more machines than access to every repo of some random unimportant developer.
(The wheel format has hashes, but the sdist format doesn't. If these sorts of attacks aren't meaningfully mitigated by wheel hashes, what is the RECORD file actually accomplishing? If they are, shouldn't sdists have them too?)
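(For concreteness, checking those wheel hashes is mechanical; here's a rough sketch against the documented `path,sha256=<digest>,size` RECORD line format, not a hardened implementation:)

```python
# Verify every hash recorded in a wheel's RECORD file (sketch only).
import base64, csv, hashlib, zipfile

def check_record(wheel_path: str) -> bool:
    with zipfile.ZipFile(wheel_path) as wf:
        record = next(n for n in wf.namelist() if n.endswith(".dist-info/RECORD"))
        for path, digest, _size in csv.reader(wf.read(record).decode().splitlines()):
            if not digest:  # RECORD lists itself with an empty hash field
                continue
            algo, _, expected = digest.partition("=")
            actual = (
                base64.urlsafe_b64encode(hashlib.new(algo, wf.read(path)).digest())
                .rstrip(b"=")
                .decode()
            )
            if actual != expected:
                return False
    return True
```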
Except that is impractical at best. The Debian (and Ubuntu) Python packaging groups have been overwhelmed since Python 2.7; no way at all does the Debian packaging ecosystem reasonably satisfy Python packages. And there is more: the Debian system does what it must do -- ensure system Python has integrity with major software stacks and the OS use of Python. Super! But that is not at all what users of Python need. As you know, Python has passed JavaScript as the most used language on GitHub. There is tremendous pace and breadth to several important Python uses, with their libraries.
Please reconsider this position with Debian and Python packaging.
Does a .deb file even allow for installing for an arbitrary Python (i.e. not the system installation)? Am I supposed to find them through Apt and hope the Debian version of the package has a `python3` prefix on the name? Is that going to help if there are dependencies outside the Apt system?
I think you're being overly critical. When it says that it adds support for Trusted Publishers, it links directly to this page: https://docs.pypi.org/trusted-publishers/.
This page clearly explains how this uses OIDC, and uses GitHub Actions as an example. At no point in my read did I feel like this was shilling me a microsoft product.
I might be reading this wrong but doesn't it include four workflows, including Gitlab? The "manual way" section is grouped under the Github tab, which indicates to me that it's only explaining how the Github-specific workflow works?
> Support for automatic attestation generation and publication from other Trusted Publisher environments is planned. While not recommended, maintainers can also manually generate and publish attestations.
You're on a slightly different page: Trusted Publishing supports four different providers, but this initial enablement of attestations (which are built on top of Trusted Publishing) is for GitHub only. The plan is to enable the others in short order.
I think y’all are being overly aggressive against what, from an outside perspective, looks like an MVP targeting the majority use case.
If the devs' claim lower in the comments that other support is soon to be added doesn't pan out, then sure, this is a valid argument, but it feels very premature and ignores the reality of launch expectations of most agile-based teams.
While I understand that, a deeper problem is that their main competitor (GitLab) is too busy to implement the fad of the day. GitLab's CI/CD is already an unholy mess of scripting in YAML, and there is no good way to provide encapsulated pieces of code to GitLab. GitHub has Actions, which have a defined input and output; GitLab has nothing comparable. That makes it very difficult to provide functionality for GitLab.
Is it really that hard to provide a gitlab workflow?
AFAIU it'd be enough to have a Docker image and use that in a standalone workflow, which is a few lines of YAML in both GitHub and GitLab; what'd be the problem with that?
I'm not really convinced of the value of such attestations until a second party can reproduce the build themselves on their own hardware.
Putting aside the fact that the mechanisms underpinning GitHub Actions are a mystery black box, the vast, vast, vast majority of GitHub workflows are not built in a reproducible way - it's not even something that's encouraged by GitHub Actions' architecture, which emphasises Actions' container images that are little more than packaged installer scripts that go and download dependencies from random parts of the internet at runtime. An "attestation" makes no guarantee that one of these randomly fetched dependencies hasn't been usurped.
This is not to go into the poor security record of GitHub Actions' permissions model, which has brought us all a number of "oh shit" moments.
Fully reproducible builds would of course be nicer from a security standpoint, but attestations have vastly lower implementation costs and scale much better while still raising the bar meaningfully.
JFrog also discovered multiple malicious package exploits later.
Now we get a Github centric new buzzword that could be replaced by trusted SHA256 sums. Python is also big on business speak like SBOM. The above key leak of course occurred after all these new security "experts" manifested themselves out of nowhere.
The procedure remains the same. Download a package from the original creators, audit it, use a local repo and block PyPI.
Hello! I believe I'm one of the "manifested" security experts you're hinting at :)
Good security doesn't demand perfection, that's why security is both prevention and preparedness. The response from our admin was in every way beyond what you'd expect from many other orgs: prompt response (on the order of minutes), full audit of activity for the credential (none found), and full public disclosure ahead of the security researcher's report.
> JFrog also discovered multiple malicious package exploits later.
If you're referencing malicious packages on PyPI then yes! We want to keep PyPI freely open to all to use, and that has negative knock-on effects. Turns out that exploiting public good code repositories is quite popular, but I maintain that the impact is quite low and that our ability to respond to these sorts of attacks is also very good due to our network of security engineers and volunteers who are triaging their reports. Thanks to the work of Mike Fiedler (quarantining packages, API for reporting malicious packages, better UI for triagers) our ability to respond to malicious packages will become even better.
> Now we get a Github centric new buzzword that could be replaced by trusted SHA256 sums.
In a way, this feature is what you're describing but is easier to automate (therefore: good for you as a user) and is more likely to be correct because every attestation is verified by PyPI before it's made available to others (which is also good for users). The focus on GitHub Actions is because this is where many Python projects publish from, there is little incentive to create a feature that no one will use.
> Python is also big on business speak like SBOM.
Indeed, there is legislation in many places that will require SBOMs for all software placed in their markets so there is plenty of interest in these standards. I'm working on this myself to try to do the most we can for users while minimizing the impact this will have on upstream open source project maintainers.
>In a way, this feature is what you're describing but is easier to automate (therefore: good for you as a user) and is more likely to be correct because every attestation is verified by PyPI before it's made available to others (which is also good for users).
How long should I expect it to take until I can automatically generate an attestation from `twine`? Or does someone else have to sign off on it through some OpenID mumbo-jumbo before I can qualify as "trusted"?
Automating the creation of SBOMs sounds even further out, since we're still struggling with actually just building sdists in the first place.
Don't download packages from PyPI, go upstream to the actual source code on GitHub, audit that, build locally, verify your build is the same as the PyPI one, check the audits people have posted using crev, decide if you trust any of them, upload your audit to crev too.
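The "verify your build is the same" step is the only mechanical part, and it only works if the build is byte-for-byte reproducible. A minimal sketch (file paths are hypothetical):

```python
import hashlib

def sha256(path: str) -> str:
    """Hex digest of a file, read in one go (fine for wheels/sdists)."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

local = sha256("dist/mypkg-1.0-py3-none-any.whl")         # built from the audited source
fetched = sha256("downloads/mypkg-1.0-py3-none-any.whl")  # obtained via `pip download`
assert local == fetched, "local build differs from the PyPI artifact"
```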
After reading the underlying report (https://jfrog.com/blog/leaked-pypi-secret-token-revealed-in-...), I can't help but think: "where is the defense in depth?" Since `.pyc` files are just a cache of compilation that's already generally pretty quick, this could have been prevented by systems that simply didn't allow for pushing them into the Docker image in the first place. Or by having `PYTHONDONTWRITEBYTECODE=1` set on the developer's machine.
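For the Docker side specifically, two lines of `.dockerignore` would have kept compiled caches out of the build context entirely (a sketch; patterns may need adjusting for your layout):

```
# .dockerignore: never ship bytecode caches into the image
**/__pycache__/
*.pyc
```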
(Also, now I'm trying to wrap my head around the fact that there's such a thing as "Docker Hub" in the first place, and that people feel comfortable using it.)
> now I'm trying to wrap my head around the fact that there's such a thing as "Docker Hub" in the first place
Unless you build all of your images `FROM scratch` by default (or use in-house registries or quay or whatnot for all of your base images), you've used Docker Hub too...
> Reliability & notability: The effort necessary to integrate with a new Trusted Publisher is not exceptional, but not trivial either. In the interest of making the best use of PyPI's finite resources, we only plan to support platforms that have a reasonable level of usage among PyPI users for publishing. Additionally, we have high standards for overall reliability and security in the operation of a supported Identity Provider: in practice, this means that a home-grown or personal use IdP will not be eligible.
From what I can tell, this means using self-hosted GitHub/GitLab/Gitea/whatever instances is explicitly not supported for most projects. I can't really tell what "a reasonable level of usage among PyPI users" means (does invent.kde.org qualify? gitlab.gnome.org? gitlab.postmarketos.org?) but I feel like this means "GitLab.com and maybe gitea.com" based on my original reading.
Of course by definition PyPI is already a massive single point of failure, but the focus on a few (corporate) partners to allow secure publishing feels like a mistake to me. I'm not sure what part of the cryptographic verification process restricts this API to the use of a few specific cloud providers.
Maybe there is a case to be made for STF to fund making Codeberg (a German-headquartered organization) one of the PyPI trusted hosts. If Codeberg were supported, that would go a long way to addressing fears. And conversely, if Codeberg can't meet PyPI's bar, that suggests complete commercial capture.
Thanks for pointing this out! In general, the sovereign tech fund should not fund Python or the PSF, since the administration is U.S. dominated and the whole organization is captured by U.S. corporations.
Unpaid developers from Europe (and Asia) just serve as disposable work horses for the U.S. "elites".
Supply chain security is very important, and this seems like an important step. Seems absolutely essential that something like the Mozilla foundation, or EFF, or some other open-source friendly entity help provide such a service, instead of corralling users into companies with exploitative business models.
I am in no hurry to be pushed into using Github, Gitlab or whatever else. Programmer's open source code has been monetized by these companies to feed AI LLM beasties, and it's fundamentally objectionable to me. I self-host my code using Gitea for that reason.
Congratulations to everyone involved in this work! This is an incredible building block not just for supply-chain security, both inside and downstream of the PyPI ecosystem, but also for data analysis of open source projects (through strong linkage back to source repositories). Thank you all <3
Just a note: that issue is for connecting the already-present attestations to GitHub's attestations feature[1]. Attestations are already automatically generated and uploaded to PyPI regardless of that issue.
(That issue will still be great to resolve, but just to avoid any confusion about its scope.)
The corresponding ToB blog post says the following:
> Longer term, we can do even better: doing “one off” verifications means that the client has no recollection of which identities should be trusted for which distributions. To address this, installation tools need a notion of “trust on first use” for signing identities, meaning that subsequent installations can be halted and inspected by a user if the attesting identity changes (or the package becomes unattested between versions).
Agree: signing is only as good as verification. However, trust-on-first-use (TOFU) is not the most secure way to map packages to attestations because nothing stops attackers who have taken over PyPI from tampering with the _unsigned_ mapping of identities to attestations in package lockfiles for new clients (certainly in containerized environments where everything could look new), and even just new versions of packages.
Although [PEP 458](https://peps.python.org/pep-0458/) is about signing the Python package index, it sets the foundation for being able to securely map packages to signed in-toto _policies_, which would in turn securely map identities to attestations. I think it is worth noting how these different PEPs can work together :)
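To make the TOFU idea concrete, here's a toy sketch of the pinning an installer could do; the file name and identity strings are hypothetical, and per the above it's only as trustworthy as the unsigned pin store itself:

```python
import json

PINS = "attestation-pins.json"  # local map of project -> expected signing identity

def check_identity(project: str, observed: str) -> None:
    try:
        with open(PINS) as f:
            pins = json.load(f)
    except FileNotFoundError:
        pins = {}
    pinned = pins.get(project)
    if pinned is None:
        # first use: trust the observed identity and record it
        pins[project] = observed
        with open(PINS, "w") as f:
            json.dump(pins, f, indent=2)
    elif pinned != observed:
        raise RuntimeError(
            f"{project}: attesting identity changed from {pinned!r} to "
            f"{observed!r}; halt the install and inspect"
        )
```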
The root point of this seems to be PyPI does not have the resources to manage user identity, and wants to outsource that component to Github, et al. That sounds fairly reasonable. But why deprecate GPG signatures? The problem with GPG signatures as I understand it is it's difficult to find the associated public key. That's fair. Why not host and allow users to add their public keys to their accounts? Wouldn't that solve the problem?
I'll take your word that GPG is outdated. GPG is just the one that PyPI used to support. I don't particularly care what public key signing suite is used.
Despite crypto getting easier and easier and more mainstream, you often don't see it used that much, even in "blockchain" projects 😉💀☹️📉. I welcome these additions to PyPI. I am a proud Python Chad and always will be.
Yes, it’s bound to a specific commit; we just don’t present that in the web UI yet. If you click on the transparency log entry, you’ll see the exact commit the attestation came from.
But my CI can download and run code from anywhere, so that doesn't mean I can know what is being uploaded just by looking at the git repository alone.
I think PyPI has been less secure since they added 2FA.
Before, I could memorize my password and type it into twine. If my machine was stolen, no upload to PyPI was possible.
Now it's a token file on my disk, so if my machine is stolen, the token can be used to publish.
Using GitHub to publish doesn't change anything: if my machine is stolen, the token needed to publish is still there; it just goes through GitHub first instead of directly to PyPI.
> Tokens are a problem too (a yubikey might be a solution).
Not the way they implement things. When they started forcing people to use 2FA, Google also did a Titan key giveaway. But you set it up on your account, generate a token, and that's it.
Identical situation on GitHub: set up 2FA with a hardware key, then generate a token and never use the hardware key ever again.
I doubt Yubikey would help without some fancy setup. 2FA is required to sign into PyPI but that's it. When PyPI rolled it out I thought you'd have to use 2FA every time you publish. I thought they were taking security seriously. But no, you get your API token, save it to your computer, forget about it, and you can publish your packages forever. Now you can have Github automatically publish your packages. That's not any improvement to security. My Google security key is just collecting dust.
Eh, mine too. I got it because I had an essential package or whatever when they started forcing 2FA on people, and I thought twine would require me to authenticate with the key to be authorised to publish.
But they didn't touch twine at all. They just made me create a token and save it in a .txt file. That's it.
I'm now thinking about a system that can enforce that any line of code committed to the Git repo has been on the user's screen for at least X seconds. It could be a system completely isolated from the computer on which code is entered (e.g. via the HDMI cable).
So, let me get this clear: after PyPA enshittified the packaging experience for more than a decade, and now that the mess has been cleaned up by uv, the security FUD (and let's not be naive here, the power struggle) moves on to enshittifying PyPI and making Python users miserable for a decade more?
PyPI should support and encourage open infrastructure.
If I don't want to use GitHub, let alone GitHub Actions, I am now effectively excluded from publishing my work on PyPI: quite unacceptable.
It is fine to not like GitHub, but I think that means we need more trusted builders. Developers cannot be pushed toward just GitHub.
> It is fine to not like GitHub, but I think that means we need more trusted builders. Developers cannot be pushed toward just GitHub.
Yes, agreed. This is why the docs explicitly say that we’re planning on enabling support for other publisher providers, like GitLab.
Thank you for being patient with people that seem to have willfully not read any of the docs or your clarifying comments here, are saying you are lying, and/or are making up hypothetical situations. It's appreciated!
Edit: woodruffw is sitting here and thoughtfully commenting and answering people despite how hostile some of the comments are (someone even said "This is probably deserving a criminal investigation"! and it has more upvotes than this comment). I think that should be appreciated, even if you don't like Python.
I know attestations are not mandatory, but the rug has already been pulled: PEP 740 distinguishes "good" and "bad" packages and "good" packages require GitHub.
Attestations are worthless unless they're checked. I have no doubt they'll eventually become the default in pip which effectively makes them mandatory for 99% of people not willing to jump through the hoops of installing an unattested package.
The "hoops", which will only increase in the future, make GitHub-dependent attested packages privileged and give GitHub (and maybe, in the future, other inappropriate entities) significant power over open source Python packages.
It does nothing of the sort, and the current GitHub requirement is an explicitly temporary restriction, like it was for Trusted Publishing. Again: I am begging you to read the docs that we’ve compiled for this.
> the current GitHub requirement is an explicitly temporary restriction
It seems reasonable to suggest that advertising a solution for public use at a point in time when support is at <2 systems is not an ideal way to encourage an open ecosystem.
It’s an eminently ideal way, given that the overwhelming majority of Python packages come from GitHub. It would be unreasonable to withhold an optional feature just because that optional feature is not universal yet.
Again, I need people to let this sink in: Trusted Publishing is not tied to GitHub. You can use Trusted Publishing with GitLab, and other providers too. You are not required to produce attestations, even if you use Trusted Publishing. Existing GitLab workflows that do Trusted Publishing are not changed or broken in any way by this feature, and will be given attestation support in the near future. This is all documented.
> It would be unreasonable to withhold an optional feature just because that optional feature is not universal yet.
The "reasonability" of this is dependent on your goals. If an open ecosystem isn't a priority, then your statement is indeed correct.
Let this sink in: a "security" feature that depends on Trusted Publishing providers puts the developer at the mercy of a small set of Trusted Publishing providers, and for most people none of them are acceptable feudal lords.
Let this sink in: if it is possible to check attestations, attestations will be checked and required by users, and PyPI packages without them will be used less. Whether PyPI requires attestations is unimportant.
> PyPI packages without them will be used less. Whether PyPI requires attestations is unimportant.
This points to a basic contradiction in how people are approaching open source: if you want your project to be popular on a massive scale, then you should expect those people to have opinions about how you're producing that project. That should not come as a surprise.
If, on the other hand, you want to run a project that isn't seeking popularity, then you have a reasonable expectation that people won't ask you for these things, but then you also shouldn't want your package downloaded from PyPI as much as possible. When people do bug you for those things, explicitly rejecting them is (1) acceptable, and (2) should reduce the relative popularity of your project.
The combination of these things ("no social expectations and a high degree of implicit social trust/criticality") is incoherent and, more importantly, doesn't reflect observed behavior (people who do OSS as a hobby - like myself - do try and do the more secure things because there's a common acknowledgement of responsibility for important projects).
I don't use PyPI and only skimmed the docs. I think what you're saying here makes sense, but I also think others posting have valid concerns.
As a package consumer, I agree with what you've said. I would have a preference for packages that are built by a large, trusted provider. However, if I'm a package developer, the idea worries me a bit. I think the concerns others are raising are pragmatic because once a majority of developers start taking the easy path by choosing (ex) GitHub Actions, that becomes the de-facto standard and your options as a developer are to participate or be left out.
The problem for me is that I've seen the same scenario play out many times. No one is "forced" to use the options controlled by corporate interests, but that's where all the development effort is allocated and, as time goes on, the open source and independent options will simply disappear due to the waning popularity that's caused by being more complex than the easier, corporate-backed options.
At that point, you're picking platform winners because distribution by any other means becomes untenable or, even worse, forbidden if you decide that only attested packages are trustworthy and drop support for other means of publishing. Those platforms will end up with enormous control over what type of development is allowed. We have good examples of how it's bad for both developers and consumers too. Apple's App Store is the obvious one, but uBlock Origin is even better. In my opinion, Google changed their platform (Chrome) to break ad blockers.
I worry that future maintainers aren't guaranteed to share your ideals. How open is Open Solaris these days? MySQL? OpenOffice?
I think the development community would end up in a much stronger position if all of these systems started with an option for self-hosted, domain-based attestations. What's more trustworthy in your mind: 1) this package was built and published by ublockorigin.com, or 2) this package was built and published by GitHub Actions?
Can an impersonator gain trust by publishing via GitHub actions? What do the uninformed masses trust more? 1) an un-attested package from gorhill/uBlock, which is a user without a verified URL, etc. or 2) an attested package from ublockofficial/ublockofficial, which could be set up as an organization with ublockofficial.com as a verified URL?
I know uBlock Origin has nothing to do with PyPI, but it's the best example to make my point. The point being that attesting to a build tool-chain that happens to be run by a non-verifying identity provider doesn't solve all the problems related to identity, impersonation, etc.. At worst, it provides a false sense of trust because an attested package sounds like it's trustworthy, but it doesn't do anything to verify the trustworthiness of the source, does it?
I guess I think the term "Trusted Publisher" is wrong. Who's the publisher of uBlock Origin? Is it GitHub Actions or gorhill or Raymond Hill or ublockorigin.com? As a user, I would prefer to see an attestation from ublockorigin.com if I'm concerned about trustworthiness and only get to see one attestation. I know who that is, I trust them, and I don't care as much about the technology they're using behind the scenes to deliver the product because they have a proven track record of being trustworthy.
That said, I do agree with your point about gaining popularity and compromises that developers without an existing reputation may need to make. In those cases, I like the idea of having the option of getting a platform attestation since it adds some trustworthiness to the supply chain, but I don't think it should be labelled as more than that and I think it works better as one of many attestations where additional attestations could be used to provide better guarantees around identity.
Skimming the provenance link [1] in the docs, it says:
> It’s the verifiable information about software artifacts describing where, when and how something was produced.
Isn't who is responsible for an artifact the most important thing? Bad actors can use the same platforms and tooling as everyone else, so, while I agree that platform attestations are useful, I don't understand how they're much more than a verified (ex) "Built using GitHub" stamp.
To be clear, I think it's useful, but I hope it doesn't get mistakenly used as a way of automatically assuming project owners are trustworthy. It's also possible I've completely misunderstood the goals since I usually do better at evaluating things if I can try them and I don't publish anything to PyPI.
1. https://slsa.dev/spec/v1.0/provenance
> for most people none of them are acceptable feudal lords.
I really don't think this dramatic language is helping
This is exactly what I meant when I said "people that seem to have willfully not read any of the docs or your clarifying comments here"
Since roughly 2014 there have been so many rug pulls in the Python organization that no one believes anything any more.
It always starts small: "Oh, we just want everyone to be nice, so we have this super nice CoC document. Anyone who distrusts us is malicious."
Ten years later you have a corporate-backed inner circle that abuses the CoC to silence people, extend their power and earning potential and resort to defamation and libel.
It is possible that the people here who defend this corporate attestation-of-provenance-preferably-on-our-infrastructure scheme think that nothing nefarious will happen in the future. Well, given the repressive history of Python they are naive then.
It just takes pip to add a flag to enable "non-attested" packages. And of course they'll name it something like --allow-insecure-potentially-malicious.
They’re not encouraging open infrastructure. In fact, this whole design doesn’t even contemplate the possibility of self hosting. Trust must be blindly delegated to one of the existing partners.
> but as constructive feedback I would recommend softening the language and to replace:
I can soften it, but I think you're reading it excessively negatively: that warning is there to make sure people don't try to do the fiddly, error-prone cryptographic bits if they don't need to. It's a numerical fact that most project owners don't need that section, since most are either using manual API tokens or are publishing via GitHub Actions.
> Perhaps also add a few of the providers you listed as well?
They'll be added when they're enabled. Like I said in the original comment, we're using a similar enablement pattern as happened with Trusted Publishing: GitHub was enabled first because it represents the majority of publishing traffic, followed by GitLab and the others.
> GitHub being popular is a self-reinforcing process, if GitHub is your first class citizen for something as crucial as trusted publishing then projects on GitHub will see a higher adoption and become the de-facto "secure choice".
I agree, but I don't think this is PyPI's problem to solve. From a security perspective, PyPI should prioritize the platforms where the traffic is.
(I'll note that GitLab has been supported by Trusted Publishing for a while now, and they could make the publishing workflow more of a first class citizen, the way it is on GHA.)
> I agree, but I don't think this is PyPI's problem to solve. From a security perspective, PyPI should prioritize the platforms where the traffic is.
To me that's a bit of a weird statement: PyPI is part of the Python Software Foundation; isn't making sure that the project remains true to its open-source nature reasonable?
My concern is that these types of things ultimately play out as "we are doing the right thing to limit supply chain attacks", which is good and defensible, but in ~5 years PyPI will have an announcement that they are sunsetting PyPI package upload in favor of the trusted provider system. pip (or other tooling) will add warnings whenever I install a package that is not "trusted". Maybe I am simply pessimistic.
That being said we can agree to disagree, I am not part of the PSF and I did preface my first comment with "I guess I am an idealist".
> making sure that the project remains true to its open-source nature is reasonable?
What about this, in your estimation, undermines the open-source nature of PyPI? Nothing about this is proprietary, and I can't think of any sane definition of OSS in which PyPI choosing to verify OIDC tokens from GitHub (among other IdPs!) meaningfully subverts PyPI's OSS commitment.
> PyPI package upload in favor of the trusted provider system. pip (or other tooling) will add warnings whenever I install a package that is not "trusted". Maybe I am simply pessimistic.
Let me put it this way: if PyPI disables API tokens in favor of mandatory Trusted Publishing, I will eat my shoe on a livestream.
(I was the one of the engineers for both API tokens and Trusted Publishing on PyPI. They're complementary, and neither can replace the other.)
> What about this, in your estimation, undermines the open-source nature of PyPI?
Absence of support for self-hosting, in the spirit of freedom 0 = OSD 5&6? Or, for that matter, for any provider whose code is fully open source?
> Absence of support for self-hosting, or for that matter for any non-proprietary service?
This has nothing to do with self-hosting, whatsoever. You can upload to PyPI with an API token; that will always work and will not do anything related to Trusted Publishing, which exists entirely because it makes sense for large services.
PyPI isn't required to federate with the server in my basement through OpenID Connect to be considered open source.
I believe you that token uploads will continue to be possible, but it seems likely that in a couple of years trusted publishing & attestations will be effectively required for all but the tiniest project. You'll get issues and PRs to publish this way, and either you accept them, or you have to repeatedly justify what you've got against security.
And maybe that's a good thing? I'm not against security, and supply chain attacks are real. But it's still kind of sad that the amazing machines we all own are more and more just portals to the 'trusted' corporate clouds. And I think there are things that could be done to improve security with local uploads, but all the effort seems to go into the cloud path.
> I believe you that token uploads will continue to be possible, but it seems likely that in a couple of years trusted publishing & attestations will be effectively required for all but the tiniest project.
That's what I think will happen.
> And maybe that's a good thing? I'm not against security, and supply chain attacks are real.
The problem is the attestation is only for part of the supply chain. You can say "this artifact was built with GitHub Actions" and that's it.
If I'm using Gitea and Drone or self-hosted GitLab, I'm not going to get trusted publisher attestations even though I stick to best practices everywhere.
Contrast that with someone that runs as admin on the same PC they use for pirating software, has a passwordless GPG key that signs all their commits, and pushes to GitHub (Actions) for builds and deployments. That person will have more "verified" badges than me and, because of that, would out-compete me if we had similar looking projects.
The point being that knowing how part of the supply chain works isn't sufficient. Security considerations need to start the second your finger touches the power button on your PC. The build tool at the end of the development process is the tip of the iceberg and shouldn't be relied on as a primary indicator of trust. It can definitely be part of it, but only a small part IMO.
The only way a trusted publisher (aka platform) can reliably attest to the security of the supply chain is if they have complete control over your development environment which would include a boot-locked PC without admin rights, forced MFA with a trustworthy (aka their) authenticator, and development happening 100% on their cloud platform or with tools that come off a safe-list.
Even if everyone gets onboard with that idea it's not going to stop bad actors. It'll be exactly the same as bad actors setting up companies and buying EV code signing certificates. Anyone with enough money to buy into the platform will immediately be viewed with a baseline of trust that isn't justified.
As I understand it, the point of these attestations is that you can see what goes into a build on GitHub - if you look at the recorded commit on the recorded repo, you can be confident that the packages are made from that (unless your threat model is GitHub itself doing a supply chain attack). And the flip side of that is that if attestations become the norm, it's harder to slip malicious code into a package without it being noticed.
That's not everything, but it is a pretty big step. I don't love the way it reinforces dependence on a few big platforms, but I also don't have a great alternative to suggest.
Yeah, if the commit record acts like an audit log I think there’s a lot of value. I wonder how hard it is to get the exact environment used to build an artifact.
I’m a big fan of this style [1] of building base containers and think that keeping the container where you’ve stacked 4 layers (up to resources) makes sense. Call it a build container and keep it forever.
1. https://phauer.com/2019/no-fat-jar-in-docker-image/
Thank you for being the first person to make a non-conspiratorial argument here! I agree with your estimation: PyPI is not going to mandate this, but it’s possible that there will be social pressure from individual package consumers to adopt attestations.
This is an unfortunate double effect, and one that I’m aware of. That’s why the emphasis has been on enabling them by default for as many people as possible.
I also agree about the need for a local/self-hosted story. We’ve been thinking about how to enable similar attestations with email and domain identities, since PyPI does or could have the ability to verify both.
If there is time for someone to work on local uploads, a good starting point would be a nicer workflow for uploading with 2FA. At present you either have to store a long lived token somewhere to use for many uploads, and risk that it is stolen, or fiddle about creating & then removing a token to use for each release.
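One stopgap that already works today: keep the long-lived token in the OS keychain instead of a plaintext file. twine can look credentials up via keyring, so a one-time setup like this is enough (a sketch, assuming the third-party `keyring` package):

```python
import getpass
import keyring  # pip install keyring; twine queries this backend at upload time

keyring.set_password(
    "https://upload.pypi.org/legacy/",   # the upload endpoint twine uses
    "__token__",                         # username placeholder for API-token auth
    getpass.getpass("PyPI token: "),     # entered interactively, never written to a file
)
```

That doesn't help if the machine is stolen while unlocked, but it beats a token in a .txt file.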
> or you have to repeatedly justify what you've got against security.
The only reason I started using PyPI was because I had a package on my website that someone else uploaded to PyPI, and I started getting support questions about it. The person did transfer control over to me - he was just trying to be helpful.
I stopped caring about PyPI with the 2FA requirement since I only have one device - my laptop - while they seem to expect that everyone is willing to buy a hardware device or has a smartphone, and I frankly don't care enough to figure it out since I didn't want to be there in the first place and no one paid me enough to care.
Which means there is a security issue whenever I make a new package available only on my website should someone decide to upload it to PyPI, perhaps along with a certain something extra, since people seem to think PyPI is authoritative and doesn't need checking.
The 2FA requirement doesn't need a smartphone. You can generate the same one time passwords on a laptop. I know Bitwarden has this functionality, and there are other apps out there if that's not your cup of tea. Sorry that you feel pressured, but it is significantly easier to express a dependency on a package if it's on PyPI than a download on your own site.
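(For the curious: generating the codes on a laptop is a few lines, assuming the third-party `pyotp` package and the base32 secret PyPI displays during 2FA enrollment; the secret below is a placeholder.)

```python
import pyotp  # pip install pyotp

totp = pyotp.TOTP("JBSWY3DPEHPK3PXP")  # placeholder base32 secret from the 2FA setup page
print(totp.now())                      # the same 6-digit code a phone app would show
```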
Sure. But PyPI provides zero details on the process, I don't use 2FA for anything else in my life, no one is paying me to care, I find making PyPI releases tedious because I inevitably make mistakes in my release process, I have a strong aversion to centralization and dependencies[1][2].
I tell people to "pip install -i $MY_SITE $MY_PACKAGE". I can tell from my download logs that this is open to dependency confusion attacks as I can see all the 404s from attempts to, for example, install NumPy from my server. To be clear, the switch to 2FA was only the triggering straw - I was already migrating my packages off of PyPI.
Finally, I sell a source license for a commercial product (which is not the one which got me started with PyPI). My customers install it via their internally-hosted PyPI mirrors.
I provide a binary package with a license manager for evaluation purposes, and as a marketing promotion. As such, I really want them to come to my web site, see the documentation and licensing options, and contact me. I think making it easier to express as a dependency via PyPI does not help my sales, and actually believe the extra intermediation likely hinders my sales.
[1] I dislike dependencies so much that I figured out how to make a PEP 517 compatible version that doesn't need to contact PyPI simply to install a local package. Clearly I will not become a Rust developer.
[2] PyPI support depends on GitHub issues. I regard Microsoft as a deeply immoral company, and a threat to personal and national data sovereignty, which means I will not sign up for a GitHub account. When MS provides IT support for the upcoming forced mass deportations, I will have already walked away from Omelas.
Have you maybe documented what you have done, so that others who want to follow the same path can look up some information?
No, I haven't. The main idea is to create your own in-tree build backend, described at https://peps.python.org/pep-0517/#in-tree-build-backends .
In short, use "backend-path" to include a subdirectory which contains your local copies of setuptools, wheel, etc. Create a file with the build hooks appropriate for "backend-path". Have that those hooks import the actual hooks in setuptools. Finally, set your "requires" to [].
Doing this means taking on a support burden of maintaining setuptools, wheels, etc. yourself. You'll also need to include their copyright statements in any distribution, even though the installed code doesn't use them.
(As I recall, that "etc" is hiding some effort to track down and install the full list of packages dragged in, but right now I don't have ready access to that code base.)
Thanks for the info.
>> PyPI package upload in favor of the trusted provider system. pip (or other tooling) will add warnings whenever I install a package that is not "trusted". Maybe I am simply pessimistic.
> Let me put it this way: if PyPI disables API tokens in favor of mandatory Trusted Publishing, I will eat my shoe on a livestream.
Yeah, sure. But the parent poster also mentioned pip changes. Will you commit to eating a shoe if pip verifies attestations? Of course not; you know better than I do that those changes to pip are in the works and experimental implementations are already available. You must have forgotten to mention that. PyPI doesn't have to be the bad guy on making sure it's a PITA to use those packages. Insert Futurama "technically correct, the best kind of correct" meme.
> You must have forgotten to mention that.
Insinuating dishonesty is rude, especially when it’s baseless: the post linked in this one is explicit about experimenting with verification. There’s no point in signing without verification.
pip may or may not verify attestations; I don’t control what they do. But even if they do, they will never be able to mandate them for every project: as I have repeatedly said, there is no plan (much less desire) to mandate attestation generation.
Regardless of your intent, you didn't acknowledge the parent poster's concern about Python overall. You just exculpated PyPI.
I believe you would have received much less pushback if you hadn't been coy about the obviously impending quiet deprecation (for lack of a better phrase) that you finally acknowledged elsewhere in the thread.
There is no quiet deprecation. Literally nothing is deprecated in this announcement.
I was reaching for a descriptor and ended up riffing off 'quiet quitting' and 'quiet firing'.
I understand you're not deprecating or recommending anything, and have learned in the meantime just how ugly python infighting is in and around package tooling. I can see the motivation to keep external messaging limited to saying a feature was added and everything else remains constant.
I work with a bunch of normies that think Python packaging starts and ends at "pip install"-ing systemwide on Windows. One day in the future the maintainers of the packages they use will likely be encouraged to use this optional new feature (and/or they already have been publicly). Even later in the future those end users themselves might be encouraged by warnings or errors not to install packages without this extra feature.
PyPI has deprecated nothing. Bureaucrat Conrad, you are technically correct. The best kind of correct.
> I understand you're not deprecating or recommending anything, and have learned in the meantime just how ugly python infighting is in and around package tooling. I can see the motivation to keep external messaging limited to saying a feature was added and everything else remains constant.
You're insinuating an ulterior motive. Please don't do that; assume good faith[1].
(You can find trivial counterexamples that falsify this: PyPI was loud and explicit about deprecating and disabling PGP; there's no reason to believe it would be any less loud and explicit about a change here.)
[1]: https://news.ycombinator.com/newsguidelines.html
I'm with @belval on this one: it's OK to prioritize GitHub, but people who want to implement the standard with an alternative provider should not feel like they are doing something that may not be supported.
It kinda feels like that right now.
Again, to be clear: the standard does not stipulate GitHub or any other specific identity providers. The plan is to enable GitLab and the other Trusted Publisher providers in short order.
This is exactly the same as Trusted Publishing, where people accused the feature of being a MSFT trojan horse because GitHub was enabled first. I think it would behoove everybody to assume the best intentions here and remember that the goal is to secure the most people by default.
I think the point is that this needs to be made clearer in the official docs from the get go.
It's said explicitly in the second sentence in the usage docs[1].
> Attestations are currently only supported when uploading with Trusted Publishing, and currently only with GitHub-based Trusted Publishers. Support for other Trusted Publishers is planned. See #17001 for additional information.
[1]: https://docs.pypi.org/attestations/producing-attestations/
At this point it should be fairly obvious: if you have to defend the phrasing in multiple threads here on HN, get some folks to help rephrase the current document instead, so you can comment with "we updated the text to make it clear this is a first pass and more options are getting added to the doc soon".
If you draw an ugly cat, and someone tells you it's ugly, it doesn't matter how much you insist that it isn't, and the same is true for docs. It doesn't matter what your intention was: if people keep falling over the same phrasing, just rephrase it. You're not your writing, it's just text to help support your product, and if that text is causing problems just change it (with the help of some reviewers, because it's clear you think this is phrased well enough, but you're not the target audience for this document, and the target audience is complaining).
Anyone can run an OIDC system if they want. But PyPI is not under an obligation to trust an OIDC provider running on a random rpi3 in your basement. More than that, GitHub is "trusted" because we can be pretty sure they have an on-call staff to handle incidents, that they can reliably say "This token was provided on behalf of this user at this time for this build", etc.
Even if you standardized the more technical parts like OIDC claim metadata (which is 100% provider specific), it wouldn't really change the thrust of any of this — PyPI is literally trusting the publisher in a social sense, not in some "They comply with RFC standards and therefore I can plug in my random favorite thing" sense.
This whole discussion is basically a non-issue, IMO. If you want to publish stuff from your super-duper-secret underground airgapped base buried a mile underneath the Himalayas, you can use an API token like you have been able to. It will be far less hassle than running your own OIDC solution for this stuff.
If I can't build on a rpi3 in my basement and am forced to use GitHub that's exactly against the spirit of open source
You still can. You just use an API token with PyPI.
Please improve your reading comprehension. I swear, this website is embarrassing sometimes. You can still do this with an API token. You can upload from a C64 with an API token. What you cannot do is run some random OIDC provider on your random useless domain and have PyPI magically respect it and include it as part of the Trusted Publishers program. There is no point in it, because the program itself is constrained by design: it only provides any benefit at "large scale." Your random dumb server providing a login for you alone does not provide any benefits over you just using an API token.
Any pathway to provide trusted attestations for random individual Hacker News users like yourself will, in fact, require a different design.
> error-prone cryptographic bits if they don't need to
They can't. Because you wouldn't accept their key anyway.
I think you're missing something. The key in question is a short-lived ECDSA key that lives inside a publishing workflow and is destroyed after signing; neither GitHub nor the Sigstore CA generates a signing key for you.
PyPI will accept any key bound to an identity, provided we know how to verify that identity. Right now that means we accept Trusted Publishing identities, and GitHub identities in particular, since that's where the overwhelming majority of Python package publishing traffic comes from. Like what happened with Trusted Publishing, this will be expanded to other identities (like GitLab repositories) as we roll it out.
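(To make the "short-lived key" point concrete, this is the shape of the operation, sketched with the `cryptography` package; it is not PyPI's or Sigstore's actual code:)

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec

# generated inside the publishing workflow; never written to disk
ephemeral_key = ec.generate_private_key(ec.SECP256R1())
signature = ephemeral_key.sign(b"sha256 digest of the artifact", ec.ECDSA(hashes.SHA256()))
# the private key simply goes out of scope here; what survives is the signature
# plus a short-lived certificate binding the public key to the CI identity
```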
How does pypi know I'm not github? Because I can sign with my keys and not with github's key.
Never mind all the low-level details of the temporary keys and hashes and all of that. This is a high-level comment, not a university book about security.
It requires an OIDC IdP, though… with PGP, I can verify that my identity is constant, but now, I’m reliant on some chosen third-party. And it has to keep being the same one, to boot! I don’t like the lock-in this causes, and I definitely don’t like the possibility of my right to publish being revoked by a third party.
I’m only now learning about what OIDC IdP is (for those like me: openid connect identity provider). But from my reading, a self hosted gitlab can function as an oidc idp.
That would be enough control, right?
You can't use a self-hosted Gitlab because you can only use a "trusted publisher".
There's no hard technical reason for that. It's mostly that PyPI only want to trust certain issuers who they think will look after their signing keys responsibly.
Sounds like PyPI have been corrupted? Hopefully The Foundation can remind them of their mission.
Good luck with that…
There is a technical reason for it, and it’s explained in an adjacent thread. Accepting every single small-scale IdP would result in a strictly worse security posture for PyPI as a whole, with no actual benefit to small instances (who are better off provisioning API tokens the normal way instead of using Trusted Publishing).
I don't think that's a technical reason, that's more of a social reason about trust. Adding configurable JWT issuers is not technically hard or difficult to do in a way that doesn't compromise PyPI's inherent security. There's a configuration difficulty, but those using it will generally know what they're doing and it only compromises their own packages (which they can already compromise by losing tokens etc.)
I disagree with your assessment that provisioning API tokens is better than being able to authenticate with a JWT. It makes managing credentials in an organisation much easier as far fewer people need access to the signing key compared to who would need access to the API token. Using asymmetric keys also means there's less opportunity for keys to be leaked.
Adding another IdP is not the hard part; establishing the set of claims that IdP should be presenting to interoperate with PyPI is. The technical/social distinction is specious in this context: the technical aspects are hard because the social aspects are hard, and vice versa.
If you work in a large organization that has the ability to maintain a PKI for OIDC, you should open an issue on PyPI discussing its possible inclusion as a supported provider. The docs[1] explain the process and requirements.
[1]: https://docs.pypi.org/trusted-publishers/internals/
> who are better off provisioning API tokens the normal way
As long as those packages get digital attestation, perhaps attested by PyPI itself post-upload or from a well-known user provided key similar to how GPG worked but managed by PyPI this time.
Surely you see how this is creating two classes of packages, where the favored one requires you use a blessed 3rd party?
No, I don't. There's no plan, and there will be no plan, to make installers reject packages that don't have attestations. Doing so is technically and socially impossible, and undesirable to boot.
The strongest possible version of this is that projects that do provide attestations will be checked by installers for changes in identity (or regression in attestations). In other words, this feature will only affect packages that opt to provide attestations. And even that is uncertain, since Python packaging is devolved and neither I (nor anybody else involved in this effort) has literally any ability to force installers to do anything.
"Ordinary user" here.
It comes across like the goal of this system is to prove to my users a) who I am; b) that I am working in cooperation with some legitimate big business. I don't understand why I should want to prove such things, nor why it's desirable for b) to even be true, nor why my users should particularly care about a) at all, whether I can prove it or not.
I think my users should only care that the actual contents of the sdist match, and the built contents of the wheel correspond to, the source available at the location described in the README.
Yes, byte-for-byte reproducible builds would be a problem for those who don't develop pure Python. Part of why sdists exist is so that people who can't trust the wheel, like distro maintainers, can build things themselves. And really, why would a distro maintainer take "I can cryptographically prove the source came from zahlman, and trust me, this wheel corresponds to the source" from $big_company any more seriously than "trust me, this wheel corresponds to the source" directly from me?
If I have to route my code through one of these companies ("Don't worry, you can work with Google instead!" is not a good response to people who don't want to work with Microsoft) to gain a compliance checkmark, and figure out how someone else's CI system works (for many years it was perfectly possible for me to write thousands of lines of code and never even have an awareness that there is such a thing as CI) so that I can actually use the publishing system, and learn what things like "OIDC IdP" are so that I can talk to the people in charge...
... then I might as well learn the internal details of how it works.
And, in fact, if those people are suggesting that I shouldn't worry about such details, I count that as a red flag.
I thought open source was about transparency.
There's two alternate visions: anybody can work with anybody they trust, and Only BigCo Can Protect You.
We know the first falls apart because the Web of Trust fell apart. When the UI asks people "hey, this is new, here's the fingerprint, do you trust it?", nobody actually conducts review, they just mindlessly click "yes" and move on. Very, very, very few people will actually conduct full review of all changes. Even massive organizations that want to lie to themselves that they review the changes, end up just outsourcing the responsibility to someone who is willing to legally accept the risk then they go ahead and just mindlessly click "yes" anyway. There are just too many changes to keep track of and too many ways to obscure malicious code.
We know the second falls apart because human beings are both the inevitable and ultimate weakest link in even the robustest security architectures. Human beings will be compromised and the blast radius on these highly centralized architectures will be immense.
Pick your poison.
> And, in fact, if those people are suggesting that I shouldn't worry about such details, I count that as a red flag.
Nobody is suggesting that. There's an annotation on the internals documentation to make sure that people land on the happy path when it makes sense for them. If you want to learn about the internals of how this works, I heartily encourage you to. I'd even be happy to discuss it over a call, or chat, or any other medium you'd like: none of this is intended to be cloak-and-dagger.
(You can reach this conclusion abductively: if any of this was intended to be opaque, why would we write a public standard, or public documentation, or a bunch of high-profile public announcements for it?)
> Part of why sdists exist is so that people who can't trust the wheel, like distro maintainers, can build things themselves.
Some might even expect PyPI to build everything going into its repo, similar to a distro building for its repos. As far as I can tell, all this 'attestation' is just PyPI making Microsoft pay for the compute time, with extra steps.
Except that for Trusted Publishing too, they only allowed GitHub in the beginning and eventually added a couple of other providers. But if you're not Google or Microsoft you won't be added.
These kinds of comments are borderline mendacious: you can observe, trivially, that 50% of the Trusted Publishers currently known to PyPI are neither Google nor Microsoft controlled[1].
If PyPI accepts two more likely ones, a full 2/3rds will be unrelated to GitHub.
[1]: https://docs.pypi.org/trusted-publishers/adding-a-publisher/
Ping me when one of them will be an open source entity rather than a company.
https://docs.pypi.org/trusted-publishers/internals/#how-do-i...
Wow. I get to choose one from a total of FOUR large corporations! Amazing openness!
Once again: this is constrained by design. If you don’t want to use OpenID Connect, just create a token on PyPI and publish the normal way. You are not, and will never be, required to use this feature.
Wow, you can use a whole two other providers from your list: Gitlab and ActiveState. Color me unimpressed.
Why does this need to allowlist CI providers in first place? Why not publish an open interface any CI provider can integrate against?
Because the security benefit of Trusted Publishing via OIDC versus normal API tokens is marginal at small scales, in two senses:
1. The primary benefit of Trusted Publishing over a manual API token is knowing that the underlying OIDC IdP has an on-call staff, proper key management and rotation policies, etc. These can be guaranteed for GitHub, GitLab, etc., but they're harder to prove for one-off self-hosted CI setups. For the latter case, the user is no better off than they would be with a manual API token, which is still (and will always be) supported.
2. If the overwhelming majority of traffic comes from a single CI/CD provider, adding more code to support generic OIDC IdPs increases PyPI's attack surface for only marginal user benefit.
There also is no "open interface" for PyPI to really use here: this is all built on OIDC, but each OIDC provider needs to have its unique claims mapped to something intelligible by PyPI. That step requires thoughtful, manual, per-IdP consideration to avoid security issues.
I still think this is overly strict. Supporting arbitrary OIDC providers is not excessively complex or particularly rare, the major cloud providers all support it in one way or another [1][2][3], as does Hashicorp Vault [4]. I disagree that the primary benefit over a manual API token is _knowing_ that the OIDC IdP is following the best practices you talk about. Having it rely on asymmetric keys makes the process more secure and scalable than API tokens for those that choose to use it.
I think there's a separate question around trust. But I think blocking non-trusted publishers from using a more secure form of authentication isn't the answer. Instead I think it makes more sense to use nudges in the PyPI UI and eventually of consumers (e.g. pip) to indicate that packages have come from non-trusted publishers.
[1] https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_pr... [2] https://learn.microsoft.com/en-us/graph/api/resources/federa... [3] https://cloud.google.com/identity-platform/docs/web/oidc [4] https://developer.hashicorp.com/vault/docs/auth/jwt
I can create a github account. How does that make me trustworthy? How being able to create a github account prevents me from adding a virus in my module?
It's not about the package maintainer, it's about the trustworthiness of the OIDC issuer to prove the identity of a user.
A poorly maintained issuer could leak their secret keys, allowing anyone to impersonate any package from their service.
But what use does it serve to prove that I am user "qioaisjqowihjdoaih" on github?
I mean it only proves I authenticated successfully. Nothing else.
It proves that a package was definitely uploaded from the correct repo.
Without trusted publishers a nefarious actor could use a leaked PyPI API key to upload from anywhere. If the only authorised location is actions on a specific Github repo then it makes a supply chain attack much trickier and much more visible.
With the new attestations it's now possible for package consumers to verify where the package came from too.
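(As a sketch of what that consumer-side check can look like today with sigstore-python's CLI; the identity string is illustrative and depends on the publishing workflow, and the attestation bundle is assumed to be saved next to the wheel:)

```
sigstore verify identity \
  --cert-oidc-issuer https://token.actions.githubusercontent.com \
  --cert-identity "https://github.com/example-org/example-project/.github/workflows/release.yml@refs/tags/v1.0.0" \
  example_project-1.0.0-py3-none-any.whl
```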
But… a github token could leak just as easily?
I think I would be better off with API key + PGP than API key alone. And that’s being phased out?
You can no longer upload a PGP signature to PyPI, if that's what you mean. That was phased out last year (to virtually no complaint since nobody was actually verifying any of the signatures, much less attempting to confirm that their keys were discoverable[1]).
[1]: https://blog.yossarian.net/2023/05/21/PGP-signatures-on-PyPI...
> to virtually no complaint since nobody was actually verifying any of the signatures
And this is in no way a consequence of PyPI no longer hosting public keys, right? Tell the whole story at least… Say that there used to be a way to verify the signatures, but you dropped it years ago, and since then the signatures have been useless.
If it did, it was well before I ever began to work on PyPI. By the time I came around, PGP signature support was vestigial twice over and the public key discovery network on the Internet was rapidly imploding.
(But also: having PyPI be the keyserver defeats the point, since PyPI could then trivially replace my package's key. If that's the "whole story," it's not a very good one.)
This attestation doesn’t change a ton with that, though. The point is to provide chain of custody — it got to my computer, from pypi, from ???. The PGP signature, much like a self-signed android app, verifies that it continues to be the same person.
The critical difference with this architecture is that it doesn’t require key discovery or identity mapping: those are properties of the key infrastructure, similarly to the Web PKI.
Your self-signed app analogy is apt: self-signing without a strong claimant identity proof is a solution, but a much weaker one than we wanted.
It was a plural you. I have no idea who you personally are.
> [...] the user is no better off than they would be with a manual API token, which is still (and will always be) supported.
This is good to know. I did not see related statements in any of the documents linked in this discussion, though.
I am not sure why my comment above is downvoted -- if you know where the perpetual optionality of digital attestations is officially stated, please, provide a link.
Because every CI/CD provider has a different set of claims and behaviors that would constitute a "secure" policy for verification. If there were one singular way to do that then we could, but there isn't yet, so PyPI needs to onboard providers piecemeal. The work to add a new provider is not massive; the reason there are not tons of providers isn't that the work is hard, but rather that people are voting with their feet, so GitHub and GitLab make sense as initial providers to support.
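To illustrate why that onboarding is per-provider, here's a toy sketch. The claim names below come from GitHub's and GitLab's documented OIDC tokens, but the exact value formats and the policy shape are made up for illustration; PyPI's real verification logic is more involved.

```python
# Toy illustration: each IdP exposes different claims, so "may this
# workflow publish project X?" needs a provider-specific policy.
POLICIES = {
    "github": lambda c: (
        c.get("repository") == "org/project"
        and c.get("workflow_ref", "").startswith("org/project/.github/workflows/release.yml@")
    ),
    "gitlab": lambda c: (
        c.get("project_path") == "group/project"
        and c.get("ci_config_ref_uri", "").startswith("gitlab.com/group/project//.gitlab-ci.yml@")
    ),
}

def allowed(provider: str, claims: dict) -> bool:
    policy = POLICIES.get(provider)
    return policy is not None and policy(claims)
```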
That's good. IIRC the feature was launched with GitHub as the only option, which is the same state of affairs of https://jsr.io/ right now.
IMO the launch should have been delayed until there was more than one trusted publisher.
Even more, the previous way was to use GPG signatures, which were recently deprecated and removed. So you don't really have a choice.
>Where the only official workflow is "Use GitHub Actions".
Well you can do it manually with other solutions... as long as they are one of the four trusted publishers (see "Producing attestations manually does not bypass (...) restrictions on (...) Trusted Publishers":
https://docs.pypi.org/trusted-publishers/adding-a-publisher/...
This means that you literally can't do it manually; you have to rely on one of:
* GitHub
* Google Cloud
* ActiveState (I'm not familiar with it)
* GitLab.com (not just any GitLab instance, only that one)
Really surprising development, IMO.
It looks a lot like reinventing the wheel, but as an octagon.
On the one hand, you are totally right, GitHub Actions are the VS Code of automation. People choose them because they are broke, they work well enough, and the average person chooses things based on how they look. GitHub Actions looks easy, it's cozy, it's comfy; VS Code looks like a code editor; everything has to be cozy and comfy.
On the other hand, considering all of that, you can see why Python would arrive at this design. They are the only other people besides NPM who regularly have supply chain attack problems. They are seemingly completely unopinionated about how to fix their supply chain problems, while being opinionated about the lack of opinions about packaging in general. What is the end goal? Presumably one of the giant media companies, Meta, Google or Microsoft maybe, has to take over the development of the runtime and the PEP process, but does that even sound good?
I think the end goal is to only allow GitHub/Google/whatever accounts to publish code on PyPI, so importing whatever will be USA-sanctions-safe.
The burden to ban Russians/Koreans/Iranians will be on those big companies, and PyPI will be able to claim they respect the rules without having the resources themselves to ban accounts from sanctioned countries.
So in my opinion, in the future tooling will have some option to check whether the software was uploaded via Microsoft et al. and can be considered unsanctioned in the USA.
Or they might just say that's how you upload now and that's it.
> so importing whatever will be USA-sanctions safe...
You are talking about guys who can barely figure out how to structure a Python package repository. They are not doing this kind of 4D chess.
And anyway, none of that makes any sense.
The people paying for this are the ones doing it, not them.
None of this is true. There is no plan to disable API token uploads to PyPI or to mandate Trusted Publishing, much less attestations. It's entirely intended to be a misuse-resistant way to upload to the index in the "happy case" where the package uses a CI/CD workflow that supports OpenID Connect.
(I'm disappointed by the degree to which people seem to gleefully conspiracize about Python packaging, and consequently uncritically accept the worst possible interpretation of anything done to try and improve the ecosystem.)
> (I'm disappointed by the degree to which people seem to gleefully conspiracize about Python packaging, and consequently uncritically accept the worst possible interpretation of anything done to try and improve the ecosystem.)
I sympathize, but there are obvious reasons for this: low trust in the Python packaging system generally (both because of recent PSF activity and because of the history of complications and long-standing problems), and the fact that many similar things have been seen in the Linux and open-source worlds lately (for one example, Linux Mint's new stance on Flatpak and how they present Flatpak options in the Software Manager). (Edit: it also kinda rhymes with the whole "Secure Boot" thing and how much trouble it causes for Linux installation.)
I personally expect people to have better compartmentalization skills than this. It's both unreasonable and inappropriate to have a general chip on your shoulder, and consequently assume a negative interpretation of an unrelated security effort.
It happens that people don't trust you if you are being obviously dishonest.
I still have to understand how the github credentials stored on my computer are harder to steal than the pypi credentials stored on the very same computer.
If you can explain this convincingly, maybe I'll start to believe some of the things you claim.
> I still have to understand how the github credentials stored on my computer are harder to steal than the pypi credentials stored on the very same computer.
This is not, and has never been, the argument for Trusted Publishing. The argument is that temporary, automatically scoped credentials are less dangerous than permanent, user scoped credentials, and that an attacker who does steal them will struggle to maintain persistence or pivot to other scoped projects.
The documentation is explicit[1] about needing to secure your CI workflows, and treating them as equivalent to long-lived API tokens in terms of security practices. Using a Trusted Publisher does not absolve you of your security obligations; it only reduces the scope of failures within those obligations.
[1]: https://docs.pypi.org/trusted-publishers/security-model/
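For anyone wondering what that exchange actually looks like: a condensed sketch along the lines of PyPI's "manual way" docs, assuming a GitHub Actions job (the `ACTIONS_ID_TOKEN_REQUEST_*` variables GitHub injects) and the `requests` library. Other providers expose their OIDC tokens differently.

```python
import os
import requests

# 1. Ask the CI provider's IdP for a short-lived OIDC token addressed to PyPI.
resp = requests.get(
    os.environ["ACTIONS_ID_TOKEN_REQUEST_URL"],
    params={"audience": "pypi"},
    headers={"Authorization": f"bearer {os.environ['ACTIONS_ID_TOKEN_REQUEST_TOKEN']}"},
)
oidc_token = resp.json()["value"]

# 2. Trade it for a PyPI API token that expires within minutes and is
#    scoped to the project(s) configured for this Trusted Publisher.
mint = requests.post("https://pypi.org/_/oidc/mint-token", json={"token": oidc_token})
api_token = mint.json()["token"]  # usable as a normal twine token, briefly
```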
> The argument is that temporary, automatically scoped credentials are less dangerous than permanent, user scoped credentials
Nobody forced you to implement never-expiring global tokens in PyPI…
Indeed. I have to wonder: if being able to authenticate to GitHub, upload one's code there, and then coordinate a transfer from there to PyPI through the approved mechanisms is good enough... then why can't PyPI just use the same authentication method GitHub does?
It seems hard to justify "we want to ensure the build is reproducible" when there is no system to attempt the build on PyPI's end and the default installer is perfectly happy to attempt the build (when no prebuilt version is available) on the user's machine - orchestrated by arbitrary code, and without confirmation as a default - without a chance to review the downloaded source first.
It seems hard to justify "we want to limit the scope of the authentication" when "a few minutes" is more than enough time for someone who somehow MITMed a major package to insert pre-prepared malware, and access to a single package from a major project affects far more machines than access to every repo of some random unimportant developer.
(The wheel format has hashes, but the sdist format doesn't. If these sorts of attacks aren't meaningfully mitigated by wheel hashes, what is the RECORD file actually accomplishing? If they are, shouldn't sdists have them too?)
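For what it's worth, the RECORD hashes are checkable with nothing but the standard library; a minimal sketch (assuming a well-formed wheel):

```python
import base64, csv, hashlib, zipfile

def verify_record(wheel_path: str) -> list[str]:
    """Return the paths inside a wheel whose contents don't match RECORD."""
    mismatches = []
    with zipfile.ZipFile(wheel_path) as whl:
        record = next(n for n in whl.namelist() if n.endswith(".dist-info/RECORD"))
        for path, digest, _size in csv.reader(whl.read(record).decode().splitlines()):
            if not digest:  # RECORD lists itself without a hash
                continue
            algo, _, expected = digest.partition("=")
            actual = base64.urlsafe_b64encode(
                hashlib.new(algo, whl.read(path)).digest()
            ).rstrip(b"=").decode()
            if actual != expected:
                mismatches.append(path)
    return mismatches
```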
My github credentials are all stored in hardware devices.
Yes a computer is a hardware device.
I don't think you can predict the future any more than me.
I don't baselessly speculate about it. But yes, I do think I can predict with some confidence that PyPI is not going to remove API token access.
I understand the desire to raise awareness about your fringe concern. But it makes no sense.
Funnily enough, I'm giving a talk about the evolution of PyPI's security at the upcoming MiniDebConf this weekend.
I guess I'll have to update my slides :D
But, well, as a Debian developer my advice is to just use Debian and completely ignore PyPI, so I might be slightly biased.
Except that is impractical at best. The Debian (and Ubuntu) Python packaging groups have been overwhelmed since Python 2.7; there is no way the Debian packaging ecosystem can reasonably cover Python packages. And there is more: the Debian system does what it must do, ensuring the system Python has integrity with major software stacks and the OS's use of Python. Super! But that is not at all what users of Python need. As you know, Python has passed JavaScript as the most used language on GitHub. There is tremendous pace and breadth to several important Python uses, with their libraries.
Please reconsider this position with Debian and python packaging.
> Please reconsider this position with Debian and python packaging.
Just FYI, I'm not sending a SWAT team to your home to force you to follow my advice.
Does a .deb file even allow for installing for an arbitrary Python (i.e. not the system installation)? Am I supposed to find them through Apt and hope the Debian version of the package has a `python3` prefix on the name? Is that going to help if there are dependencies outside the Apt system?
I think you're being overly critical. When it says that it adds support for Trusted Publishers, it links directly to this page: https://docs.pypi.org/trusted-publishers/.
This page clearly explains how this uses OIDC, and uses GitHub Actions as an example. At no point in my read did I feel like this was shilling me a microsoft product.
Try to run your own gitlab instance and use this to publish. See what happens…
At this moment this only supports github.
I might be reading this wrong but doesn't it include four workflows, including Gitlab? The "manual way" section is grouped under the Github tab, which indicates to me that it's only explaining how the Github-specific workflow works?
https://docs.pypi.org/trusted-publishers/creating-a-project-...
I was referring to the "Get Started" section that links to a producing attestation page that can be found here: https://docs.pypi.org/attestations/producing-attestations/#t....
> Support for automatic attestation generation and publication from other Trusted Publisher environments is planned. While not recommended, maintainers can also manually generate and publish attestations.
You're on a slightly different page: Trusted Publishing supports four different providers, but this initial enablement of attestations (which are built on top of Trusted Publishing) is for GitHub only. The plan is to enable the others in short order.
I think y’all are being overly aggressive against what, from an outside perspective, looks like an MVP targeting the majority use case.
If the devs’ claim lower in the comments that other support is soon to be added doesn’t pan out, then sure, this is a valid argument, but it feels very premature and ignores the reality of launch expectations for most agile teams.
While I understand that, a deeper problem is that their main competitor (GitLab) is too busy to implement the fad of the day. GitLab's CI/CD is already an unholy mess of scripting in YAML, and there is no good way to provide encapsulated pieces of code to GitLab. GitHub has Actions, which have defined inputs and outputs; GitLab has nothing comparable. That makes it very difficult to provide functionality for GitLab.
Is it really that hard to provide a GitLab workflow?
AFAIU it'd be enough to have a Docker image and use that in a standalone workflow, which is a few lines of YAML in both GitHub and GitLab; what would the problem be with that?
CI/CD components and steps.
And the in-detail explanation is full of references to specific third-party services, like Fulcio and SigStore...
I'm not really convinced of the value of such attestations until a second party can reproduce the build themselves on their own hardware.
Putting aside the fact that the mechanisms underpinning Github Actions are a mystery black box, the vast vast vast majority of github workflows are not built in a reproducible way - it's not even something that's encouraged by Github Actions' architecture, which emphasises Actions' container images that are little more than packaged installer scripts that go and download dependencies from random parts of the internet at runtime. An "attestation" makes no guarantee that one of these randomly fetched dependencies hasn't been usurped.
This is not to go into the poor security record of Github Actions' permissions model, which has brought us all a number of "oh shit" moments.
Fully reproducible builds would of course be nicer from a security standpoint, but attestations have vastly lower implementation costs and scale much better while still raising the bar meaningfully.
After 2FA, the previous PyPI buzzword that was forced on everyone, JFrog discovered a key leak that compromised everything:
https://news.ycombinator.com/item?id=40941809
JFrog also discovered multiple malicious package exploits later.
Now we get a new GitHub-centric buzzword that could be replaced by trusted SHA256 sums. Python is also big on business speak like SBOM. The above key leak of course occurred after all these new security "experts" manifested themselves out of nowhere.
The procedure remains the same. Download a package from the original creators, audit it, use a local repo and block PyPI.
Hello! I believe I'm one of the "manifested" security experts you're hinting at :)
Good security doesn't demand perfection, that's why security is both prevention and preparedness. The response from our admin was in every way beyond what you'd expect from many other orgs: prompt response (on the order of minutes), full audit of activity for the credential (none found), and full public disclosure ahead of the security researcher's report.
> JFrog also discovered multiple malicious package exploits later.
If you're referencing malicious packages on PyPI then yes! We want to keep PyPI freely open to all to use, and that has negative knock-on effects. Turns out that exploiting public good code repositories is quite popular, but I maintain that the impact is quite low and that our ability to respond to these sorts of attacks is also very good due to our network of security engineers and volunteers who are triaging their reports. Thanks to the work of Mike Fiedler (quarantining packages, API for reporting malicious packages, better UI for triagers) our ability to respond to malicious packages will become even better.
> Now we get a Github centric new buzzword that could be replaced by trusted SHA256 sums.
In a way, this feature is what you're describing, but easier to automate (therefore: good for you as a user) and more likely to be correct, because every attestation is verified by PyPI before it's made available to others (which is also good for users). The focus on GitHub Actions is because this is where many Python projects publish from; there is little incentive to create a feature that no one will use.
> Python is also big on business speak like SBOM.
Indeed, there is legislation in many places that will require SBOMs for all software placed in their markets so there is plenty of interest in these standards. I'm working on this myself to try to do the most we can for users while minimizing the impact this will have on upstream open source project maintainers.
>In a way, this feature is what you're describing but is easier to automate (therefore: good for you as a user) and is more likely to be correct because every attestation is verified by PyPI before it's made available to others (which is also good for users).
How long should I expect it to take until I can automatically generate an attestation from `twine`? Or does someone else have to sign off on it through some OpenID mumbo-jumbo before I can qualify as "trusted"?
Automating the creation of SBOMs sounds even further out, since we're still struggling with actually just building sdists in the first place.
> Download a package from the original creators
Don't download packages from PyPI, go upstream to the actual source code on GitHub, audit that, build locally, verify your build is the same as the PyPI one, check the audits people have posted using crev, decide if you trust any of them, upload your audit to crev too.
https://reproducible-builds.org/ https://github.com/crev-dev/
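The "verify your build is the same" step reduces to a digest comparison, since PyPI's JSON API publishes a sha256 per release file. A small sketch (project name and filenames are placeholders; the comparison is only meaningful if the build is actually reproducible bit-for-bit):

```python
import hashlib, json, pathlib, urllib.request

def pypi_sha256(project: str, version: str, filename: str) -> str:
    """Look up the sha256 PyPI records for one release file (JSON API)."""
    url = f"https://pypi.org/pypi/{project}/{version}/json"
    with urllib.request.urlopen(url) as resp:
        meta = json.load(resp)
    return next(u["digests"]["sha256"] for u in meta["urls"] if u["filename"] == filename)

local = hashlib.sha256(pathlib.Path("dist/example-1.0-py3-none-any.whl").read_bytes()).hexdigest()
assert local == pypi_sha256("example", "1.0", "example-1.0-py3-none-any.whl")
```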
After reading the underlying report (https://jfrog.com/blog/leaked-pypi-secret-token-revealed-in-...), I can't help but think: "where is the defense in depth?" Since `.pyc` files are just a cache of compilation that's already generally pretty quick, this could have been prevented by systems that simply didn't allow for pushing them into the Docker image in the first place. Or by having `PYTHONDONTWRITEBYTECODE=1` set on the developer's machine.
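(For reference, the per-process equivalent of that environment variable is one line; either one keeps stray `.pyc` caches, and anything that leaked into them, from being swept up by a later `COPY . /app`:)

```python
import sys

# Same effect as exporting PYTHONDONTWRITEBYTECODE=1: CPython stops
# writing .pyc bytecode caches for modules imported by this process.
sys.dont_write_bytecode = True
```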
(Also, now I'm trying to wrap my head around the fact that there's such a thing as "Docker Hub" in the first place, and that people feel comfortable using it.)
> now I'm trying to wrap my head around the fact that there's such a thing as "Docker Hub" in the first place
Unless you build all of your images `FROM scratch` by default (or use in-house registries or quay or whatnot for all of your base images), you've used Docker Hub too...
Yeah at work we build our images from scratch of course.
From the [docs](https://docs.pypi.org/trusted-publishers/internals/):
> Reliability & notability: The effort necessary to integrate with a new Trusted Publisher is not exceptional, but not trivial either. In the interest of making the best use of PyPI's finite resources, we only plan to support platforms that have a reasonable level of usage among PyPI users for publishing. Additionally, we have high standards for overall reliability and security in the operation of a supported Identity Provider: in practice, this means that a home-grown or personal use IdP will not be eligible.
From what I can tell, this means using self-hosted GitHub/GitLab/Gitea/whatever instances is explicitly not supported for most projects. I can't really tell what "a reasonable level of usage among PyPI users" means (does invent.kde.org qualify? gitlab.gnome.org? gitlab.postmarketos.org?) but I feel like this means "GitLab.com and maybe gitea.com" based on my original reading.
Of course by definition PyPI is already a massive single point of failure, but the focus on a few (corporate) partners to allow secure publishing feels like a mistake to me. I'm not sure what part of the cryptographic verification process restricts this API to the use of a few specific cloud providers.
They even got some public money from Germany's Sovereign Tech Fund to couple uploads with gigantic USA companies.
This probably deserves a criminal investigation, since it appears the funds may have been misused?
Well done guys! Good job!
You can count me among those that are suspicious that this is a frog-boiling step, but it doesn't appear to me that STF money went to this, from https://www.sovereign.tech/tech/python-package-index#what-ar...
Maybe there is a case to be made for STF to fund making Codeberg (a German-headquartered organization) one of the PyPI trusted hosts. If Codeberg were supported, that would go a long way to addressing fears. And conversely, if Codeberg can't meet PyPI's bar, that suggests complete commercial capture.
Yes I agree. It'd be nice if codeberg got some funding and could be used to upload stuff in the same way github can.
Then PyPI wouldn't just be a means to the end of keeping people on GitHub rather than going to open platforms.
Thanks for pointing this out! In general, the sovereign tech fund should not fund Python or the PSF, since the administration is U.S. dominated and the whole organization is captured by U.S. corporations.
Unpaid developers from Europe (and Asia) just serve as disposable work horses for the U.S. "elites".
I'm just sad I didn't think to write this earlier and now it won't be seen by many.
Supply chain security is very important, and this seems like an important step. Seems absolutely essential that something like the Mozilla foundation, or EFF, or some other open-source friendly entity help provide such a service, instead of corralling users into companies with exploitative business models.
I am in no hurry to be pushed into using Github, Gitlab or whatever else. Programmer's open source code has been monetized by these companies to feed AI LLM beasties, and it's fundamentally objectionable to me. I self-host my code using Gitea for that reason.
Congratulations to everyone involved in this work! This is an incredible building block for not just supply-chain security both inside and downstream of the PyPI ecosystem but also for data analysis of open source projects (through strong linkage back to source repositories). Thank you all <3
Here is the GitHub issue you can subscribe to for automatically generated attestations in the GitHub CI PyPI upload action: https://github.com/pypa/gh-action-pypi-publish/issues/288
Just a note: that issue is for connecting the already-present attestations to GitHub's attestations feature[1]. Attestations are already automatically generated and uploaded to PyPI regardless of that issue.
(That issue will still be great to resolve, but just to avoid any confusion about its scope.)
[1]: https://docs.github.com/en/actions/security-for-github-actio...
Very cool, and congrats!
The corresponding ToB blog post says the following:
> Longer term, we can do even better: doing “one off” verifications means that the client has no recollection of which identities should be trusted for which distributions. To address this, installation tools need a notion of “trust on first use” for signing identities, meaning that subsequent installations can be halted and inspected by a user if the attesting identity changes (or the package becomes unattested between versions).
Agree: signing is only as good as verification. However, trust-on-first-use (TOFU) is not the most secure way to map packages to attestations because nothing stops attackers who have taken over PyPI from tampering with the _unsigned_ mapping of identities to attestations in package lockfiles for new clients (certainly in containerized environments where everything could look new), and even just new versions of packages.
Although [PEP 458](https://peps.python.org/pep-0458/) is about signing the Python package index, it sets the foundation for being able to securely map packages to signed in-toto _policies_, which would in turn securely map identities to attestations. I think it is worth noting how these different PEPs can work together :)
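To illustrate the TOFU idea (and why it fails for fresh clients, as noted above): a toy sketch, with a made-up pin-file location and format.

```python
import json, pathlib

PINS = pathlib.Path("~/.config/pip/identity-pins.json").expanduser()  # made-up location

def check_identity(project: str, identity: str) -> None:
    """Trust-on-first-use: pin the attesting identity the first time we see
    a project; fail loudly if it later changes. A brand-new environment has
    no pins at all, which is exactly the gap the comment above points out."""
    pins = json.loads(PINS.read_text()) if PINS.exists() else {}
    known = pins.get(project)
    if known is None:
        pins[project] = identity  # first use: record the identity
        PINS.parent.mkdir(parents=True, exist_ok=True)
        PINS.write_text(json.dumps(pins, indent=2))
    elif known != identity:
        raise RuntimeError(f"{project}: attesting identity changed from {known!r} to {identity!r}")
```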
The root point of this seems to be PyPI does not have the resources to manage user identity, and wants to outsource that component to Github, et al. That sounds fairly reasonable. But why deprecate GPG signatures? The problem with GPG signatures as I understand it is it's difficult to find the associated public key. That's fair. Why not host and allow users to add their public keys to their accounts? Wouldn't that solve the problem?
GPG is an ancient bit of tech with numerous problems:
* An extremely complex, byzantine packet format with security problems of its own.
* Decades of backwards compatibility, which also harms security.
* Extreme unfriendliness towards automation.
* Way too many features.
* Encouragement of bad security practices like extremely long lived keys.
* Moribund and flawed ecosystem.
Lots of cryptographers agree that PGP has outlived its usefulness and it's time to put it out of its misery.
And really there's little need for GPG when package signing can be done more reliably and with less work without it.
I was a fan of PGP since the early days, but I agree that at this point it's best to abandon it.
I'll take your word that GPG is outdated. GPG is just the one that PyPI used to support. I don't particularly care what public key signing suite is used.
I'm very excited this has come to fruition!
PyPI's user docs[1] have some more details, as does the underlying PEP[2].
[1]: https://docs.pypi.org/attestations/
[2]: https://peps.python.org/pep-0740/
This is awesome to see, and the result of many years of hard work from awesome people.
Despite crypto getting easier and easier and more mainstream, you still don't see it used that often. Even in ((('"```blockchain'"```))) projects (wink emoji; skull emoji; frown; picture of a chart crashing). I welcome these additions to PyPI. I am a proud Python Chad and always will be.
Wonder when PyPI will be doing bootstrappable and reproducible builds.
https://bootstrappable.org/ https://reproducible-builds.org/
Given that it doesn't do builds, never?
Great, now how do you use attestations with Twine when publishing packages on PyPI outside of the Github ecosystem?
You don't. The whole point is that you can no longer sign anything. Microsoft signs for you.
And of course the signature means "this user can push to github" and nothing more.
Hopefully the attestation is bound to a specific commit, so you can know the binaries came from the source?
Otherwise I don't get it.
Yes, it’s bound to a specific commit; we just don’t present that in the web UI yet. If you click on the transparency log entry, you’ll see the exact commit the attestation came from.
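If you want to dig it out yourself: the commit is embedded in the Sigstore signing certificate as a Fulcio extension. A rough sketch using the `cryptography` package; the OID is from the Fulcio certificate spec, and the two-byte DER skip is a shortcut rather than robust parsing.

```python
from cryptography import x509

# Fulcio's "Source Repository Digest" extension, i.e. the commit the
# publishing workflow ran against (per the Fulcio certificate spec).
SOURCE_REPO_DIGEST = x509.ObjectIdentifier("1.3.6.1.4.1.57264.1.13")

def commit_from_cert(pem: bytes) -> str:
    cert = x509.load_pem_x509_certificate(pem)
    ext = cert.extensions.get_extension_for_oid(SOURCE_REPO_DIGEST)
    # Value is a DER-encoded UTF8String; skip the 2-byte tag/length header.
    return ext.value.value[2:].decode()
```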
It doesn't seem to be from what I can see. Only states that the upload came from a gh runner.
See adjacent comment above.
Ok that's at least something.
But my CI can download and run code from anywhere, so that doesn't mean I can know what is being uploaded just by looking at the git repository alone.
You need to rely on one of the four trusted publishers. You can't do it yourself: https://docs.pypi.org/trusted-publishers/adding-a-publisher/
I wonder why they aren't using SLSA attestations. It'd be good to have a single standard rather than every language having its own format.
This standard does support SLSA attestations. We’re enabling upload support for them pretty soon.
It looks like another process certain software engineers want to program and facilitate. I hope it is and stays optional.
I'm pessimistic about that.
Are tools like uv, rye, hatch, etc. going to facilitate this?
They could if there was a way to create an attestation that the tool (and you) can use.
Great, now all that is missing is a decent packaging system for Python.
I'm curious what would happen if a maintainer's PC is compromised. Is there any line of defense left at that point?
None.
A developer machine will have SSH keys and GitHub tokens that can be used to push a commit to GitHub, which will then be built, signed, and uploaded to PyPI.
That sounds like a gigantic attack surface then ...
I think PyPI has been less secure since they introduced 2FA.
Before, I could memorize my password and type it into twine. If my machine was stolen, no upload to PyPI was possible.
Now it's a token file on my disk, so if my machine is stolen, the token can be used to publish.
Using GitHub to publish doesn't change anything: if my machine is stolen, the token needed to publish is still there, but instead of going directly to PyPI it will need to go via GitHub first.
Tokens are a problem too (a yubikey might be a solution).
But an attacker could simply edit the source code on the maintainer's machine directly, and it could go unnoticed.
> Tokens are a problem too (a yubikey might be a solution).
Not the way they implement things. When they started forcing people to use 2FA, Google also did a Titan key giveaway. But you set it up on your account, generate a token, and that's it.
Identical situation on GitHub. Set up 2FA with a hardware key, then generate a token and never use the hardware key ever again.
I doubt Yubikey would help without some fancy setup. 2FA is required to sign into PyPI but that's it. When PyPI rolled it out I thought you'd have to use 2FA every time you publish. I thought they were taking security seriously. But no, you get your API token, save it to your computer, forget about it, and you can publish your packages forever. Now you can have Github automatically publish your packages. That's not any improvement to security. My Google security key is just collecting dust.
Eh, mine too. I got it because I had an essential package or whatever when they started forcing 2FA on people, and I thought twine would require me to authenticate with the key to be authorised to publish.
But they didn't touch twine at all. They just made me create a token and save it in a .txt file. That's it.
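For what it's worth, you can at least keep that token out of a plaintext file today: twine consults the `keyring` package before falling back to `.pypirc`, so the token can live in the OS keychain. A small sketch (the service/username pair below is what twine looks up for PyPI uploads, to the best of my knowledge):

```python
import keyring

# Store the upload token in the OS keychain (macOS Keychain, Windows
# Credential Locker, Secret Service on Linux) instead of a .txt file.
keyring.set_password(
    "https://upload.pypi.org/legacy/",  # repository URL twine uses for PyPI
    "__token__",                        # username for API-token auth
    "pypi-<your-token-here>",           # placeholder, not a real token
)
```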
I'm now thinking about a system that can enforce that any line of code committed to the Git repo has been on the user's screen for at least X seconds. It could be a system completely isolated from the computer on which code is entered (e.g. via the HDMI cable).
How could you ever enforce that?
Your own source code audits (including just watching git logs), and/or social code audits using crev.
https://github.com/crev-dev/
So, let me get this straight: after PyPA enshittified the packaging experience for more than a decade, and now that the mess has been cleaned up by uv, the security FUD (and let's not be naive here, the power struggle) moves on to enshittifying PyPI and making Python users miserable for a decade more?