Seth Michael Larson: PEP 770 Software Bill-of-Materials (SBOM) data from PyPI, Fedora, and Red Hat
This year I authored PEP 770, which proposed a new standardized location for Software Bill-of-Materials (SBOM) data within Python wheel archives. SBOM data can now be stored in .dist-info/sboms/. You can see the canonical specification on packaging.python.org.
While writing this document we also reserved all subdirectory names under .dist-info/ within a registry for future use in other standards. Reviewers agreed that this method of defining file-based metadata (such as SBOMs, but also licenses) is a great mechanism as it doesn't require creating a new metadata field and version.
Creating a new metadata field in particular requires large amounts of “head-of-line blocking” to roll out completely to an ecosystem of independent packaging installers, builders, publishers, and the Python Package Index; the proposed method side-steps all of this by making inclusion in the directory the mechanism instead.
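As a rough illustration of the .dist-info/sboms/ location described above, here is a minimal sketch that lists any SBOM files inside a wheel using only the standard library. The function name list_wheel_sboms is hypothetical, and the sketch only matches files by path; it does not validate that they are well-formed SBOM documents.

```python
import zipfile
from pathlib import PurePosixPath


def list_wheel_sboms(wheel_path):
    """Return archive paths of files under a `*.dist-info/sboms/` directory.

    Hypothetical helper for illustration; matches on path only.
    """
    found = []
    with zipfile.ZipFile(wheel_path) as wheel:
        for name in wheel.namelist():
            parts = PurePosixPath(name).parts
            # Expected layout: <name>-<version>.dist-info / sboms / <file...>
            if (
                len(parts) >= 3
                and parts[0].endswith(".dist-info")
                and parts[1] == "sboms"
                and not name.endswith("/")  # skip explicit directory entries
            ):
                found.append(name)
    return sorted(found)
```

Calling list_wheel_sboms on a wheel file returns an empty list when the wheel ships no SBOM data, so the same check could be run over a directory of downloaded wheels to see which ones already include it.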
So now that this PEP is published, what has happened since? A few things:
Unmasking the Phantom Dependency problem
In case you missed it, I published a white paper on this project with Alpha-Omega. If you want to learn more about the whole project from end-to-end, this is a good place to start!
Auditwheel and cibuildwheel
Back in 2022 there was a public issue opened for Auditwheel asking to generate an SBOM during the auditwheel repair command. As of Auditwheel v6.5.0, which was released in early November, Auditwheel now automatically generates SBOM data and includes the SBOM in the wheel following PEP 770.
The manylinux container images adopted the new auditwheel version soon after publication. These images are used by common Python wheel building platforms like cibuildwheel and multibuild. Because this functionality was enabled by default, we can look at Python wheel data and determine how many packages already supply PEP 770 SBOM data:
When querying the pypi-code.org dataset including all code within Python wheels I was able to find 332 projects on PyPI that are shipping SBOM data in their wheels:
Of these projects, these are the top-10 most downloaded with SBOM data so far:
There are far more projects which will likely require SBOM data on their bundled dependencies, so I'll continue watching the numbers grow over time!
Red Hat and Fedora adopt PEP 770
Back in July of this year, Miro Hrončok asked if there was a mechanism for specifying the "origin" of a package, as many tools incorrectly assume that any package that's installed to an environment originated from the Python Package Index (and therefore would use a Package URL like pkg:pypi/...). Their use-case was Python packages provided by the system package manager, such as rpm on Fedora and Red Hat Linux. Vulnerability scanners were incorrectly assuming packages like pip were vulnerable, as older versions of pip are packaged but with vulnerability patches backported and applied to those older versions.
SBOMs to the rescue! Miro adopted PEP 770 for Fedora and Red Hat Linux to reduce false-positives in vulnerability scans by defining the actual correct Package URL for the installed package in the SBOM:
If scanners adopt this approach and other Linux distros do as well, there will be far fewer false-positives from scanning Python environments using those Linux distros. A win for everyone! Miro is asking for feedback on this approach from consuming tools.
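To make the scanner side of this concrete, here is a minimal sketch of how a consuming tool might read Package URLs out of an SBOM and check where a package came from. It assumes the SBOM is a CycloneDX-style JSON document that records a purl field on its components; the function names and the sample purl value in the usage note are hypothetical and not taken from Fedora's actual output.

```python
import json


def purls_from_sbom(sbom_text):
    """Collect Package URLs from a CycloneDX-style JSON SBOM (assumed format)."""
    doc = json.loads(sbom_text)
    purls = []
    # The component the SBOM itself describes, if recorded.
    main = doc.get("metadata", {}).get("component", {})
    if "purl" in main:
        purls.append(main["purl"])
    # Any additional components listed in the document.
    for component in doc.get("components", []):
        if "purl" in component:
            purls.append(component["purl"])
    return purls


def origin_type(purl):
    """Return the purl type (such as 'pypi' or 'rpm') from 'pkg:<type>/...'."""
    scheme, _, rest = purl.partition(":")
    if scheme != "pkg":
        raise ValueError(f"not a Package URL: {purl!r}")
    return rest.lstrip("/").split("/", 1)[0].lower()
```

With helpers like these, a scanner could call origin_type on each purl and skip PyPI advisory matching when the type is rpm rather than pypi, for example for a made-up value such as pkg:rpm/fedora/python3-pip@0.0.0, deferring to the distro's own advisory data instead.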
https://sethmlarson.dev/pep-770-sbom-data-from-pypi-fedora-and-redhat?utm_campaign=rss