wheel-inspect's Introduction

Project Status: Active. The project has reached a stable, usable state and is being actively developed. License: MIT.

Site | GitHub | Issues | Changelog

Packaged projects for the Python programming language are distributed in two main formats: sdists (archives of code and other files that require processing before they can be installed) and wheels (zipfiles of code ready for immediate installation). A project's wheel contains the complete information about what modules, files, & commands the project installs, along with information about what other projects the project depends on, but the Python Package Index (PyPI) (where wheels are distributed) doesn't expose any of this information! This is the problem that Wheelodex is here to solve.

Wheelodex scans PyPI for wheel files, analyzes them, and stores & displays the results. The site allows users to view the complete metadata inside wheels, search for wheels containing a given Python module or file, browse or search for wheels that define a given command or other entry point, and even find out projects' reverse dependencies.

Note that, to save disk space, Wheelodex only records data on wheels from the latest version of each PyPI project; wheels from older versions are periodically purged from the database. Projects' long descriptions are not recorded at all.

Suggestions and pull requests are welcome.

wheel-inspect's People

Contributors

dependabot[bot], jwodder

wheel-inspect's Issues

Give the CLI an option for outputting as an array and/or JSON Lines

Since the CLI will output multiple objects when given multiple wheels, it seems only natural that the output should be a JSON array, either by default or when given a certain command-line option. Moreover, there should probably be an option for outputting a stream of objects as JSON Lines.

Alternatively, perhaps inspecting more than one wheel at once with the CLI should be forbidden?
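The two output shapes proposed above can be sketched with the standard library alone. This is only an illustration: the function name dump_results and the fmt values "array" and "jsonl" are hypothetical and are not part of wheel-inspect's CLI.

```python
import json
import sys


def dump_results(results, fmt="array", out=sys.stdout):
    """Write inspection results as one JSON array or as JSON Lines.

    `results` is any iterable of JSON-serializable dicts (hypothetical input;
    in practice it would be one dict per inspected wheel).
    """
    if fmt == "array":
        # A single JSON document containing every result
        json.dump(list(results), out, indent=4)
        out.write("\n")
    elif fmt == "jsonl":
        for obj in results:
            # One compact object per line; json.dumps never emits raw newlines
            out.write(json.dumps(obj) + "\n")
    else:
        raise ValueError(f"unknown format: {fmt!r}")
```

With JSON Lines, a consumer can process each wheel's result as soon as it is written, whereas the array form has to be read in full before it parses.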

Public API is broken by v1.6.0

Hi,

Wheel is no longer exported by __init__.py, which is a breaking change to the API. devpi-builder uses this library and imports Wheel. You can reproduce the failure introduced by v1.6.0 with `python3 -m venv tmp_venv; tmp_venv/bin/python -m pip install devpi-builder; tmp_venv/bin/devpi-builder -h`:

Traceback (most recent call last):
  File "tmp_venv/bin/devpi-builder", line 6, in <module>
    from devpi_builder.cli import main
  File "~/code/tmp_venv/lib/python3.7/site-packages/devpi_builder/cli.py", line 17, in <module>
    from devpi_builder import requirements, wheeler
  File "~/code/tmp_venv/lib/python3.7/site-packages/devpi_builder/wheeler.py", line 17, in <module>
    from wheel_inspect import Wheel, inspect_wheel
ImportError: cannot import name 'Wheel' from 'wheel_inspect' (~/code/tmp_venv/lib/python3.7/site-packages/wheel_inspect/__init__.py)

Add support for dist-info directories

When using PEP 517 directly (or for other reasons), it's possible to create a directory containing the wheel metadata without doing a full build.

Could support for parsing this information be added?
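As a rough idea of what reading such a directory involves, here is a minimal sketch using only the standard library. It assumes the directory holds a METADATA file (and possibly a WHEEL file) in the usual RFC 822-style header format. The function name read_dist_info is hypothetical and is not part of wheel-inspect.

```python
from email.parser import HeaderParser
from pathlib import Path


def read_dist_info(dirpath):
    """Parse METADATA (and WHEEL, if present) from an unpacked *.dist-info directory."""
    dirpath = Path(dirpath)
    parser = HeaderParser()
    info = {}
    for name in ("METADATA", "WHEEL"):
        path = dirpath / name
        if not path.is_file():
            # A metadata-only directory may lack some of these files
            continue
        msg = parser.parsestr(path.read_text(encoding="utf-8"))
        fields = {}
        for key, value in msg.items():
            # Headers such as Requires-Dist or Tag may repeat, so collect lists
            fields.setdefault(key, []).append(value)
        info[name] = fields
    return info
```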

Feature Request: Pass BytesIO to inspect_wheel

Hi jwodder,

This library looks pretty interesting; however, I am having trouble passing a bytes object instead of a path to inspect_wheel.

I would like to read a Python wheel that I uploaded, but based on the code I have seen so far, most things seem to depend on a path.
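For context, the standard library's zipfile.ZipFile already accepts a seekable binary file object, so the zip layer itself does not need a filesystem path. The sketch below shows only that part. Whether inspect_wheel could be adapted to take such an object is not something this example establishes, and list_wheel_entries is a hypothetical helper.

```python
import io
import zipfile


def list_wheel_entries(data):
    """Return the entry names of a wheel held entirely in memory as bytes."""
    # ZipFile accepts any seekable binary file object, not just a path
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.namelist()
```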

Rewrite in Rust?

  • It would make all the (increasingly complicated) error handling much, much easier.

  • There would need to be a Python extension wrapping the Rust library.

latest bleach breaks wheel2inspect

  File "/usr/local/bin/wheel2json", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/site-packages/wheel_inspect/__main__.py", line 11, in main
    about = inspect_wheel(path)
  File "/usr/local/lib/python3.8/site-packages/wheel_inspect/inspecting.py", line 178, in inspect_wheel
    return inspect(wf)
  File "/usr/local/lib/python3.8/site-packages/wheel_inspect/inspecting.py", line 147, in inspect
    about["derived"]["readme_renders"] = render(readme) is not None
  File "/usr/local/lib/python3.8/site-packages/readme_renderer/rst.py", line 102, in render
    return clean(rendered)
  File "/usr/local/lib/python3.8/site-packages/readme_renderer/clean.py", line 66, in clean
    cleaner = bleach.sanitizer.Cleaner(
TypeError: __init__() got an unexpected keyword argument 'styles'

Add a Path-like API for filetree navigation

Wheel/dist-info objects should have a filetrees (name WIP) attribute that is a dynamic mapping. Accessing this mapping with the keys "purelib", "platlib", "dist-info", "headers", "scripts", or "data" should return a PosixPath-like object for navigating through the respective subset of the files in the wheel. However — and this is key — it should not be possible to use such a path instance to access or view files outside the given filetree; this includes not being able to access the *.dist-info or *.data directories from the purelib tree (assuming root is purelib; otherwise, platlib).

  • The filetrees mapping should accept any name not containing a slash; unknown names are treated as subdirectories of *.data. If a given directory does not exist (or if the corresponding entry in *.data is a file instead of a directory), None is returned.
  • filetrees should accept a "root" (or None?) key to access purelib for purelib wheels and platlib for platlib wheels.
  • These path objects should be accepted by the API wherever wheel entry filepaths are accepted.
  • Path objects should have a wheel_path(?) attribute for getting the actual, full path of the underlying file inside the wheel.
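One building block of the proposal above is deciding which filetree a wheel entry belongs to and what its path is relative to that tree. The sketch below is a hypothetical illustration of that mapping only, not the proposed API: the function name locate and its parameters are assumptions, and the dist-info and data directory names are passed in rather than discovered.

```python
from pathlib import PurePosixPath


def locate(entry, dist_info, data_dir, root_is_purelib=True):
    """Map a wheel entry path to (filetree name, path relative to that tree).

    Returns None for entries that belong to no tree, such as a file sitting
    directly inside the *.data directory.
    """
    parts = PurePosixPath(entry).parts
    if not parts:
        return None
    if parts[0] == dist_info:
        return ("dist-info", PurePosixPath(*parts[1:]))
    if parts[0] == data_dir:
        if len(parts) < 3:
            # Too shallow to be a file inside a *.data/<tree>/ subdirectory
            return None
        # Unknown names are treated as subdirectories of *.data
        return (parts[1], PurePosixPath(*parts[2:]))
    # Everything else lives in the root tree, selected by Root-Is-Purelib
    return ("purelib" if root_is_purelib else "platlib", PurePosixPath(*parts))
```

Because each result is expressed relative to its own tree, a path object built on top of it has no way to name files in a sibling tree, which is the isolation property the proposal asks for.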

wheel-inspect raises `MissingFieldError` for wheels modified by `wheel tags`

Context

wheel tags is a new feature in wheel 0.40.0 that lets wheel maintainers tag a wheel independently of the build system. It is especially handy in the post-PEP 517 era, where setuptools no longer dominates the build backends. Not all build backends allow the developer to easily tweak the wheel tags, so the easiest way is to build a wheel in the most convenient fashion and tag it afterwards.

However, wheel-inspect has not caught up with this new approach. Its latest release, v1.7.1, came out in Apr 2022, whereas wheel tags is part of wheel 0.40.0, released in Mar 2023.

Symptom

Using wheel-inspect v1.7.1 on a wheel tagged via wheel tags, this is what we get:

[root@b58bcbb2915e test]# wheel tags --platform-tag "manylinux2014_x86_64" cuquantum_python_cu11-23.6.0-11-cp39-cp39-linux_x86_64.whl
cuquantum_python_cu11-23.6.0-11-cp39-cp39-manylinux2014_x86_64.whl
[root@b58bcbb2915e test]# wheel2json cuquantum_python_cu11-23.6.0-11-cp39-cp39-manylinux2014_x86_64.whl
Traceback (most recent call last):
  File "/opt/conda/bin/wheel2json", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.9/site-packages/wheel_inspect/__main__.py", line 12, in main
    about = inspect_wheel(path)
  File "/opt/conda/lib/python3.9/site-packages/wheel_inspect/inspecting.py", line 191, in inspect_wheel
    return inspect(wf)
  File "/opt/conda/lib/python3.9/site-packages/wheel_inspect/inspecting.py", line 121, in inspect
    about["dist_info"]["wheel"] = obj.get_wheel_info()
  File "/opt/conda/lib/python3.9/site-packages/wheel_inspect/classes.py", line 69, in get_wheel_info
    return parse_wheel_info(txtfp)
  File "/opt/conda/lib/python3.9/site-packages/wheel_inspect/wheel_info.py", line 53, in parse_wheel_info
    wi = infoparser.parse(fp)
  File "/opt/conda/lib/python3.9/site-packages/headerparser/parser.py", line 288, in parse
    return self.parse_stream(scan(iterable, **self._scan_opts))
  File "/opt/conda/lib/python3.9/site-packages/headerparser/parser.py", line 264, in parse_stream
    raise errors.MissingFieldError(hd.name)
headerparser.errors.MissingFieldError: Required header field 'Tag' is not present

It complains that the Tag field is missing, but I do not think tags have to be present. PyPA has clearly stated that

The wheel built package format includes these tags in its filenames

so the assumption (which wheel-inspect relies on) that the WHEEL file always contains a Tag field does not really hold.

Suggested fix

This line can be relaxed to use required=False:
https://github.com/jwodder/wheel-inspect/blob/bd2051e9de2edb88075b40763d430887ef15c26d/src/wheel_inspect/wheel_info.py#L10
If Tag is not found, then wheel-inspect can just use the wheel filename to fetch the tag.
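The filename fallback could look roughly like the sketch below. It assumes the standard wheel filename layout (name, version, optional build tag, then Python, ABI, and platform tags, with "." separating compressed tag sets). The helper tags_from_filename is hypothetical and is not part of wheel-inspect.

```python
from itertools import product


def tags_from_filename(filename):
    """Expand the compressed tag set in a wheel filename into individual tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        # name-version[-build]-python-abi-platform
        raise ValueError(f"unexpected number of components: {filename!r}")
    # The last three components are always the Python, ABI, and platform tags
    pyvers, abis, platforms = (p.split(".") for p in parts[-3:])
    return sorted("-".join(t) for t in product(pyvers, abis, platforms))
```

For the retagged wheel in the report, this yields the single tag cp39-cp39-manylinux2014_x86_64.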

Check zipfiles for "well-behaved-ness"

Refuse to proceed if given a zipfile where any of the following are true:

  • Zipfile has both path and path/
  • Entries in zip are encrypted (Can this be detected?)
  • File in zip twice (How does Python handle this?)
  • Directory has data (Don't care?)
  • Filename in zipfile contains a backslash?
  • Filename in zipfile contains a null
  • Filename in zipfile is absolute
  • Empty filename field ("if input came from standard input")
  • Filename in archive not UTF-8 (How does Python handle this?)
  • Path in zipfile not normalized
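Several of the checks above can be sketched with the standard zipfile module. The example covers only a subset: duplicates, encryption, empty names, absolute paths, non-normalized paths, and "path plus path/". It leaves out the backslash and null checks, because zipfile may itself rewrite such names when reading an archive. The function name zip_problems is hypothetical.

```python
import posixpath
import zipfile


def zip_problems(zf):
    """Return a list of human-readable problems found in an open ZipFile."""
    problems = []
    seen = set()
    for info in zf.infolist():
        name = info.filename
        if name in seen:
            problems.append(f"duplicate entry: {name!r}")
        seen.add(name)
        if info.flag_bits & 0x1:
            # Bit 0 of the general-purpose flags marks an encrypted entry
            problems.append(f"encrypted entry: {name!r}")
        if not name:
            problems.append("empty filename")
            continue
        if name.startswith("/"):
            problems.append(f"absolute path: {name!r}")
        elif posixpath.normpath(name) != name.rstrip("/"):
            # Catches "..", ".", and doubled slashes
            problems.append(f"non-normalized path: {name!r}")
    for name in sorted(seen):
        if name.endswith("/") and name.rstrip("/") in seen:
            problems.append(f"both file and directory: {name.rstrip('/')!r}")
    return problems
```

A caller that wants to "refuse to proceed" can raise an error whenever the returned list is non-empty.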

Add an option for skipping file digest verification

The inspection functions should take a verify_digests: bool = True parameter, and the command line should take a --no-verify-digests (or just --no-verify?) option, for controlling whether the wheel inspection should verify that the digests listed in the RECORD are correct.
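To illustrate what such a flag would switch off, here is a minimal sketch of RECORD checking. It assumes the usual RECORD layout of path, algorithm=digest, and size, where the digest is URL-safe base64 with the padding removed. The function record_mismatches is hypothetical, and in this sketch sizes are still compared when verify_digests is false, which is a design assumption rather than something the proposal specifies.

```python
import base64
import csv
import hashlib


def record_mismatches(zf, record_path, verify_digests=True):
    """Return names of files whose size or digest disagrees with RECORD."""
    bad = []
    text = zf.read(record_path).decode("utf-8")
    for row in csv.reader(text.splitlines()):
        if len(row) != 3:
            continue
        path, digest, size = row
        if not digest:
            # RECORD lists itself without a hash, so there is nothing to check
            continue
        data = zf.read(path)
        if size and int(size) != len(data):
            bad.append(path)
            continue
        if verify_digests:
            algo, _, expected = digest.partition("=")
            raw = hashlib.new(algo, data).digest()
            actual = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
            if actual != expected:
                bad.append(path)
    return bad
```

Hashing is the expensive step for large wheels, so skipping it while keeping the cheap size comparison is one possible meaning for the option.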

codecov uploader stopped working 15 days ago

Current test runs show this error message:

[2022-05-11T06:16:34.578Z] ['error'] There was an error running the uploader: Error uploading to https://codecov.io: Error: There was an error fetching the storage URL during POST: 400 - Bad Request - [ErrorDetail(string='Too many uploads to this commit.', code='invalid')]

On https://app.codecov.io/gh/jwodder/wheel-inspect the last uploaded test run was from 15 days ago:
https://codecov.io/gh/jwodder/wheel-inspect/commit/f95f1603710dc443cf885b7b864c9cd6f2df9449/

Detect more validation errors

Errors to detect:

  • project/version in filename/dist-info doesn't match METADATA
    • Be sure to take the filename's underscorification of - in the project name and ! & + in the version into account
  • tags in filename don't match those in WHEEL
  • build tag in filename doesn't match that in WHEEL
  • invalid Python/ABI/platform tag
  • tags in filename not sorted?
  • version doesn't comply with PEP 440
  • METADATA:
    • unknown Metadata-Version?
    • missing required field
    • field has invalid value
    • non-multi-use field used multiple times
    • not at least version 1.1
  • WHEEL:
    • unknown Wheel-Version
    • missing required field
    • field has invalid value
    • non-multi-use field used multiple times
  • both zip-safe and not-zip-safe
  • filename in RECORD contains a backslash?
  • filename in RECORD contains a null?
  • *.data/ cannot contain both purelib and platlib; which one is allowed depends on whether Root-Is-Purelib
  • *.data/ contains a non-directory file? (ignore?)
  • *.dist-info file is not UTF-8
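The first check in the list (project and version in the filename versus METADATA) might be sketched as follows. It is deliberately lenient. Names are compared after collapsing runs of "-", "_", and "." to "_" and lowercasing. Versions are compared after applying, to both sides, the escaping rule from PEP 427, which replaces runs of characters other than word characters and "." with "_". Both helper names are hypothetical, and a stricter validator would likely compare against one exact expected form.

```python
import re


def _norm_name(value):
    # Collapse separator runs so "Foo-Bar", "foo_bar", and "foo.bar" agree
    return re.sub(r"[-_.]+", "_", value).lower()


def _norm_version(value):
    # PEP 427-style escaping: "!" and "+" (among others) become "_"
    return re.sub(r"[^\w.]+", "_", value)


def filename_matches_metadata(filename, name, version):
    """Check a wheel filename's project and version against METADATA values."""
    components = filename.split("-")
    if len(components) < 2:
        return False
    file_name, file_version = components[:2]
    return (
        _norm_name(file_name) == _norm_name(name)
        and _norm_version(file_version) == _norm_version(version)
    )
```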

Expose the internal API publicly

  • Make the API worth exposing
  • Type-annotate everything
  • Document everything
  • Set up a Read the Docs site
  • Export classes for import directly from the wheel_inspect module
