Comments (13)
Hi @osamuaoki , have you tried setting the readme.content-type
to text/plain
?
Sorry, wrong hint: this does not seem to solve the problem...
It is probably a "folding" issue with the syntax of the METADATA file. The effect of license.file
is to embedded all the text from the file into METADATA
. Omitting it relies on Setuptools using the standard rules to add the license file into the wheel as a separated file.
from sampleproject.
Yah, I was thinking along the same line. Despite the upload ERROR message initially gave me an impression, the real issue seems to be inclusion of LICENCE and its data conversion to fit it into METADATA.
setuptool
seems to accept only ReST compatible file. (Just impression. No serious test is done yet.)
flint
seems to ignore this license
entry in pyproject.toml
.
According to my vague memory, I tried something along:
license = { file="LICENSE.txt", content-type="text/plane" }
It didn't fix this issue.
I don't know if this is bug on build tool side or not. Important thing is to inform user for practical workaround for the moment and someone who understand situation to ask other packages to work with this situation etc.
from sampleproject.
I couldn't reproduce. After replacing the contents of LICENSE.txt
with the GPL text, I still get
$ twine check *
Checking sampleproject-3.0.0-py3-none-any.whl: PASSED
Checking sampleproject-3.0.0.tar.gz: PASSED
from sampleproject.
I cannot reproduce it either... This is what I did in detail:
# docker run --rm -it python:3.12-bookworm /bin/bash
cd /tmp
git clone https://github.com/pypa/sampleproject
cd sampleproject
wget https://www.gnu.org/licenses/gpl-3.0.txt -O LICENSE.txt
sed -i 's/name = "sampleproject"/name = "sampleproject-abc"/g' pyproject.toml
python -m pip install -U pip build twine
python -m build
twine check dist/*
# Checking dist/sampleproject-3.0.0-py3-none-any.whl: PASSED
# Checking dist/sampleproject-3.0.0.tar.gz: PASSED
twine upload --verbose --repository testpypi dist/*
# ...
# INFO Response from https://test.pypi.org/legacy/:
# 200 OK
# ...
# View at:
# https://test.pypi.org/project/sampleproject-abc/3.0.0/
unzip -p dist/sampleproject_abc-3.0.0-py3-none-any.whl sampleproject_abc-3.0.0.dist-info/METADATA > /tmp/METADATA.orig
head -n 20 /tmp/METADATA.orig
# Metadata-Version: 2.1
# Name: sampleproject-abc
# Version: 3.0.0
# Summary: A sample Python project
# Author-email: "A. Random Developer" <[email protected]>
# Maintainer-email: "A. Great Maintainer" <[email protected]>
# License: GNU GENERAL PUBLIC LICENSE
# Version 3, 29 June 2007
#
# Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
# Everyone is permitted to copy and distribute verbatim copies
# of this license document, but changing it is not allowed.
#
# Preamble
#
# The GNU General Public License is a free, copyleft license for
# software and other kinds of works.
#
# The licenses for most software and other practical works are designed
# to take away your freedom to share and change the works. By contrast,
wget https://test-files.pythonhosted.org/packages/c0/c0/7b07efedcdcfab3c2700ce7c23d7a737b4b2419f5917f43993c8635f320e/sampleproject_abc-3.0.0-py3-none-any.whl -O /tmp/sampleproject_abc-3.0.0-py3-none-any.whl
unzip -p /tmp/sampleproject_abc-3.0.0-py3-none-any.whl sampleproject_abc-3.0.0.dist-info/METADATA > /tmp/METADATA.download
diff /tmp/METADATA.*
# => no difference
# Further investigate if the escaping works well
# for the "email message" format used in METADATA:
cat <<EOF > /tmp/test-metadata.py
import sys
import email
with open(sys.argv[1], "rb") as fp:
message = email.message_from_binary_file(fp)
for field, text in (
("license", message["License"]),
("readme", message.get_payload()),
):
print(f"------- {field} --------")
lines = text.splitlines(keepends=True)
print("".join(lines[:3]))
print("...")
print("".join(lines[-3:]))
EOF
python /tmp/test-metadata.py /tmp/METADATA.orig
# ------- license --------
# GNU GENERAL PUBLIC LICENSE
# Version 3, 29 June 2007
#
#
# ...
# Public License instead of this License. But first, please read
# <https://www.gnu.org/licenses/why-not-lgpl.html>.
#
# ------- readme --------
# # A sample Python project
#
# ![Python Logo](https://www.python.org/static/community_logos/python-logo.png "Sample inline image")
#
# ...
# [rst]: http://docutils.sourceforge.net/rst.html
# [md]: https://tools.ietf.org/html/rfc7764#section-3.5 "CommonMark variant"
# [md use]: https://packaging.python.org/specifications/core-metadata/#description-content-type-optional
# ---- Eventhing seems to work well in the METADATA file ----
Anyway, I think this repository is not the best place to be discussing this issue.
Once the author has a complete minimal reproducer, the following possibilities might be investigated:
- Escaping problem, to be reported/investigated in the
pypa/setuptools
project (later, after the investigation is complete, it might also involvepypa/distutils
and/orpypa/wheel
). - Validation problem, to be reported/investigated in the
pypi/warehouse
project. - Difference between
twine check
and the checks inpypi
, to be discussed in thepypa/twine
and/orpypi/warehouse
projects.
However it is important to have a clear idea what item(s) in the list above is actually happening, before opening any new issue; and verify them with solid reproducers.
Please feel free to use the quick script in the example above, to check if the METADATA
file is malformed... It is by no means an exhaustive check though but maybe when added to twine check
is enough?
from sampleproject.
I should have been more explicit which LICENSE text I was using. I am using good old GPL2 text as https://github.com/osamuaoki/imediff/blob/main/LICENSE and renamed it to LICENSE.txt . This file is causing problem.
With your test case with GPL3 text, I also get no problem on Debian/stable:
$ wget https://www.gnu.org/licenses/gpl-3.0.txt -O LICENSE.txt
$ python3 -m build
$ twine check dist/*
Checking dist/sampleproject-3.0.0-py3-none-any.whl: PASSED
Checking dist/sampleproject-3.0.0.tar.gz: PASSED
So the problem is specific to the GPL2 text. Quotation styles are different.
My GPL2 file uses `foo'
Your GPL3 file uses "foo"
FYI I am using Debian 12/ stable as the base. This means Python3.11. (I also used the latest PIP packages under venv to be current)
from sampleproject.
Thank you for the reproducer.
This seems to be the same as pypa/setuptools#4033.
I would say the root of the problem is the \x0c
characters in the license text. Isn't it?
Quoting Paul Moore in the thread I linked above:
This is because neither the metadata spec nor the email standards really say what to do with control characters in header values, and the stdlib email package appears to handle it inconsistently.
The escaping goes wrong when pypa/wheel
tries to regenerated the metadata file by ingesting the file that pypa/setuptools
generates. Per se, the file generated in pypa/setuptools
parses correctly, but after being re-generated by pypa/wheel
, it does not.
Recently I submitted a PR to pypa/distutils
that will make the life of pypa/wheel
easier, and prevent this round-trip problem from happening all togheter. However pypa/distutils
has not been merged into pypa/setuptools
yet.
If the problem is indeed \x0c
, I recommend closing this issue since the problem is already being tracked in pypa/setuptools#4033.
As a workaround you can replace all the \x0c
characters in your license file with a \n
, or simply remove out the licence = ...
in your pyproject.toml
as you have previously suggested (the license file will still be included in the wheel as a standalone file, so you are still distributing the license).
from sampleproject.
I'm confused. The reported error is:
ERROR `long_description` has syntax errors in markup and would not be rendered on PyPI.
line 2: Warning: Block quote ends without a blank line; unexpected unindent.
WARNING `long_description_content_type` missing. defaulting to `text/x-rst`.
That's about the README, not the license, and it's saying that the README isn't valid Restructured text. It looks like whatever changes the OP made to the readme (or to the filename of the readme, because if it were still a .md
file it would be detected as Markdown) are the problem here.
from sampleproject.
I can confirm the problematic content is in the LICENSE text not in README.
Good old FormFeed aka. FF (0x0c) in LICENSE was the real root cause. By removing them, I get clean result with no ERROR nor WARNING. So the problem is in setuptools.
As for the unhelpful ERROR message which seems to imply the problem reside in README.md or content-type setting... I don't know how exactly it happens to say this but this error report message has some room to be improved to lead people to the root cause.
from sampleproject.
I'm confused. The reported error is:
Hi @pfmoore, I am trying to summarise my thought process below. Hopefully it will improve the understanding 🤞.
When the stdlib's email
functions are used to read the METADATA
files, they don't care if there is a \xc0
character... It seems to deal with it very well1. So that is why the PKG-INFO
inside tar.gz
sdist parses correctly.
However, when pypa/wheel
regenerates the METADATA
file from PKG-INFO
it replaces the \xc0
character with a \n
. If you happen to have in your license file a \n\xc0
or \xc0\n
sequence, this means that pypa/wheel
will generate a METADATA
file with \n\n
.
Now, in the email message format, the \n\n
sequence is used to mark the end of the headers section and the beginning of the body/payload. In the packaging standard, we use the body/payload of the email message for the README/long_description
.
So the error message is complaining about the README/long_description
, because part of the LICENSE
file is incorrectly being considered as the email message body/payload.
This is the same error discussed in pypa/setuptools#4033, for which pypa/distutils#213 should provide a good workaround (we have to wait until it lands on setuptools).
Footnotes
-
And/or at least the functions used for parsing in
twine
/PyPI. ↩
from sampleproject.
Ah, I see. Thanks - that's a lot more subtle than I realised.
from sampleproject.
Hi, looking at pypa/setuptools#4033 , problematic control codes are:
'0x0b': ('\x0b', ['', '']),
'0x0c': ('\x0c', ['', '']),
'0x1c': ('\x1c', ['', '']),
'0x1d': ('\x1d', ['', '']),
'0x1e': ('\x1e', ['', '']),
In other words
Oct Dec Hex Char
─────────────────────────────
013 11 0B VT '\v' (vertical tab) Ctrl-K
014 12 0C FF '\f' (form feed) Ctrl-L
034 28 1C FS (file separator) Ctrl-\
035 29 1D GS (group separator) Ctrl-]
036 30 1E RS (record separator) Ctrl-^
These needs to be removed from the original LICENSE file to make source code robust and portable to older platforms.
As for the following text in pyproject.toml:
# This is either text indicating the license for the distribution, or a file
# that contains the license
# https://packaging.python.org/en/latest/specifications/core-metadata/#license
license = {file = "LICENSE.txt"}
I suggest updating this section with:
# This is either text indicating the license for the distribution, or a file
# that contains the license
# For the better compatibility with the older tool chain, please remove control
# characters (0x0b=^K, 0x0c=^L, 0x1c=^\, 0x1d=^], 0x1e=^^) in the` LICENSE.txt`
# file if they exist. See https://github.com/pypa/setuptools/issues/4033 .
# https://packaging.python.org/en/latest/specifications/core-metadata/#license
license = {file = "LICENSE.txt"}
from sampleproject.
IMHO, we should just fix that bug and not document the workaround, because pip and build use build isolation by default and most people never change that setting, which should fix the problem without people needing to upgrade their toolchain as soon as a new release of wheel is out.
from sampleproject.
I see your point and agree.
Maybe leave this issue open until the new release of tool chain.
from sampleproject.
Related Issues (20)
- UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 3342: character maps to <undefined> HOT 1
- packet queue is empty, aborting HOT 2
- How do you test this? HOT 2
- Tools
- How to include non python configuration files HOT 1
- How to control spelling incorrection even if phonetically sound same HOT 1
- pyproject.toml: references a "python_requires" key that does not exist HOT 2
- help! Executing file is not working HOT 2
- sampleproject doesn't include files `setup.py` and `setup.cfg` (and maybe other) one of the latest published guides claims it should have HOT 1
- Unclear description of "readme" field in pyproject.toml HOT 2
- The package name HOT 3
- TypeError: unhashable type: 'dict' HOT 4
- Should development dependencies be included in the optional section?
- Error in documentation regarding Packaging and Distributing projects using Setuptools
- Stop advertising adding `wheel` as an unconditional PEP 517 build dependency HOT 6
- Should PyPUG editors have access to this project? HOT 3
- Remove Python version 3.7 to unblock tests HOT 1
- Update tox version? And tox best practices? HOT 2
- Flake8 configuration should not use select
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from sampleproject.