ilevkivskyi / com2ann
Tool for translation of type comments to type annotations in Python
License: MIT License
We should support translating type comments like this:
x, y = ... # type: Tuple[int, int]
Note that this may not be totally trivial if the r.h.s. is a single expression (not a tuple). In this case we may need to add annotations before the assignment, like this:
x: int
y: int
x, y = v
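A minimal sketch of how the type comment could be split into one annotation per name. This is not com2ann's implementation; the helper name split_tuple_comment is made up for illustration, and it assumes Python 3.9+ (for ast.unparse and the simplified subscript AST).

```python
import ast


def split_tuple_comment(names, type_comment):
    """Return one 'name: type' line per name for a Tuple[...] type comment.

    Hypothetical helper for illustration only, not part of com2ann.
    """
    expr = ast.parse(type_comment, mode="eval").body
    # Expect something of the form Tuple[T1, T2, ...].
    if not (isinstance(expr, ast.Subscript) and isinstance(expr.slice, ast.Tuple)):
        raise ValueError("not a subscripted tuple type: %r" % type_comment)
    items = expr.slice.elts
    # Variadic tuples such as Tuple[int, ...] are not handled in this sketch.
    if any(isinstance(t, ast.Constant) and t.value is Ellipsis for t in items):
        raise ValueError("variadic tuple types are not supported here")
    if len(items) != len(names):
        raise ValueError("number of names and tuple items differ")
    return ["%s: %s" % (name, ast.unparse(t)) for name, t in zip(names, items)]


lines = split_tuple_comment(["x", "y"], "Tuple[int, int]")
```

The resulting lines would be emitted before the original assignment, which itself stays unannotated.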
When a function in a source file contains both a type annotation comment and a docstring, but in the wrong order, com2ann fails with SyntaxError.
Expected behaviour:
How to reproduce:
Create file test.py:
class Klass:
    def function(self, parameter):
        """Comment"""
        # type: (str) -> str
        return parameter
Call com2ann: com2ann test.py
Output:
File: test.py
SyntaxError in test.py
Here is what 2to3 does:
By default it writes a diff to stdout, but does not overwrite any files.
Use -w to make it overwrite the file. It also keeps a backup, unless you pass -n.
Use -o <dir> to write files to a different directory.
Use --no-diffs to suppress writing diffs to stdout.
Add support for transforming
x: int
y: str = 'hi'
to
x = None # type: int
y = 'hi' # type: str
etc.
I expect that the copy of this utility in the core Python repo is very soon going to be out of date and forgotten. I propose to delete it from there so we don't have to deal with well-meaning core devs making uncoordinated changes there.
Currently this causes a crash on master:
bar = {} \
# type: SuperLongType[WithArgs]
We should probably remove the empty line after translation too, so it will be
bar: SuperLongType[WithArgs] = {}
Given...
from typing import Optional
from typing import List
class Pizza:
    def __init__(self, ingredients=None):
        # type: (Optional[List[str]]) -> None
        if ingredients is None:
            self.ingredients = []
        else:
            self.ingredients = ingredients
    def __repr__(self):
        # type: () -> str
        return "This is a Pizza with %s on it" % " ".join(self.ingredients)
    @classmethod
    def pizza_salami(cls):
        # type: () -> Pizza
        return cls(ingredients=["Salami", "Cheese", "Onions"])
... using com2ann factory.py, I receive the following code ...
from typing import Optional
from typing import List
class Pizza:
    def __init__(self, ingredients: Optional[List[str]] = None) -> None:
        if ingredients is None:
            self.ingredients = []
        else:
            self.ingredients = ingredients
    def __repr__(self) -> str:
        return "This is a Pizza with %s on it" % " ".join(self.ingredients)
    @classmethod
    def pizza_salami(cls) -> Pizza:
        return cls(ingredients=["Salami", "Cheese", "Onions"])
... which is no longer valid Python code, as Pizza is not yet defined when the annotation is evaluated.
Here, the fix is to add quotes around the Pizza return type in the factory method.
As I am brand new to Python typing, maybe there are other ways to fix this problem.
Maybe this discussion is of some help: python/typing#58
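For reference, a trimmed-down version of the class with the quoted return type, which is how I would expect the translated output to look. The string annotation is not evaluated when the function is defined, so the class imports and runs fine. Another option on Python 3.7+ is putting from __future__ import annotations at the top of the module, which defers evaluation of all annotations.

```python
from typing import List, Optional


class Pizza:
    def __init__(self, ingredients: Optional[List[str]] = None) -> None:
        self.ingredients = ingredients if ingredients is not None else []

    @classmethod
    def pizza_salami(cls) -> "Pizza":
        # "Pizza" is a string (forward reference), so the name does not have
        # to exist yet while the class body is still being executed.
        return cls(ingredients=["Salami", "Cheese", "Onions"])


pizza = Pizza.pizza_salami()
```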
Potentially, we can translate this:
for i, j in foo(bar):  # type: (int, int)
    ...
to something like this
i: int
j: int
for i, j in foo(bar):
    ...
typeshed already carries type annotations for many "legacy" Python modules. As they evolve it becomes harder and harder to keep them synchronized. This becomes a non-issue if the type annotations are already included in the upstream source code, so that typeshed is not needed at all.
It would be cool if com2ann had a mode to merge the annotations from typeshed into the source code of the module itself.
I ran into a few cases where there was previously a type comment that was then switched into a type annotation, but then I get a syntax error because that thing was not previously imported.
There are a few possible solutions:
Maybe these two behaviours could be controlled by a flag.
> python -VV
Python 3.7.4 (tags/v3.7.4:e09359112e, Jul 8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)]
> poetry add --dev com2ann
Using version ^0.1.1 for com2ann
Updating dependencies
Resolving dependencies...
Writing lock file
Package operations: 1 install, 0 updates, 0 removals
- Installing com2ann (0.1.1)
> com2ann tests\conftest.py
File: tests\conftest.py
INTERNAL ERROR while processing tests\conftest.py
Please report bug at https://github.com/ilevkivskyi/com2ann/issues
Traceback (most recent call last):
  File "d:\dev\python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "d:\dev\python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "D:\...\venv\Scripts\com2ann.exe\__main__.py", line 7, in <module>
  File "d:\...\venv\lib\site-packages\com2ann.py", line 877, in main
    translate_file(args.infile, args.outfile, options)
  File "d:\...\venv\lib\site-packages\com2ann.py", line 819, in translate_file
    python_minor_version=options.python_minor_version)
  File "d:\...\venv\lib\site-packages\com2ann.py", line 772, in com2ann
    feature_version=python_minor_version)
TypeError: parse() got an unexpected keyword argument 'type_comments'
>
It happens when ast.parse(...) is being called with the type_comments kwarg. ast.py is at D:\dev\Python37\Lib\ast.py.
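The type_comments argument of ast.parse was added in Python 3.8, so the TypeError is what you get when the tool runs under 3.7. A small sketch of a guard that would turn the crash into a readable message; the function name and the wording of the message are my own, not something com2ann provides.

```python
import ast
import sys


def parse_with_type_comments(source):
    """Parse source with type comments, failing early on old interpreters.

    Hypothetical helper for illustration only.
    """
    if sys.version_info < (3, 8):
        # ast.parse() only accepts type_comments from Python 3.8 onwards.
        raise RuntimeError("parsing type comments requires Python 3.8 or newer")
    return ast.parse(source, type_comments=True)


tree = parse_with_type_comments("x = 1  # type: int\n")
```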
This should guard against newly appeared forward references.
When assigning to a nonlocal or global variable, the translation contains syntax errors, because these variables cannot be annotated.
Example
class C:
    global x
    x = 2  # type: int
This translates to
class C:
    global x
    x: int = 2
Running this results in
  File "t.py", line 3
    x: int = 2
    ^
SyntaxError: annotated name 'x' can't be global
Similar result if x is nonlocal rather than global.
First of all, you need to find the global or nonlocal statement for the variable in its enclosing scope. Then you will know that the variable cannot be annotated at that point.
Next, decide how you want to handle it, and then implement it. Some possibilities are:
I am writing a package which performs analysis tasks related to scopes and namespaces in a Python program. It is still a work in progress. However, there is already enough functionality that you could find useful, not only for fixing the present bug but for other tasks as well.
You can find it at mrolle45/scopetools. Please let me know if you find it interesting, and I'd like to work with you in incorporating my code into your code.
scopetools will create a tree of Scope objects representing the scopes found in the ast.Module tree. The Scope will give you the status of a variable name including which Scope actually owns it. It also points to the ast object for that scope, so you can traverse it. I plan to provide a method to traverse the ast omitting any nested scopes. That way, you can discover all the type comments in the module and the scopes that contain them.
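Independently of scopetools, the first step (finding the global or nonlocal statements that belong to a scope) can be sketched with the standard ast module alone. This is only an illustration under my own assumptions; the function name is made up, and it only looks at declarations made directly in the given scope, not in nested ones.

```python
import ast

# Node types that start a new scope; their bodies are not searched.
_SCOPE_NODES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef, ast.Lambda)


def declared_global_or_nonlocal(scope):
    """Return the names declared global or nonlocal directly inside scope."""
    names = set()
    stack = list(ast.iter_child_nodes(scope))
    while stack:
        node = stack.pop()
        if isinstance(node, (ast.Global, ast.Nonlocal)):
            names.update(node.names)
        elif isinstance(node, _SCOPE_NODES):
            continue  # nested scope: its declarations do not apply here
        else:
            stack.extend(ast.iter_child_nodes(node))
    return names


source = "class C:\n    global x\n    x = 2  # type: int\n"
class_def = ast.parse(source).body[0]
blocked = declared_global_or_nonlocal(class_def)
```

A translator could consult such a set before rewriting an assignment and leave the type comment alone (or handle it differently) when the target name is in it.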
For example this crashes:
def outer():
    # type: () -> None
    @wrapper
    def inner():
        # type: () -> None
        pass
I could be wrong, but following the code, whatever is put in --python-minor-version, as an int, is passed to ast.parse(feature_version=<>).
The doc of ast.parse says:
Also, setting feature_version to a tuple (major, minor) will attempt to parse using that Python version’s grammar. Currently major must equal to 3. For example, setting feature_version=(3, 4) will allow the use of async and await as variable names. The lowest supported version is (3, 4); the highest is sys.version_info[0:2].
I think we should be doing ast.parse(feature_version=(3, python_minor_version)).
Can you confirm?
Thank you!
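To make the suggestion concrete, here is a small sketch of the documented tuple form. The wrapper function is hypothetical and only mirrors the python_minor_version name seen in the traceback above; it is not a patch against com2ann.

```python
import ast
import sys


def parse_for_minor_version(source, python_minor_version):
    """Parse source using the grammar of Python 3.<python_minor_version>."""
    return ast.parse(
        source,
        type_comments=True,
        # Documented form: a (major, minor) tuple, with major equal to 3.
        feature_version=(3, python_minor_version),
    )


# Use the running interpreter's minor version so the call is always in range.
tree = parse_for_minor_version("x = 1  # type: int\n", sys.version_info.minor)
```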
Hi!
Skype?
pip install -v com2amm
Using pip 20.2.1 from /home/sebastian/Repos/ceph/src/pybind/mgr/venv38/lib/python3.8/site-packages/pip (python 3.8)
Non-user install because user site-packages disabled
Created temporary directory: /tmp/pip-ephem-wheel-cache-yrrlga4n
Created temporary directory: /tmp/pip-req-tracker-p88wxv0i
Initialized build tracking at /tmp/pip-req-tracker-p88wxv0i
Created build tracker: /tmp/pip-req-tracker-p88wxv0i
Entered build tracker: /tmp/pip-req-tracker-p88wxv0i
Created temporary directory: /tmp/pip-install-2skuyzet
1 location(s) to search for versions of com2amm:
* https://pypi.org/simple/com2amm/
Fetching project page and analyzing links: https://pypi.org/simple/com2amm/
Getting page https://pypi.org/simple/com2amm/
Found index url https://pypi.org/simple
Looking up "https://pypi.org/simple/com2amm/" in the cache
Request header has "max_age" as 0, cache bypassed
Starting new HTTPS connection (1): pypi.org:443
https://pypi.org:443 "GET /simple/com2amm/ HTTP/1.1" 404 13
Status code 404 not in (200, 203, 300, 301)
Could not fetch URL https://pypi.org/simple/com2amm/: 404 Client Error: Not Found for url: https://pypi.org/simple/com2amm/ - skipping
Given no hashes to check 0 links for project 'com2amm': discarding no candidates
ERROR: Could not find a version that satisfies the requirement com2amm (from versions: none)
ERROR: No matching distribution found for com2amm
Exception information:
Traceback (most recent call last):
File "/home/sebastian/Repos/ceph/src/pybind/mgr/venv38/lib/python3.8/site-packages/pip/_internal/cli/base_command.py", line 216, in _main
status = self.run(options, args)
File "/home/sebastian/Repos/ceph/src/pybind/mgr/venv38/lib/python3.8/site-packages/pip/_internal/cli/req_command.py", line 182, in wrapper
return func(self, options, args)
File "/home/sebastian/Repos/ceph/src/pybind/mgr/venv38/lib/python3.8/site-packages/pip/_internal/commands/install.py", line 324, in run
requirement_set = resolver.resolve(
File "/home/sebastian/Repos/ceph/src/pybind/mgr/venv38/lib/python3.8/site-packages/pip/_internal/resolution/legacy/resolver.py", line 183, in resolve
discovered_reqs.extend(self._resolve_one(requirement_set, req))
File "/home/sebastian/Repos/ceph/src/pybind/mgr/venv38/lib/python3.8/site-packages/pip/_internal/resolution/legacy/resolver.py", line 388, in _resolve_one
abstract_dist = self._get_abstract_dist_for(req_to_install)
File "/home/sebastian/Repos/ceph/src/pybind/mgr/venv38/lib/python3.8/site-packages/pip/_internal/resolution/legacy/resolver.py", line 339, in _get_abstract_dist_for
self._populate_link(req)
File "/home/sebastian/Repos/ceph/src/pybind/mgr/venv38/lib/python3.8/site-packages/pip/_internal/resolution/legacy/resolver.py", line 305, in _populate_link
req.link = self._find_requirement_link(req)
File "/home/sebastian/Repos/ceph/src/pybind/mgr/venv38/lib/python3.8/site-packages/pip/_internal/resolution/legacy/resolver.py", line 270, in _find_requirement_link
best_candidate = self.finder.find_requirement(req, upgrade)
File "/home/sebastian/Repos/ceph/src/pybind/mgr/venv38/lib/python3.8/site-packages/pip/_internal/index/package_finder.py", line 926, in find_requirement
raise DistributionNotFound(
pip._internal.exceptions.DistributionNotFound: No matching distribution found for com2amm
Removed build tracker: '/tmp/pip-req-tracker-p88wxv0i'
$ pip --version
pip 20.2.1 from /home/sebastian/Repos/ceph/src/pybind/mgr/venv38/lib/python3.8/site-packages/pip (python 3.8)
Looks like pip needs some new mandatory information? Might be related to https://pip.pypa.io/en/stable/user_guide/#changes-to-the-pip-dependency-resolver-in-20-2-2020
Hi @ilevkivskyi - I was wondering if you could release a new version on PyPI? It would be lovely to have the python-requires metadata reflect actual compatibility, and the annotated-type-ignore support would also be great.
Example program:
def f(x: bool
      ) -> foo:
    # type: (int) -> str
    pass
Result of com2ann:
def f(x: bool: int
      ) -> str:
    pass
The x: bool: int is a SyntaxError.
In this situation, there are various ways to handle it:
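Whichever way is chosen, the conflict first has to be detected. A rough sketch of such a check, assuming Python 3.8+ for type_comments; the function name is invented for this example and nothing here reflects how com2ann actually works.

```python
import ast


def functions_with_conflicting_annotations(source):
    """List (name, line) of functions that have both a type comment and annotations."""
    tree = ast.parse(source, type_comments=True)
    conflicts = []
    for node in ast.walk(tree):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if not node.type_comment:
            continue
        args = node.args
        all_args = args.posonlyargs + args.args + args.kwonlyargs
        if args.vararg is not None:
            all_args.append(args.vararg)
        if args.kwarg is not None:
            all_args.append(args.kwarg)
        # Any existing annotation would collide with the translated comment.
        if node.returns is not None or any(a.annotation is not None for a in all_args):
            conflicts.append((node.name, node.lineno))
    return conflicts


example = "def f(x: bool\n      ) -> foo:\n    # type: (int) -> str\n    pass\n"
found = functions_with_conflicting_annotations(example)
```

Functions reported this way could then be skipped with a warning, or have their existing annotations replaced, depending on the chosen policy.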
So that one can just type com2ann myfile.py.
When the command is called with multiple files (e.g. com2ann f1.py f2.py), the first file (f1.py) is chosen as the output for both files.
This is especially problematic when a pre-commit hook is used.
For example:
def func(arg1: SomeVeryLongType1,
         arg2: SomeVeryLongType2) -> Dict[str,
                                          int]:
    ...
This is not a syntax error, but looks ugly.
We could support at least Google style and Numpydoc
Thanks for this tool, it's really useful :)
If you have code like this:
foo = object()
bar = (
    # Comment which explains why this ignored
    foo.quox  # type: ignore[attribute]
)  # type: Mapping[str, Distribution]
Then after running com2ann you end up with:
foo = object()
bar: Mapping[str, int] = (
    # Comment which explains why this ignored
    foo.quox  # type: ignore[attribute]
)  # type: Mapping[str, int]
which mypy then complains about due to the double signature.
The intermediate (explanatory) comment doesn't seem to be related, though the type: ignore comment is.
I propose that you handle all type comments in the target program. That means with and for statements and all degrees of complexity in assignments. Comments that are ill-formed should be reported and left alone. Otherwise the comments will be stripped and the appropriate annotations inserted to replace them.
I have a package scopetools which can provide lots of help. It is still under development. I'd like to work with you in enhancing scopetools and integrating it into com2ann, so please let me know if you are interested. It can find all the type comments and their containing scopes, and the owning scopes for global and nonlocal names. It can split complex target assignments, or type comments, into simpler items (names, attributes, and subscripts).
Here's the basic strategy:
Every statement has a target (or multiple targets for an assignment) and a type. With multiple targets, the same type applies to each target. The type is the type comment, parsed as an expression. Annotating the target with the type is the same process as assigning the type to the target. That is, if the target is a packing (a tuple or list of subtargets), the subtargets are annotated with subtypes derived from iterating on the type, taking into account a possible starred subtarget. If the number of subtypes is wrong, this is a malformed type comment, just like an assignment statement. This is done recursively.
The result is that you have a set of (sub)targets and (sub)types, where each target is an ast.Name, ast.Subscript, or ast.Attribute.
Attribute and Subscript annotations can just be inserted before the original statement.
Name annotations are more complicated:
The target for a statement is as follows:
Assignment: target [ = target ... ] = value # type: typeexpr
Each target is a target and is annotated with the entire typeexpr.
Examples:
w, (x.a, (y[0], z)) = value # type: t1, (t2, (t3, t4))
This annotates w, x.a, y[0], and z with t1, t2, t3, and t4, respectively.
t = w, (x.a, (y[0], z)) = value # type: t1, (t2, (t3, t4))
Same as above, and also annotates t with (t1, (t2, (t3, t4))).
For: for target in iterable: # type: typeexpr
target is the target.
With: with context [ as target ] [ , context [ as target ] ... ]: # type: typeexpr
With no as target clause, this statement is ignored.
With one as target clause, target is the target.
With more than one as target clause, the target is the tuple (target, target [ , target ... ]). The tuple contains only the targets from all the as target clauses that are present.
Examples:
with c1 as t1, c2, c3 as t3.x: # type: type1, type3
This annotates t1 with type1 and t3.x with type3.
with c1 as t1, c2, c3: # type: type1
This annotates t1 with type1.
with c1 as t1: # type: type1
This annotates t1 with type1.
with c1 as t1: # type: type1, type2
This annotates t1 with (type1, type2). The tuple is not unpacked.
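The recursive pairing of (sub)targets with (sub)types for assignments can be sketched roughly as follows. This is only an illustration of the strategy, not code from scopetools or com2ann: the function names are made up, starred subtargets are left out, and ast.unparse requires Python 3.9+.

```python
import ast


def pair_target(target, type_expr):
    """Recursively pair a target with its type; both are ast expression nodes."""
    if isinstance(target, (ast.Tuple, ast.List)):
        # A packing: the type must be a tuple with one subtype per subtarget.
        if not isinstance(type_expr, ast.Tuple) or len(target.elts) != len(type_expr.elts):
            raise ValueError("malformed type comment")
        pairs = []
        for sub_target, sub_type in zip(target.elts, type_expr.elts):
            pairs.extend(pair_target(sub_target, sub_type))
        return pairs
    # A name, attribute, or subscript: annotated with the whole (sub)type.
    return [(ast.unparse(target), ast.unparse(type_expr))]


def pair_assignment(statement, type_comment):
    """Pair every target of an assignment statement with the parsed type comment."""
    assign = ast.parse(statement).body[0]
    type_expr = ast.parse(type_comment, mode="eval").body
    pairs = []
    for target in assign.targets:  # the same type applies to each target
        pairs.extend(pair_target(target, type_expr))
    return pairs


pairs = pair_assignment("w, (x.a, (y[0], z)) = value", "t1, (t2, (t3, t4))")
```

For the chained form t = w, (x.a, (y[0], z)) = value, the same function additionally yields a pair for t with the entire type expression.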
Here is the specification we should support (from PEP 484):
def add(a, b):  # type: (int, int) -> int
    return a + b
def embezzle(self, account, funds=1000000, *fake_receipts):
    # type: (str, int, *str) -> None  #note vararg
def load_cache(self):
    # type: () -> bool  #note self
def send_email(address, sender, cc, bcc, subject, body):
    # type: (...) -> bool
def send_email(address,     # type: Union[str, List[str]]
               sender,      # type: str
               cc,          # type: Optional[List[str]]
               bcc,         # type: Optional[List[str]]
               subject='',
               body=None    # type: List[str]
               ):
    # type: (...) -> bool
I propose to not be very strict about whitespace and accept less strict formatting.
According to the readme,
def apply(self, value, **opts):
    # type: (str, **bool) -> str
    ...
is converted to
def apply(self, value: str, **opts: str) -> str:
    ...
I assume you meant opts to be **str in the comment, or maybe **opts: bool in the annotation...
It would be much easier to use this tool if it accepted multiple files as input. I imagine that in that case, -o would be forbidden to avoid ambiguity.
I might do a PR, just wanted to check if it would be welcome before doing it.
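Roughly what I have in mind for the argument handling, as a standalone argparse sketch. The program name, the infiles argument, and the error text are placeholders of mine and do not describe the current com2ann command line.

```python
import argparse


def parse_args(argv):
    """Parse a hypothetical multi-file command line (illustration only)."""
    parser = argparse.ArgumentParser(prog="com2ann-sketch")
    parser.add_argument("infiles", nargs="+", help="one or more input files")
    parser.add_argument("-o", "--outfile", help="output file (single input only)")
    args = parser.parse_args(argv)
    if args.outfile and len(args.infiles) > 1:
        # Ambiguous: one output file cannot receive several translated inputs.
        parser.error("-o cannot be used with multiple input files")
    return args


args = parse_args(["f1.py", "f2.py"])
```

Without -o, each file would simply be rewritten in place.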