dropbox / pyannotate

Auto-generate PEP-484 annotations

License: Apache License 2.0

pyannotate's Introduction

PyAnnotate: Auto-generate PEP-484 annotations

Insert annotations into your source code based on call arguments and return types observed at runtime.

For license and copyright see the end of this file.

Blog post: http://mypy-lang.blogspot.com/2017/11/dropbox-releases-pyannotate-auto.html

How to use

Phase 1: Collecting types at runtime

Install the usual way (see "red tape" section below)
Add from pyannotate_runtime import collect_types to your test
Early in your test setup, call collect_types.init_types_collection()
Bracket your test execution between calls to collect_types.start() and collect_types.stop() (or use the context manager below)
When done, call collect_types.dump_stats(filename)

All calls between the start() and stop() calls will be analyzed and the observed types will be written (in JSON form) to the filename you pass to dump_stats(). You can have multiple start/stop pairs per dump call.

If you'd like to automatically collect types when you run pytest, see example/example_conftest.py and example/README.md.

Instead of using start() and stop() you can also use a context manager:

collect_types.init_types_collection()
with collect_types.collect():
    <your code here>
collect_types.dump_stats(<filename>)

Phase 2: Inserting types into your source code

The command-line tool pyannotate can add annotations into your source code based on the annotations collected in phase 1. The key arguments are:

Use --type-info FILE to tell it the file you passed to dump_stats()
Positional arguments are source files you want to annotate
With no other flags the tool will print a diff indicating what it proposes to do but won't do anything. Review the output.
Add -w to make the tool actually update your files. (Use git or some other way to keep a backup.)

At this point you should probably run mypy and iterate. You probably will have to tweak the changes to make mypy completely happy.

Notes and tips

It's best to do one file at a time, at least until you're comfortable with the tool.
The tool doesn't touch functions that already have an annotation.
The tool can generate either of:
- type comments, i.e. Python 2 style annotations
- inline type annotations, i.e. Python 3 style annotations, using --py3 in v1.0.7+

Red tape

Installation

This should work for Python 2.7 as well as for Python 3.4 and higher.

pip install pyannotate

This installs several items:

A runtime module, pyannotate_runtime/collect_types.py, which collects and dumps types observed at runtime using a profiling hook.
A library package, pyannotate_tools, containing code that can read the data dumped by the runtime module and insert annotations into your source code.
An entry point, pyannotate, which runs the library package on your files.
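
As a rough illustration of what "collects types using a profiling hook" means, here is a minimal sketch built on the standard library's sys.setprofile. It is not pyannotate's implementation; the names observed, _pending and _hook are made up for this example, and it ignores sampling, filename filtering and nested container types.

```python
import sys
from collections import defaultdict

# function name -> set of observed "(arg types) -> return type" strings
observed = defaultdict(set)
# id(frame) -> argument type names recorded when the call started
_pending = {}


def _type_name(value):
    # Type comments spell the type of None as "None", not "NoneType".
    return "None" if value is None else type(value).__name__


def _hook(frame, event, arg):
    code = frame.f_code
    if event == "call":
        # Positional parameters come first in co_varnames.
        names = code.co_varnames[:code.co_argcount]
        _pending[id(frame)] = [_type_name(frame.f_locals[n]) for n in names]
    elif event == "return":
        arg_types = _pending.pop(id(frame), None)
        if arg_types is not None:
            # For a "return" event, arg is the value being returned.
            observed[code.co_name].add(
                "(%s) -> %s" % (", ".join(arg_types), _type_name(arg))
            )


def gcd(a, b):
    while b:
        a, b = b, a % b
    return a


sys.setprofile(_hook)   # start observing Python-level calls
gcd(15, 10)
sys.setprofile(None)    # stop observing

print(dict(observed))
```

The hook only sees calls made while it is installed, which is why the collected types reflect whatever code paths your tests happen to exercise.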

For dependencies, see setup.py and requirements.txt.

Testing etc.

To run the unit tests, use pytest:

pytest

TO DO

We'd love your help with some of these issues:

Better documentation.
Python 3 code generation.
Refactor the tool modules (currently its legacy architecture shines through).

Acknowledgments

The following people contributed significantly to this tool:

Tony Grue
Sergei Vorobev
Jukka Lehtosalo
Guido van Rossum

License etc.

License: Apache 2.0.
Copyright attribution: Copyright (c) 2017 Dropbox, Inc.
External contributions to the project should be subject to Dropbox's Contributor License Agreement (CLA): https://opensource.dropbox.com/cla/

pyannotate's Issues

Map 'file' to 'IO[bytes]'

Methods that take a file-like object end up being typed as file when mypy expects something like IO[bytes].

Fail to apply typeinfo to a python module in a subdirectory

The following fails:

> python toto\toto.py 
(this generates a type_info.json in the current directory)

>dir
[...]
25/07/2018  06:22    <DIR>          toto
25/07/2018  06:21               784 type_info.json

>pyannotate -3 toto\toto.py --type-info type_info.json
No files need to be modified.
NOTE: this was a dry run; use -w to write files

Strange: type_info.json does contain type annotations with the correct path.

>type type_info.json
[
    {
        "path": "toto\\toto.py",
        "line": 2,
        "func_name": "add",
        "type_comments": [
            "(*int) -> int",
            "(*List[int]) -> List[int]",
            "(*Tuple[int, int]) -> Tuple[int, int]"
        ],
        "samples": 3
    },
    {
        "path": "toto\\toto.py",
        "line": 8,
        "func_name": "add2",
        "type_comments": [
            "(Tuple[int, int], Tuple[int, int]) -> Tuple[int, int, int, int]",
            "(List[int], List[int]) -> List[int]",
            "(int, int) -> int"
        ],
        "samples": 3
    },
    {
        "path": "toto\\toto.py",
        "line": 11,
        "func_name": "main",
        "type_comments": [
            "() -> None"
        ],
        "samples": 1
    }
]

Edit type_info.json to remove the "toto\" prefix from each path:

>type type_info.json
[
    {
        "path": "toto.py",
        "line": 2,
        "func_name": "add",
        "type_comments": [
            "(*int) -> int",
            "(*List[int]) -> List[int]",
            "(*Tuple[int, int]) -> Tuple[int, int]"
        ],
        "samples": 3
    },
    {
        "path": "toto.py",
        "line": 8,
        "func_name": "add2",
        "type_comments": [
            "(Tuple[int, int], Tuple[int, int]) -> Tuple[int, int, int, int]",
            "(List[int], List[int]) -> List[int]",
            "(int, int) -> int"
        ],
        "samples": 3
    },
    {
        "path": "toto.py",
        "line": 11,
        "func_name": "main",
        "type_comments": [
            "() -> None"
        ],
        "samples": 1
    }
]

Try again:

>pyannotate -3 toto\toto.py --type-info type_info.json
Refactored toto\toto.py
--- toto\toto.py        (original)
+++ toto\toto.py        (refactored)
@@ -1,14 +1,18 @@
+from typing import Any
+from typing import List
+from typing import Tuple
+from typing import Union

-def add(*args):
+def add(*args: Any) -> Union[List[int], Tuple[int, int], int]:
     ret = args[0]
     for v in args:
         ret += v
     return v

-def add2(v1, v2):
+def add2(v1: Union[List[int], Tuple[int, int], int], v2: Union[List[int], Tuple[int, int], int]) -> Union[List[int], Tuple[int, int, int, int], int]:
     return v1+v2

-def main():
+def main() -> None:
     print( add(1,2,3) )
     print( add([1,2], [3,4]) )
     print( add((1,2), (3,4)) )
Files that need to be modified:
toto\toto.py
NOTE: this was a dry run; use -w to write files

It worked...

It looks like pyannotate is trimming directories from type_info.json too aggressively.
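
To make the expected behaviour concrete, here is a small sketch of matching a source file against the "path" entries by comparing normalized full paths rather than trimmed ones. The helper names normalize_recorded_path and find_entries are hypothetical and this is not how pyannotate currently does its matching.

```python
import posixpath


def normalize_recorded_path(path):
    """Return a comparable key for a path written with either separator."""
    # Treat "\\" and "/" as equivalent, then collapse "." and ".." parts.
    return posixpath.normpath(path.replace("\\", "/"))


def find_entries(entries, source_file):
    """Return the type_info entries recorded for exactly this source file."""
    wanted = normalize_recorded_path(source_file)
    return [e for e in entries if normalize_recorded_path(e["path"]) == wanted]


entries = [
    {"path": "toto\\toto.py", "line": 2, "func_name": "add"},
    {"path": "toto\\toto.py", "line": 8, "func_name": "add2"},
]

# The subdirectory is kept on both sides, so the entries are found.
print([e["func_name"] for e in find_entries(entries, "toto\\toto.py")])
```

With this kind of comparison, the unedited type_info.json from the report would match toto\toto.py directly.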

Is Async supported?

When I ran the script on my codebase, I noticed that none of my async functions were classified.

Are they supposed to be supported, or is support for them on your roadmap by any chance?

Thank you

Possible naming improvements

I was a bit surprised when approaching the package by the complexity of the naming.

What surprised me:

The tool is named pyannotate, but you must import pyannotate_runtime to use it.
collect_types.init_types_collection(): does the name really need to be that long? And do we really need to init explicitly? Why not initialize on the first resume()?
dump_stats() is OK, but dump() would be fine too.

Suggestions:

Allow importing directly from pyannotate.
Only three functions: start(), stop(), dump().
Initialize on the first start(), or rename init_types_collection() to init().

By the way, I am looking into contributing to pyannotate.

Crash in collect_types _removeHandlerRef

With the workaround from #13 applied I get this error:

  File "/usr/lib/python3.5/logging/__init__.py", line 725, in _removeHandlerRef
  File "/usr/local/lib/python3.5/dist-packages/pyannotate_runtime/collect_types.py", line 681, in _trace_dispatch
AttributeError: 'NoneType' object has no attribute 'get'

I suppose that the check is slightly broken there. This fixes it for me:

    key = id(code)
    if not sampling_counters: # add this if to prevent the crash
        return
    n = sampling_counters.get(key, 0)
    if n is None:
        return

After this and #13, collection of types works in Python 3.5 on Ubuntu 16.04.

Might be some things to learn from paulross/typin

Here is a similar project with some of the same goals as pyannotate: https://github.com/paulross/typin

There might be something useful in there for you. Injecting type aware documentation strings for example: https://github.com/paulross/typin/blob/master/src/typin/types.py#L537

Now that I have discovered pyannotate I doubt I will progress typin as it was just a side project!

Creates "from main" import for main program

Given t-pyannotate.py:

from pyannotate_runtime import collect_types


collect_types.init_types_collection()
collect_types.start()


class G:
    pass


def c(obj):
    pass


c(G())

collect_types.stop()
collect_types.dump_stats("type_info.json")

Running it and writing the annotations produces the following diff, in which pyannotate tries to import G from __main__:

--- t-pyannotate.py     (original)
+++ t-pyannotate.py     (refactored)
@@ -1,4 +1,5 @@
 from pyannotate_runtime import collect_types
+from __main__ import G


 collect_types.init_types_collection()
@@ -10,6 +11,7 @@


 def c(obj):
+    # type: (G) -> None
     pass

type_info.json:

[
    {
        "path": "t-pyannotate.py",
        "line": 8,
        "func_name": "G",
        "type_comments": [
            "() -> None"
        ],
        "samples": 1
    },
    {
        "path": "t-pyannotate.py",
        "line": 12,
        "func_name": "c",
        "type_comments": [
            "(__main__.G) -> None"
        ],
        "samples": 1
    }
]

pyannotate b7f96ca (current master).

Traceback when type_info.json not found

E.g.

$ pyannotate foo
Traceback (most recent call last):
  File "/usr/local/bin/pyannotate", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.4/site-packages/pyannotate_tools/annotations/__main__.py", line 92, in main
    data = generate_annotations_json_string(infile)  # type: List[Any]
  File "/usr/local/lib/python3.4/site-packages/pyannotate_tools/annotations/main.py", line 51, in generate_annotations_json_string
    items = parse_json(source_path)
  File "/usr/local/lib/python3.4/site-packages/pyannotate_tools/annotations/parse.py", line 102, in parse_json
    with open(path) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'type_info.json'
$

This should give a proper error message e.g. "Raw type info file 'type_info.json' not found" instead of a traceback.
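
A minimal sketch of the requested behaviour, assuming Python 3 (where FileNotFoundError exists). The function name load_type_info is hypothetical and does not correspond to an existing pyannotate function.

```python
import json
import sys


def load_type_info(path):
    """Load the raw type info JSON, exiting with a short message if it is missing."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        # sys.exit() with a string prints it to stderr and exits with status 1,
        # so the user sees one line instead of a traceback.
        sys.exit("Raw type info file %r not found" % path)
```

Calling load_type_info("type_info.json") in a directory without that file would then end the program with the one-line message suggested above.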

Skip functions whose name starts with '<'

Those include lambdas, generator functions and comprehensions, and possibly more (the code indicates <module> as a possible name that's filtered out in a later stage).
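
For reference, a short sketch of where such names come from and how a filter could look. The helper is_synthetic_name is a made-up name for this example; whether the check belongs in the collector or in the tool is left open.

```python
def is_synthetic_name(func_name):
    """True for compiler-generated code names such as <lambda> or <genexpr>."""
    return func_name.startswith("<")


square = lambda x: x * x
gen = (n for n in range(3))

# The compiler assigns these names to the underlying code objects.
print(square.__code__.co_name)   # <lambda>
print(gen.gi_code.co_name)       # <genexpr>

recorded = ["gcd", square.__code__.co_name, gen.gi_code.co_name]
print([name for name in recorded if not is_synthetic_name(name)])
```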

KeyError in collect_types prep_args

Annotation of the following method:

def method():
    d = {1: {1: 2}}
    return {
        i: {
            (i, k): l
            for k, l in j.iteritems()
        }
        for i, j in d.iteritems()
    }

fails with

file.py:25: in test_pyannotate_mocks
    method()
file.py:16: in method
    for i, j in d.iteritems()
file.py:12: in <dictcomp>
    i: {
pyannotate_runtime/collect_types.py:738: in _trace_dispatch
    resolved_types = prep_args(arg_info)
pyannotate_runtime/collect_types.py:475: in prep_args
    resolved_type = resolve_type(arg_info.locals[arg])
E   KeyError: 'j'

Not sure if it's a dupe of #13, but callstack looks pretty similar

Crash in collect_types prep_args

When collecting types on Ubuntu 16.04 with Python 3.5 on the partially annotated Mission Pinball Framework codebase (https://github.com/missionpinball/mpf/), I get this error:

  File "/usr/local/lib/python3.5/dist-packages/pyannotate_runtime/collect_types.py", line 719, in _trace_dispatch
    resolved_types = prep_args(arg_info)
  File "/usr/local/lib/python3.5/dist-packages/pyannotate_runtime/collect_types.py", line 474, in prep_args
    resolved_type = resolve_type(arg_info.locals[arg])
KeyError: 'source'

Changing if not isinstance(arg, (list, dict)): to if not isinstance(arg, (list, dict)) and arg in arg_info.locals: fixes this for me. I don't know if that is a proper solution.

Pyannotate crashes when type_info.json has info about junitxml.py

Running pyannotate with the following json file:

[
    {
        "type_comments": [
            "(py._xmlgen.system-err) -> None", 
            "(py._xmlgen.system-out) -> None"
        ], 
        "path": "<path_python_lib>/junitxml.py", 
        "line": 78, 
        "samples": 5, 
        "func_name": "_NodeReporter.append"
    }
]

fails with

Traceback (most recent call last):
  File "/usr/local/bin/pyannotate", line 11, in <module>
    sys.exit(main())
  File "/Library/Python/2.7/site-packages/pyannotate_tools/annotations/__main__.py", line 45, in main
    generate_annotations_json(infile, tf.name)
  File "/Library/Python/2.7/site-packages/pyannotate_tools/annotations/main.py", line 37, in generate_annotations_json
    arg_types, return_type = infer_annotation(item.type_comments)
  File "/Library/Python/2.7/site-packages/pyannotate_tools/annotations/infer.py", line 38, in infer_annotation
    arg_types, return_type = parse_type_comment(comment)
  File "/Library/Python/2.7/site-packages/pyannotate_tools/annotations/parse.py", line 196, in parse_type_comment
    return Parser(comment).parse()
  File "/Library/Python/2.7/site-packages/pyannotate_tools/annotations/parse.py", line 205, in __init__
    self.tokens = tokenize(comment)
  File "/Library/Python/2.7/site-packages/pyannotate_tools/annotations/parse.py", line 188, in tokenize
    raise ParseError(original)
pyannotate_tools.annotations.parse.ParseError: Invalid type comment: (py._xmlgen.system-err) -> None

Please note that I'm not trying to annotate junitxml.py. pyannotate simply fails whenever the JSON file contains this entry.
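
One possible way to tolerate such entries is to check type names before parsing and skip the ones that cannot be valid Python names. This is only a sketch, assuming Python 3 (str.isidentifier); the helper names are hypothetical and not part of pyannotate.

```python
def is_importable_name(dotted_name):
    """True if every dot-separated part is a valid Python identifier."""
    return all(part.isidentifier() for part in dotted_name.split("."))


def usable_type_names(names):
    """Keep only the names that could appear in a type comment."""
    return [name for name in names if is_importable_name(name)]


# "system-err" contains a hyphen, so it can never be written as a type name.
print(is_importable_name("py._xmlgen.system-err"))
print(usable_type_names(["typing.List", "py._xmlgen.system-out", "int"]))
```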

Add an option to pyannotate to dump raw annotations without trying to edit the file

We've got an internal use case where lib2to3 can't parse the target file (oh, okay, it's pyxl :-). The info in type_info.json would still be useful. Maybe we can add a flag to just dump the annotations in a nicer format than having to read type_info.json by hand.

API to reset all; API to wrap all

The init_types_collection() function installs the tracing callbacks and sets the filename filter, but doesn't reset the globals containing collected data. Maybe it should also do that. Or maybe we need a separate API to do that. (Currently the pyannotate unit tests reset a whole bunch of globals at the start of each test -- this should become a single call.)

Second, with collect_types.collect() only calls resume() and pause() -- we could use another context manager that calls init_types_collection() and dump_stats() as well.

(Note that the non-context-manager APIs are also important, e.g. for use in test fixtures, where there are traditionally separate setUp() and tearDown() methods.)
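
A sketch of what the "wrap all" context manager could look like. The name collecting_types is hypothetical. To keep the example independent of the installed package, it takes the collector as a parameter; any object with the init_types_collection(), start(), stop() and dump_stats() functions from the README will do, including the pyannotate_runtime.collect_types module itself.

```python
from contextlib import contextmanager


@contextmanager
def collecting_types(collector, filename):
    """Initialize collection, collect during the block, then dump to filename."""
    collector.init_types_collection()
    collector.start()
    try:
        yield
    finally:
        # Stop and dump even if the block raised, so partial data is kept.
        collector.stop()
        collector.dump_stats(filename)


# Intended usage (requires pyannotate to be installed):
#
#     from pyannotate_runtime import collect_types
#     with collecting_types(collect_types, "type_info.json"):
#         run_my_tests()   # placeholder for your own code
```

Note that this does not address resetting previously collected data, which would still need support from the runtime module.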

PyAnnotate ignores existing type annotations for partially annotated functions

If a function's arguments are annotated but the return type is missing, for example:

def foo(
    arg1,  # type: Optional[int]
    arg2,  # type: Optional[str]
):
    return "result"

PyAnnotate ignores existing annotations, and adds a separate one:

def foo(
    arg1,  # type: Optional[int]
    arg2,  # type: Optional[str]
):
    # type: (None, None) -> str
    return "result"

Moreover, if we have more arguments,

def foo(
    argument1,  # type: Optional[int]
    argument2,  # type: Optional[str]
    argument3,  # type: Optional[str]
    argument4,  # type: Optional[str]
    argument5,  # type: Optional[str]
    argument6,  # type: Optional[str]
):
    return "result"

is annotated as

def foo(
    argument1,  # type: None  # type: Optional[int]
    argument2,  # type: None  # type: Optional[str]
    argument3,  # type: None  # type: Optional[str]
    argument4,  # type: None  # type: Optional[str]
    argument5,  # type: None  # type: Optional[str]
    argument6,  # type: None  # type: Optional[str]
    ):
    # type: (...) -> str
    return "result"

Infer number of callable arguments

For callable values, we could infer the number of arguments at least in simple cases where the callable takes a fixed number of positional arguments. The argument types could still be Any, but having the number of arguments would still be helpful.
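
A possible starting point, using inspect.signature (Python 3) to handle only the simple case described above. The function names are made up for this sketch, and anything with defaults, *args, **kwargs or keyword-only parameters falls back to a bare Callable.

```python
import inspect


def fixed_positional_count(func):
    """Return the number of required positional parameters, or None if not simple."""
    try:
        params = inspect.signature(func).parameters.values()
    except (TypeError, ValueError):
        return None  # no introspectable signature (e.g. some builtins)
    count = 0
    for p in params:
        if p.kind in (p.VAR_POSITIONAL, p.VAR_KEYWORD, p.KEYWORD_ONLY):
            return None
        if p.default is not inspect.Parameter.empty:
            return None
        count += 1
    return count


def callable_annotation(func):
    """Render Callable[[Any, ...], Any] when the argument count is known."""
    n = fixed_positional_count(func)
    if n is None:
        return "Callable"
    return "Callable[[%s], Any]" % ", ".join(["Any"] * n)


print(callable_annotation(lambda a, b: a + b))
print(callable_annotation(lambda *args: None))
```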

Always add Optional[] when an arg has a default of None

Methods that have arg=None arguments are sometimes not marked as Optional because they are never passed a None parameter during the run. It would be nice if it could infer that.
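
A sketch of the idea, assuming the function object is available for inspection (the real tool works on source code, so it would look at the default in the syntax tree instead). The names args_defaulting_to_none and add_optional are hypothetical.

```python
import inspect


def args_defaulting_to_none(func):
    """Names of parameters whose default value is None."""
    params = inspect.signature(func).parameters
    return [name for name, p in params.items() if p.default is None]


def add_optional(arg_types, func):
    """Wrap observed types in Optional[] for parameters that default to None."""
    result = dict(arg_types)
    for name in args_defaulting_to_none(func):
        observed = result.get(name)
        if observed and observed != "None" and not observed.startswith("Optional["):
            result[name] = "Optional[%s]" % observed
    return result


def connect(host, port=None, retries=3):
    return host, port, retries


# Only str was ever observed for port, but the default shows None is allowed.
print(add_optional({"host": "str", "port": "int", "retries": "int"}, connect))
```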

Feature request: Conda package

It'd be cool to get this into conda or conda-forge to make it easier for projects to pick it up.

Resume/pause vs `with` statement

Bracket your test code between calls to collect_types.resume() and collect_types.pause()

How about a convenient with statement as an option? Also, how about an optional decorator that I can stick on top of my main()?

Infer base classes in signatures

If a function gets called with two different argument types A and B that have a common base class C, we may want to infer C as the argument type instead of Union[A, B] -- unless C is object or some other type that isn't interesting.

To implement this, we could record the MROs of all types in the profile.
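
A sketch of the lookup once MROs are available. Here the MROs are read from live classes; in the proposal they would come from the recorded profile instead. The function name common_base and the uninteresting parameter are made up for this example.

```python
def common_base(types, uninteresting=(object,)):
    """Return the most specific class shared by all MROs, or None."""
    first, rest = types[0], types[1:]
    # Walk the first type's MRO from most to least specific.
    for candidate in first.__mro__:
        if candidate in uninteresting:
            continue
        if all(candidate in t.__mro__ for t in rest):
            return candidate
    return None


class C(object):
    pass


class A(C):
    pass


class B(C):
    pass


print(common_base([A, B]))      # C, so annotate with C instead of Union[A, B]
print(common_base([int, str]))  # only object is shared, so keep the Union
```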

Document how to run this via pytest

It would be great to know how to invoke this via normal pytest, which I believe would be one of the primary use cases. pytest does not provide a simple main().

Alternatively a pytest plugin would probably be even nicer, so you could just run:

pip install pytest-annotate
pytest --annotate-types

`No files need to be modified.` when `__init__.py` exists

Running Python 2.7.11 with pyannotate v1.0.3 installed in a virtualenv on Ubuntu 16.04. I cloned this repo and ran the following

cd example
python driver.py
pyannotate gcd.py

This resulted in the expected

Refactored gcd.py
--- gcd.py	(original)
+++ gcd.py	(refactored)
@@ -1,8 +1,10 @@
 def main():
+    # type: () -> None
     print(gcd(15, 10))
     print(gcd(45, 12))
 
 def gcd(a, b):
+    # type: (int, int) -> int
     while b:
         a, b = b, a%b
     return a
Files that need to be modified:
gcd.py
NOTE: this was a dry run; use -w to write files

If I then create an __init__.py file, and run pyannotate gcd.py, I get

No files need to be modified.
NOTE: this was a dry run; use -w to write files

which is not what I expect to get. This is preventing me from running across my project's codebase. The type_info.json file is created correctly. It's the pyannotate step that fails to identify any changes.

pipdeptree output showing what is installed

  - mypy-extensions [required: Any, installed: 0.3.0]
  - six [required: Any, installed: 1.11.0]
  - typing [required: >=3.5.3, installed: 3.6.4]

ParseError: Invalid type comment: (__main__:f.<locals>.A) -> None

Given t-pyannotate.py:

from pyannotate_runtime import collect_types


collect_types.init_types_collection()
collect_types.start()


def c(obj):
    pass


def f():
    class A:
        pass

    c(A())


f()

collect_types.stop()
collect_types.dump_stats("type_info.json")

Produces:

[
    {
        "path": "t-pyannotate.py",
        "line": 8,
        "func_name": "c",
        "type_comments": [
            "(__main__:f.<locals>.A) -> None"
        ],
        "samples": 1
    },
    {
        "path": "t-pyannotate.py",
        "line": 12,
        "func_name": "f",
        "type_comments": [
            "() -> None"
        ],
        "samples": 1
    },
    {
        "path": "t-pyannotate.py",
        "line": 13,
        "func_name": "A",
        "type_comments": [
            "() -> None"
        ],
        "samples": 1
    }
]

Running pyannotate -w then crashes:

Traceback (most recent call last):
  File "…/Vcs/pytest/.venv/bin/pyannotate", line 11, in <module>
    load_entry_point('pyannotate', 'console_scripts', 'pyannotate')()
  File "…/Vcs/pyannotate/pyannotate_tools/annotations/__main__.py", line 125, in main
    data = generate_annotations_json_string(
  File "…/Vcs/pyannotate/pyannotate_tools/annotations/main.py", line 60, in generate_annotations_json_string
    signature = unify_type_comments(item.type_comments)
  File "…/Vcs/pyannotate/pyannotate_tools/annotations/main.py", line 27, in unify_type_comments
    arg_types, return_type = infer_annotation(type_comments)
  File "…/Vcs/pyannotate/pyannotate_tools/annotations/infer.py", line 45, in infer_annotation
    arg_types, return_type = parse_type_comment(comment)
  File "…/Vcs/pyannotate/pyannotate_tools/annotations/parse.py", line 216, in parse_type_comment
    return Parser(comment).parse()
  File "…/Vcs/pyannotate/pyannotate_tools/annotations/parse.py", line 225, in __init__
    self.tokens = tokenize(comment)
  File "…/Vcs/pyannotate/pyannotate_tools/annotations/parse.py", line 193, in tokenize
    raise ParseError(original)
pyannotate_tools.annotations.parse.ParseError: Invalid type comment: (__main__:f.<locals>.A) -> None

This is likely due to the "<locals>" in there I guess?

pyannotate b7f96ca (current master)
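
For context, the "<locals>" part is what Python 3 puts in __qualname__ for classes defined inside a function, which is presumably where the collector picks it up. The sketch below shows this, plus a hypothetical is_module_level check that a collector could use to avoid emitting such names.

```python
def f():
    class A:
        pass
    return A


local_class = f()
print(local_class.__qualname__)  # f.<locals>.A


def is_module_level(cls):
    """True if the class can be referred to by a plain dotted name."""
    return "<locals>" not in cls.__qualname__


print(is_module_level(local_class))
print(is_module_level(int))
```

A class like A cannot be imported or named in a type comment anyway, so skipping it or falling back to Any seems like the only sensible outcomes.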

Infer argument and return types of callable values

It may be possible to infer the argument and return types of callable values in some cases. Here's an idea:

Record the identity of a callable object given as an argument to a function.
Record the identity of the called function when recording their argument/return types.
Match the two identities from above to infer the type of a callable object.

Crash on head when annotating

Works fine on v1.0.2. On head this crashes:

pyannotate mpf/core/switch_controller.py 
Traceback (most recent call last):
  File "/usr/local/bin/pyannotate", line 11, in <module>
    load_entry_point('pyannotate', 'console_scripts', 'pyannotate')()
  File "/data/home/jan/cloud/flipper/src/pyannotate/pyannotate_tools/annotations/__main__.py", line 56, in main
    show_diffs=not args.quiet)
  File "/usr/lib/python3.5/lib2to3/main.py", line 63, in __init__
    super(StdoutRefactoringTool, self).__init__(fixers, options, explicit)
  File "/usr/lib/python3.5/lib2to3/refactor.py", line 698, in __init__
    super(MultiprocessRefactoringTool, self).__init__(*args, **kwargs)
  File "/usr/lib/python3.5/lib2to3/refactor.py", line 210, in __init__
    self.pre_order, self.post_order = self.get_fixers()
  File "/usr/lib/python3.5/lib2to3/refactor.py", line 255, in get_fixers
    fixer = fix_class(self.options, self.fixer_log)
  File "/usr/lib/python3.5/lib2to3/fixer_base.py", line 58, in __init__
    self.compile_pattern()
  File "/usr/lib/python3.5/lib2to3/fixer_base.py", line 67, in compile_pattern
    PC = PatternCompiler()
  File "/usr/lib/python3.5/lib2to3/patcomp.py", line 50, in __init__
    self.grammar = driver.load_grammar(grammar_file)
  File "/usr/lib/python3.5/lib2to3/pgen2/driver.py", line 120, in load_grammar
    logger.info("Generating grammar tables from %s", gt)
  File "/usr/lib/python3.5/logging/__init__.py", line 1279, in info
    self._log(INFO, msg, args, **kwargs)
  File "/usr/lib/python3.5/logging/__init__.py", line 1414, in _log
    exc_info, func, extra, sinfo)
  File "/usr/lib/python3.5/logging/__init__.py", line 1384, in makeRecord
    sinfo)
  File "/usr/lib/python3.5/logging/__init__.py", line 269, in __init__
    if (args and len(args) == 1 and isinstance(args[0], collections.Mapping)
  File "/usr/lib/python3.5/abc.py", line 191, in __instancecheck__
    return cls.__subclasscheck__(subclass)
  File "/usr/lib/python3.5/abc.py", line 226, in __subclasscheck__
    if issubclass(subclass, scls):
  File "/usr/lib/python3.5/abc.py", line 226, in __subclasscheck__
    if issubclass(subclass, scls):
  File "/usr/lib/python3.5/abc.py", line 226, in __subclasscheck__
    if issubclass(subclass, scls):
  File "/usr/lib/python3.5/typing.py", line 1081, in __subclasscheck__
    return issubclass(cls, self.__extra__)
  File "/usr/lib/python3.5/abc.py", line 226, in __subclasscheck__
    if issubclass(subclass, scls):
[...many more repetitions...]  
  File "/usr/lib/python3.5/typing.py", line 1081, in __subclasscheck__
    return issubclass(cls, self.__extra__)
  File "/usr/lib/python3.5/abc.py", line 226, in __subclasscheck__
    if issubclass(subclass, scls):
  File "/usr/lib/python3.5/typing.py", line 1077, in __subclasscheck__
    if super().__subclasscheck__(cls):
  File "/usr/lib/python3.5/abc.py", line 197, in __subclasscheck__
    if subclass in cls._abc_cache:
RecursionError: maximum recursion depth exceeded

Map 'instance' (classic class) to 'object'

Some methods that took an instance of a classic class (that is, a class that doesn't derive from object) ended up with that parameter typed as instance.

Add trailing commas to long form annotations?

Long form annotations currently omit the comma after the final argument, so they look like this:

def foo(arg1,  # type: int
        arg2  # type: int
       ):
    # type: (...) -> int
    return arg1 + arg2

Consider adding a trailing comma after arg2, unless it's *args or **kwds (for those, Python 2 doesn't allow it).
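
A sketch of the proposed rule as a standalone formatter. The function format_long_form is hypothetical (it is not the tool's code generator) and assumes at least one argument.

```python
def format_long_form(func_name, args, return_type):
    """Render a long-form annotated signature.

    args is a non-empty list of (name, type) pairs; a name may start
    with * or ** for *args and **kwds.
    """
    opener = "def %s(" % func_name
    indent = " " * len(opener)
    lines = []
    for i, (name, typ) in enumerate(args):
        is_last = i == len(args) - 1
        # Python 2 rejects a trailing comma after *args or **kwds.
        comma = "" if (is_last and name.startswith("*")) else ","
        prefix = opener if i == 0 else indent
        lines.append("%s%s%s  # type: %s" % (prefix, name, comma, typ))
    lines.append(indent[:-1] + "):")
    lines.append("    # type: (...) -> %s" % return_type)
    return "\n".join(lines)


print(format_long_form("foo", [("arg1", "int"), ("arg2", "int")], "int"))
print(format_long_form("bar", [("arg1", "int"), ("*args", "str")], "None"))
```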

Please include license and examples in tarballs

Can be done by adding this to MANIFEST.in:

graft example
include CONTRIBUTING.md
include LICENSE

Permission denied for Temp file on Windows

When I run "pyannotate --type-info ./annotate.json ." on Windows (both 2.7.14 and 3.6.3, probably others), I get the following error:

Traceback (most recent call last):
  File "c:\python27\lib\runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "c:\python27\lib\runpy.py", line 72, in _run_code
    exec code in run_globals
  File "C:\Python27\Scripts\pyannotate.exe\__main__.py", line 9, in <module>
  File "c:\python27\lib\site-packages\pyannotate_tools\annotations\__main__.py", line 45, in main
    generate_annotations_json(infile, tf.name)
  File "c:\python27\lib\site-packages\pyannotate_tools\annotations\main.py", line 58, in generate_annotations_json
    with open(target_path, 'w') as f:
IOError: [Errno 13] Permission denied: 'c:\temp\tmp2ui1ku'

A little bit of googling suggests this might be the problem described in "Permission Denied To Write To My Temporary File".
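
If that is indeed the cause (on Windows, a file created by tempfile.NamedTemporaryFile cannot be opened a second time by name while the first handle is still open), one common workaround is to create the file with delete=False, close it, and clean up manually. The sketch below illustrates the pattern; write_via_temp_file is a made-up name, not the tool's code.

```python
import os
import tempfile


def write_via_temp_file(text):
    """Write text to a temporary file by name, read it back, then delete it."""
    tf = tempfile.NamedTemporaryFile(suffix=".json", delete=False)
    tf.close()  # release the handle so the path can be reopened on Windows
    try:
        with open(tf.name, "w") as f:
            f.write(text)
        with open(tf.name) as f:
            return f.read()
    finally:
        os.remove(tf.name)  # delete=False means cleanup is our job


print(write_via_temp_file("[]"))
```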

Type annotation does not work in Python 3.5

For some reason pyannotate with the workaround from #12 does not annotate my files:

PYTHONPATH=/usr/local/lib/python3.5/dist-packages/ pyannotate -w -v "mpf/core/data_manager.py"
Generating grammar tables from /usr/lib/python3.5/lib2to3/PatternGrammar.txt
Adding transformation: annotate_json
Refactoring mpf/core/data_manager.py
No changes in mpf/core/data_manager.py
No files need to be modified.

However, it definitely found some missing annotations in type_info.json:

    {   
        "func_name": "DataManager.__init__",
        "path": "mpf/core/data_manager.py",
        "type_comments": [
            "(mpf.tests.MpfTestCase.TestMachineController, str, int) -> None"
        ],
        "line": 18,
        "samples": 5
    },

And the code looks like this:

class DataManager(MpfController):

    """Handles key value data loading and saving for the machine."""

    def __init__(self, machine, name, min_wait_secs=1):
        [...]

Any idea what is going wrong here?

NameError: name 'method' is not defined

When using pyannotate to generate and apply annotations, one of the return values I have is a method. Pyannotate generates the following return value:

# ... -> method

When using this return value, I get the following error:

NameError: name 'method' is not defined

I'm not sure exactly what this return type should be changed to. I can't point it at a specific method using single quotes, i.e.

# ... -> 'some_method'

Because it can return one of a number of methods.

So what would the correct value actually be?
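
Some background that may help: "method" is simply the name of the runtime class of a bound method in Python 3, and it is not a name you can use in an annotation. One option that type checkers understand is typing.Callable. The sketch below uses made-up Greeter and pick names to show both points.

```python
import types
from typing import Callable


class Greeter:
    def hello(self):
        return "hello"

    def bye(self):
        return "bye"


def pick(greeter, formal):
    # type: (Greeter, bool) -> Callable[[], str]
    """Return one of several bound methods."""
    return greeter.hello if formal else greeter.bye


chosen = pick(Greeter(), True)
print(type(chosen).__name__)                  # the class name the collector sees
print(isinstance(chosen, types.MethodType))   # the class itself lives in types
print(chosen())
```

If the exact signature varies between the returned methods, Callable[..., Any] is the looser alternative.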

Wrong/Strange paths in type_info.json

In the Mission Pinball Project we check out two repositories:
src/mpf/ (https://github.com/missionpinball/mpf/)
src/mpf-mc/ (https://github.com/missionpinball/mpf-mc/)

I ran type annotations from within src/mpf/. Most annotations look fine in type_info.json (there is actually an mpf folder below the other src/mpf):

    {
        "func_name": "Show._add_token",
        "path": "mpf/assets/show.py",
        "type_comments": [
            "(str, List[Union[int, str]], str) -> None"
        ],
        "line": 249,
        "samples": 250
    },

However, some annotations look strange:

    {   
        "func_name": "BcpProcessor",
        "path": "mc/mpfmc/core/bcp_processor.py",
        "type_comments": [
            "() -> None"
        ],
        "line": 15,
        "samples": 1
    },

There is no mc folder below src/mpf/. The entry refers to a file in src/mpf-mc/ (there is an mpfmc folder in src/mpf-mc). However, I have no idea why the path is "mc/mpfmc/" instead of "../mpf-mc/mpfmc/".

Some things we can learn from MonkeyType

Instagram released their competing tool, MonkeyType (blog, docs).

They have more configurability (e.g. type storage, sampling, type rewriting). They use randomization in their profiling hook.

A possible road to convergence might involve sharing type storage and configuration so you can use either tool to collect types and the other to apply them.

Annotations are not generated for one-liner functions

This function will not have an annotation added:

def logtest(a, b, c=7, *var, **kw): return 7, a, b

Change it to have the return on the next line and it will work:

def logtest(a, b, c=7, *var, **kw):
    return 7, a, b
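
Until this is fixed, the affected functions can at least be located ahead of time. The sketch below uses the standard ast module (Python 3) to list functions whose body starts on the same line as the def; one_liner_functions is a hypothetical helper, not part of pyannotate.

```python
import ast


def one_liner_functions(source):
    """Return names of functions whose first statement shares the def line."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef) and node.body[0].lineno == node.lineno:
            names.append(node.name)
    return names


one_line = "def logtest(a, b, c=7, *var, **kw): return 7, a, b\n"
two_lines = "def logtest(a, b, c=7, *var, **kw):\n    return 7, a, b\n"

print(one_liner_functions(one_line))
print(one_liner_functions(two_lines))
```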

Python 3 code generation

We need a flag to generate Python 3 annotations instead of Python 2 style.

Compatibility issues with Python 3.5 because of typing stub

On Ubuntu 16.04 LTS pyannotate breaks when using Python 3.5 because it ships with a limited typing stub. Upgrading typing via pip does not help since the built-in takes precedence (known issue).

pyannotate -w "mpf/assets/show.py"
Traceback (most recent call last):
  File "/usr/local/bin/pyannotate", line 7, in <module>
    from pyannotate_tools.annotations.__main__ import main
  File "/usr/local/lib/python3.5/dist-packages/pyannotate_tools/annotations/__main__.py", line 9, in <module>
    from pyannotate_tools.annotations.main import generate_annotations_json
  File "/usr/local/lib/python3.5/dist-packages/pyannotate_tools/annotations/main.py", line 9, in <module>
    from pyannotate_tools.annotations.infer import infer_annotation
  File "/usr/local/lib/python3.5/dist-packages/pyannotate_tools/annotations/infer.py", line 8, in <module>
    from pyannotate_tools.annotations.parse import parse_type_comment
  File "/usr/local/lib/python3.5/dist-packages/pyannotate_tools/annotations/parse.py", line 12, in <module>
    from typing import Any, List, Mapping, Set, Text, Tuple
ImportError: cannot import name 'Text'

Can this be fixed? Maybe "hide" the typing import behind an if True?

This works as a workaround:
PYTHONPATH=/usr/local/lib/python3.5/dist-packages/ pyannotate

Set up Travis and AppVeyor projects

So tests are automatically run on a variety of platforms when PRs are submitted.

Low-overhead statistical sampling mode

Currently collecting types has a significant performance impact, and even rewriting type collection in C/Cython would only reduce it so much. It would be nice if the performance impact would be controllable in a way that the lower end would be, say, 1-2%. This would make it practical to collect types on production servers.

The motivation is that types collected during tests or manually running a program are unlikely to be complete, and during tests it's possible to have mocks and fakes that generate noise. By running in production on a large number of servers, it may be possible to easily collect a fairly complete picture of concrete runtime types, at least for more commonly used functions.

A potential approach is to run the type collector for roughly every N call events. At least the sampling logic would have to be implemented in C (or maybe Cython?) for acceptable performance. If N is large enough, the overhead would be dominated by the cost of invoking the profiling hook + a few machine instructions to decrement a counter and check the value.

It should be easy to validate the performance impact of the approach. The cost of collecting types isn't very important since we can make N large. However, at some point the collected types will be too sparse to be useful.

This issue doesn't cover how we'd aggregate types collected in multiple processes.

(The proposed approach is not my invention.)
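
The counter-based sampling described above might look roughly like this in pure Python. This is only a sketch of the logic: SamplingHook, collect and count_samples are hypothetical names, and a pure-Python hook would not reach the 1-2% target, which is why the issue suggests C or Cython for the real thing.

```python
import sys


class SamplingHook(object):
    """Profile hook that forwards one in every `n` call events (hypothetical)."""

    def __init__(self, n, collect):
        self.n = n
        self.countdown = n
        self.collect = collect  # stand-in for the expensive type collector

    def __call__(self, frame, event, arg):
        if event != "call":
            return
        # Cheap path: decrement a counter and check it.
        self.countdown -= 1
        if self.countdown:
            return
        # Expensive path: reset the counter and collect types for this call.
        self.countdown = self.n
        self.collect(frame)


def count_samples(n, events):
    """Feed synthetic events through the hook and count collector calls."""
    sampled = []
    hook = SamplingHook(n, sampled.append)
    frame = sys._getframe()
    for event in events:
        hook(frame, event, None)
    return len(sampled)
```

In a real run the hook would be installed with sys.setprofile(SamplingHook(1000, collector)), where collector is whatever records the argument types.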

pyannotate won't annotate functions with keyword-only arguments

Repro: Modify gcd.py in the example directory to the following:

def main():
    print(gcd(15, b=10))
    print(gcd(45, b=12))

def gcd(a, *, b):
    while b:
        a, b = b, a%b
    return a

and run

$ python driver.py
5
3
$ pyannotate -w gcd.py
No files need to be modified.
Warnings/messages while refactoring:
### In file gcd.py ###
gcd.py:6: source has 2 args, annotation has 3 -- skipping
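
For background, this is how CPython reports keyword-only parameters on a function. It is only an illustration of where the two kinds of argument are counted; it does not claim to explain the exact mismatch in the warning above.

```python
import inspect


def gcd(a, *, b):
    while b:
        a, b = b, a % b
    return a


code = gcd.__code__
positional = code.co_argcount          # counts `a` only
keyword_only = code.co_kwonlyargcount  # counts `b`
# Both names sit at the front of co_varnames, positional ones first.
arg_names = code.co_varnames[:positional + keyword_only]

# The same information through the higher-level inspect API.
kinds = {name: param.kind.name
         for name, param in inspect.signature(gcd).parameters.items()}
```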

Invalid type comment: (method-wrapper)

python version: Python 3.6.3 (CPython)
pyannotate version: 1.0.0

I get the following trace when I try to run the pyannotate CLI on a generated JSON file.

  File "/home/kura/.virtualenvs/blackhole/bin/pyannotate", line 11, in <module>
    sys.exit(main())
  File "/home/kura/.virtualenvs/blackhole/lib/python3.6/site-packages/pyannotate_tools/annotations/__main__.py", line 45, in main
    generate_annotations_json(infile, tf.name)
  File "/home/kura/.virtualenvs/blackhole/lib/python3.6/site-packages/pyannotate_tools/annotations/main.py", line 37, in generate_annotations_json
    arg_types, return_type = infer_annotation(item.type_comments)
  File "/home/kura/.virtualenvs/blackhole/lib/python3.6/site-packages/pyannotate_tools/annotations/infer.py", line 38, in infer_annotation
    arg_types, return_type = parse_type_comment(comment)
  File "/home/kura/.virtualenvs/blackhole/lib/python3.6/site-packages/pyannotate_tools/annotations/parse.py", line 196, in parse_type_comment
    return Parser(comment).parse()
  File "/home/kura/.virtualenvs/blackhole/lib/python3.6/site-packages/pyannotate_tools/annotations/parse.py", line 205, in __init__
    self.tokens = tokenize(comment)
  File "/home/kura/.virtualenvs/blackhole/lib/python3.6/site-packages/pyannotate_tools/annotations/parse.py", line 188, in tokenize
    raise ParseError(original)
pyannotate_tools.annotations.parse.ParseError: Invalid type comment: (method-wrapper) -> bool

@gvanrossum commented that this is an issue with the naming of an internal type -- #4 (comment)

The offending piece of my code that generates this internal type name is below.

def validate_option(self, key):
    """
    Validate config option is actually... valid...

    https://kura.github.io/blackhole/configuration.html#configuration-options

    :param str key: Configuration option.
    :raises ConfigException: When an invalid option is configured.
    """
    if key == '':
        return
    attributes = inspect.getmembers(self,
                                    lambda a: not(inspect.isroutine(a)))
    attrs = [a[0][1:] for a in attributes if not(a[0].startswith('__') and
             a[0].endswith('__')) and a[0].startswith('_')]
    if key not in attrs:
        valid_attrs = ('\'{0}\' and '
                       '\'{1}\'').format('\', \''.join(attrs[:-1]),
                                         attrs[-1])
        msg = ('Invalid configuration option \'{0}\'.\n\nValid options '
               'are: {1}'.format(key, valid_attrs))
        raise ConfigException(msg)

Specifically, the line that generates the method-wrapper type contains the inspect.getmembers call with a lambda as an argument.

attributes = inspect.getmembers(self, lambda a: not(inspect.isroutine(a)))

Which generates the following piece of JSON.

{
    "path": "blackhole/config.py",
    "line": 247,
    "func_name": "<lambda>",
    "type_comments": [
        "(str) -> bool",
        "(method-wrapper) -> bool",
        "(builtin_function_or_method) -> bool",
        "(blackhole.utils.Singleton) -> bool",
        "(bool) -> bool",
        "(method) -> bool",
        "(None) -> bool",
        "(Dict[str, Union[pathlib.PurePosixPath, str]]) -> bool"
    ],
    "samples": 954
},

Hope that helps.
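
To see where the name comes from, and how a collector could screen such names before writing them out, here is a small sketch. is_parsable_type_name is a hypothetical helper, it only handles plain dotted names (not generics like Dict[...]), and the "method-wrapper" name is specific to CPython.

```python
def is_parsable_type_name(name):
    """Return True if every dot-separated part of `name` is a Python identifier."""
    return all(part.isidentifier() for part in name.split("."))


# In CPython, bound slot wrappers such as object().__str__ have a type whose
# __name__ contains a hyphen, so it cannot appear in a type comment.
wrapper_type_name = type(object().__str__).__name__
```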

mypy 0.620 signals an error on pyannotate

Running the latest mypy (0.620) on pyannotate triggers the following errors:

pyannotate_runtime/collect_types.py:375: error: The type alias is invalid in runtime context
pyannotate_runtime/collect_types.py:379: error: The type alias is invalid in runtime context
pyannotate_runtime/collect_types.py:380: error: The type alias is invalid in runtime context

See: https://travis-ci.org/dropbox/pyannotate/jobs/420859849

This is above my level of understanding of static type usage.

Exception ignored in: <async_generator object _ag at 0x7f9979703938>

I'm trying to collect type annotations, but I get this exception:

Exception ignored in: <async_generator object _ag at 0x7f9979703938>
Traceback (most recent call last):
File ".../python/3.6.3/lib/python3.6/types.py", line 27, in _ag
File ".../.venv/lib/python3.6/site-packages/pyannotate_runtime/collect_types.py", line 752, in _trace_dispatch
File ".../.venv/lib/python3.6/site-packages/pyannotate_runtime/collect_types.py", line 698, in default_filter_filename
TypeError: startswith first arg must be str or a tuple of str, not NoneType
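
The message says the argument passed to startswith() was None, so the prefix, rather than the file name, appears to have been None when the hook ran. One possible explanation, which is an assumption here, is that module globals had already been cleared during interpreter shutdown. A defensive filter could look like the sketch below; safe_filter_filename and its top_dir parameter are hypothetical and not the actual function in collect_types.py.

```python
import os


def safe_filter_filename(filename, top_dir):
    """Return `filename` relative to `top_dir`, or None if it should be skipped."""
    if filename is None or top_dir is None:
        # Nothing sensible to compare against, so skip this frame.
        return None
    prefix = top_dir.rstrip(os.sep) + os.sep
    if not filename.startswith(prefix):
        return None
    return filename[len(prefix):]


def startswith_none_fails():
    """Reproduce the TypeError from the traceback in isolation."""
    try:
        "some/file.py".startswith(None)
    except TypeError:
        return True
    return False
```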

Runtime outputs illegal type comments

During a profiling run, the runtime sometimes outputs type comments that are illegal. The resulting output JSON file cannot be used as the --type-info parameter of pyannotate, as the type comments cannot be successfully parsed.

This is using Python 3.6 and pyannotate==1.0.2; full repro below.

Here are a few examples of the invalid type comments that are generated, and the corresponding parse errors:

{
    "path": "venv/lib/python3.6/site-packages/pytz/__init__.py",
    "line": 126,
    "func_name": "timezone",
    "type_comments": [
        "(str) -> pytz.tzfile.Europe/Amsterdam",
        "(str) -> pytz.tzfile.US/Eastern"
    ],
    "samples": 2
}

pyannotate_tools.annotations.parse.ParseError: Invalid type comment: (str) -> pytz.tzfile.Europe/Amsterdam
---------------------
{
    "path": "venv/lib/python3.6/site-packages/pytz/tzfile.py",
    "line": 26,
    "func_name": "build_tzinfo",
    "type_comments": [
        "(str, _io.BufferedReader) -> pytz.tzfile.Europe/Amsterdam",
        "(str, _io.BufferedReader) -> pytz.tzfile.US/Eastern"
    ],
    "samples": 2
}

pyannotate_tools.annotations.parse.ParseError: Invalid type comment: (str, _io.BufferedReader) -> pytz.tzfile.Europe/Amsterdam
---------------------
{
    "path": "venv/lib/python3.6/site-packages/pytz/tzinfo.py",
    "line": 166,
    "func_name": "DstTzInfo.__init__",
    "type_comments": [
        "(Tuple[datetime.timedelta, datetime.timedelta, str], Dict[Tuple[datetime.timedelta, datetime.timedelta, str], pytz.tzfile.US/Eastern]) -> None",
        "(None, None) -> pyannotate_runtime.collect_types.UnknownType"
    ],
    "samples": 5
}

pyannotate_tools.annotations.parse.ParseError: Invalid type comment: (Tuple[datetime.timedelta, datetime.timedelta, str], Dict[Tuple[datetime.timedelta, datetime.timedelta, str], pytz.tzfile.US/Eastern]) -> None

Seems like the / in the pytz type name is causing the problem.

To reproduce:

clone the GraphQL compiler project: https://github.com/kensho-technologies/graphql-compiler
check out the master branch, make a Python 3.6 virtualenv, install the project dependencies
pip install pytest-annotate, a plugin for pytest that will run PyAnnotate during tests
py.test --annotate-output ./annotations.json which will output PyAnnotate annotations into annotations.json
pyannotate --type-info ./annotations.json ./graphql_compiler/compiler/compiler_frontend.py, which will crash with the errors above

Some things we can learn from pytypes

I checked the profiling hook in pytypes, and they do two things we should also do:

save and restore the previously active profiler
in a thread, when the profiling callback is run after profiling has been stopped, reset the hook
(note: this should only be done after it's been stopped -- not after it's been paused)
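
The first point can be sketched with sys.getprofile() and sys.setprofile(). This is a minimal illustration and not the pytypes code; temporary_profiler is a hypothetical name, and the per-thread reset from the second point is not covered.

```python
import contextlib
import sys


@contextlib.contextmanager
def temporary_profiler(hook):
    """Install `hook` as the profile function, then restore the previous one."""
    previous = sys.getprofile()
    sys.setprofile(hook)
    try:
        yield
    finally:
        # Put back whatever was active before, which may be None.
        sys.setprofile(previous)
```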

Emit better error when a codec isn't supported

Use case: pyxl. Would be nice to get a sensible one-line warning about files containing pyxl rather than crazy crashes.

Hang in subprocess.Popen()

[UPDATE: If you have this problem, the solution is to pass close_fds=True to Popen()]

I've got a use case where a process being traced for type collection uses subprocess.Popen() to execute some helper program, and the Popen() call hangs at the line

data = _eintr_retry_call(os.read, errpipe_read, 1048576)

in Popen._execute_child(). (This is Python 2.7 on Mac, i.e. POSIX.)

That pipe has FD_CLOEXEC, so the child is not hitting the exec(). Presumably this is because it hangs in a Queue.put() operation in _trace_dispatch() (e.g. here).

I can think of a gross fix that monkey-patches os.fork to disable the profiling hook around the fork() so the child doesn't do this. But perhaps there's a more elegant solution (without using os.register_at_fork(), which is Python 3.7+ only)? Or the tracing hook could check the pid?

[UPDATE:] I can't repro this in a small test program. But it's real, and the os.fork monkey-patch fixes it. Not sure what to do about it yet, the monkey-patch seems risky.
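
The "check the pid" idea could be sketched as a wrapper around the profiling hook. PidGuardedHook is a hypothetical name, the owner_pid parameter exists only to make the behaviour testable without forking, and calling os.getpid() on every event adds overhead of its own. It has not been verified that this avoids the hang.

```python
import os
import sys


class PidGuardedHook(object):
    """Wrap a profile hook so it only does work in the process that created it."""

    def __init__(self, hook, owner_pid=None):
        self.hook = hook
        self.owner_pid = os.getpid() if owner_pid is None else owner_pid

    def __call__(self, frame, event, arg):
        if os.getpid() != self.owner_pid:
            # Running in a forked child: stop profiling and do no further work.
            sys.setprofile(None)
            return
        self.hook(frame, event, arg)
```

The workaround from the update at the top stays the simplest option where it applies: subprocess.Popen(args, close_fds=True).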

Detect yield and return opcodes

There are some hacks in MonkeyType that detect yield and return opcodes -- the former to generate Generator/Iterator return annotations, the latter to distinguish between return and exceptions.
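
As a much simpler static variant of the same idea, a function's bytecode can be scanned for the YIELD_VALUE opcode with the dis module. This is not MonkeyType's code: it inspects the whole function up front rather than the instruction a frame stopped at, so it can tell generators from plain functions but cannot distinguish a return from an exception. contains_yield and the two sample functions are hypothetical.

```python
import dis


def opcode_names(func):
    """Return the set of opcode names used in `func`'s bytecode."""
    return {ins.opname for ins in dis.get_instructions(func)}


def contains_yield(func):
    return "YIELD_VALUE" in opcode_names(func)


def countdown(n):
    while n:
        yield n
        n -= 1


def double(x):
    return 2 * x
```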

Alternative approach via stubfiles

This is not an issue. I just wanted to crosslink an approach with the same goal that might be of help for some use cases (and eventually save us all from doing work twice).

TypeLogger from pytypes can observe types at runtime and then write a PEP-484 stubfile from that information.
All versions of typing are supported and it runs on Python 2.7, 3.3, 3.4, 3.5, 3.6, PyPy and Jython.
See https://github.com/Stewori/pytypes#usage-example-with-profiler.

The approach differs from pyannotate in that it does not write into existing code, but always creates an external PEP 484-conformant stubfile that should be suitable for use with mypy (pytypes itself can use it for runtime typechecking).

A nice goodie is that it can take existing type annotations into account and extend them by information acquired from runtime observations.
The tool supports OOP -- classes, inner classes, static methods, class methods and properties -- and automatically writes a proper import section for the types in use.

Disclaimer: It is not perfect yet and still in beta. Please file issues as they come up. Help is welcome!

Collecting types from mocks causes a crash

A user trying this in a test using mocks (in Python 2.7) reported crashes. She told me:

pyannotate tries to get __mro__ from MagicMock when it gets constructed.
MagicMock has 2 base classes — MagicMixin and Mock
get_function_name_from_frame calls getattr(inst.__class__, '__mro__', None)
Mock has __getattr__ method defined
which tries to get _mock_methods:

        if name in ('_mock_methods', '_mock_unsafe'):
            raise AttributeError(name)
        elif self._mock_methods is not None:
            if name not in self._mock_methods or name in _all_magics:

but _mock_methods has not been set yet
because it gets set later in

class MagicMixin(object):
    def __init__(self, *args, **kw):
        self._mock_set_magics()  # make magic work for kwargs in init
        _safe_super(MagicMixin, self).__init__(*args, **kw)
        self._mock_set_magics()  # fix magic broken by upper level init
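
The failure mode can be imitated with a small class whose attribute hooks depend on __init__ having run. Fragile is a stand-in written for this sketch and not the real MagicMock, and safe_mro is a hypothetical helper; the point is only that type(inst) does not go through instance-level attribute lookup, whereas inst.__class__ may.

```python
class Fragile(object):
    """Stand-in for a mock-like class; not the real MagicMock."""

    def __init__(self):
        self._spec_class = None

    def __getattr__(self, name):
        # Only reached for attributes missing from the instance, as on a
        # partially constructed object.
        raise RuntimeError("attribute hook ran before __init__: " + name)

    def _get_class(self):
        return self._spec_class or type(self)

    __class__ = property(_get_class)


def safe_mro(inst):
    """Look up the MRO on the real type, bypassing instance-level hooks."""
    return type(inst).__mro__


half_built = object.__new__(Fragile)  # __init__ has not run yet
```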

get_function_name_from_frame() has bugs

It's possible to get two functions in the same file with the same name. For example:

class CameraNotifier(NSObject):
    def init(self):
        def register(events, center):

causes a function named "CameraNotifier.register" to be recorded, which may conflict with an actual method of the same class with the same name.

(Note that usually this is resolved in pyannotate by looking at the line numbers.)
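
The line-number disambiguation mentioned in the note can be illustrated with code objects. This CameraNotifier is a simplified stand-in for the class above, and function_key is a hypothetical helper, not pyannotate's actual key.

```python
class CameraNotifier(object):
    def init(self):
        def register(events, center):
            return (events, center)
        return register

    def register(self, events, center):
        return (events, center)


def function_key(code):
    """Identify a function by file, bare name and first line number."""
    return (code.co_filename, code.co_name, code.co_firstlineno)


# Both code objects are named "register", but they start on different lines.
nested_code = CameraNotifier().init().__code__
method_code = CameraNotifier.register.__code__
```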